Gemini Embedding 2 is our first multimodal embedding model. It maps text, images, video, audio, and PDFs into a single unified embedding space, so that items of different modalities can be compared directly. The model is best suited for cross-modal semantic search, document retrieval, and recommendation systems that require fast, scalable similarity calculations across large multimodal datasets.
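To illustrate what a similarity calculation in a shared embedding space can look like, here is a minimal sketch in plain Python. It does not call the model: the three-dimensional vectors, the file names, and the helper names `cosine_similarity` and `rank_by_similarity` are all made up for illustration, and cosine similarity is assumed as the comparison metric since the text above does not name one.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, corpus):
    """Return (item_id, score) pairs sorted from most to least similar."""
    scored = [(item_id, cosine_similarity(query, vec))
              for item_id, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings (hypothetical values).
# In practice each vector would come from the embedding model, whatever
# the modality of the original item.
corpus = {
    "photo_of_cat.png": [0.9, 0.1, 0.0],
    "podcast_about_tax.mp3": [0.0, 0.2, 0.9],
    "cat_care_guide.pdf": [0.8, 0.3, 0.1],
}
query = [1.0, 0.0, 0.0]  # stands in for the embedding of a text query

ranking = rank_by_similarity(query, corpus)
for item_id, score in ranking:
    print(f"{item_id}: {score:.3f}")
```

Because every item lives in the same vector space, the ranking step is identical whether the stored vectors came from images, audio, or PDFs. For large datasets an approximate nearest-neighbor index would normally replace the linear scan shown here.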
Documentation
Visit the Embeddings page for full coverage of features and capabilities.
gemini-embedding-2-preview
| Property | Description |
|---|---|
| Model code | Gemini API |
| Supported data types | Input: text, image, video, audio, PDF. Output: text embeddings |
| Token limits[*] | Input token limit: 8,192. Output dimension size: flexible, supports 128 - 3072; recommended: 768, 1536, 3072 |
| Versions | |
| Latest update | March 2026 |
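The limits in the table can be checked on the client side before a request is sent. The following is a small sketch under stated assumptions: the constant names and the `check_output_dimension` helper are hypothetical, the 128 - 3072 range is assumed to be inclusive, and any integer inside it is assumed to be accepted, which the table does not spell out.

```python
# Values copied from the table above; the names are hypothetical.
INPUT_TOKEN_LIMIT = 8192
MIN_OUTPUT_DIM = 128
MAX_OUTPUT_DIM = 3072
RECOMMENDED_OUTPUT_DIMS = (768, 1536, 3072)


def check_output_dimension(dim):
    """Validate a requested output dimension size.

    Returns True if dim is one of the recommended sizes and False if it is
    only within the supported range. Raises if dim is outside the range.
    Assumption: the range is inclusive and every integer in it is accepted.
    """
    if isinstance(dim, bool) or not isinstance(dim, int):
        raise TypeError("output dimension must be an integer")
    if not MIN_OUTPUT_DIM <= dim <= MAX_OUTPUT_DIM:
        raise ValueError(
            f"output dimension {dim} is outside "
            f"{MIN_OUTPUT_DIM}-{MAX_OUTPUT_DIM}"
        )
    return dim in RECOMMENDED_OUTPUT_DIMS


print(check_output_dimension(768))  # a recommended size
print(check_output_dimension(256))  # supported, but not in the recommended list
```

A caller might use the boolean result to log a warning when a non-recommended size is chosen, rather than rejecting the request outright.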