The Interactions API is now generally available. We recommend using this API for access to all the latest features and models.

Models

Gemini 3

Stable

Gemini 3.5 Flash

Most intelligent model for sustained frontier performance on agentic and coding tasks.

Stable

Gemini 3.1 Flash-Lite

Frontier-class performance rivaling larger models at a fraction of the cost.

Stable

Nano Banana 2

Powerful, high-efficiency image generation and editing, optimized for speed and high-volume use cases.

Stable

Nano Banana 2 Lite

Ultra-low latency and cost-effective image generation and editing, designed for high-volume interactive use cases.

Stable

Nano Banana Pro

State-of-the-art image generation and editing models for highly contextual native image creation.

Stable

Preview

Gemini 3.1 Pro

Advanced intelligence, complex problem-solving skills, and powerful agentic and vibe coding capabilities.

Preview

Gemini 3 Flash

Frontier-class performance rivaling larger models at a fraction of the cost.

Preview

Gemini 3.5 Live Translate

Low-latency, real-time speech to speech translation model that supports 70+ languages.

New Preview

Gemini 3.1 Flash Live

High-quality, low-latency Live API model for real-time dialogue and voice-first AI applications.

New Preview

Gemini 3.1 Flash TTS

Powerful, low-latency speech generation.

New Preview

Gemini Omni Flash

Fast, conversational video generation and editing. Turn text and images into video, and refine results through natural language.

New Preview

Gemini 2.5 Flash

Our best price-performance model for low-latency, high-volume tasks that require reasoning.

Nano Banana

State-of-the-art native image generation and editing designed for fast, creative workflows.

Gemini 2.5 Flash Live Preview

Optimized for real-time conversational agents with sub-second native audio streaming.

Gemini 2.5 Flash TTS Preview

Controllable text-to-speech audio generation with fine control over style and pacing.

Gemini 2.5 Flash-Lite

The fastest and most budget-friendly multimodal model in the 2.5 family.

Gemini 2.5 Pro

Our most advanced model for complex tasks, featuring deep reasoning and coding capabilities.

Gemini 2.5 Pro TTS Preview

High-fidelity speech synthesis optimized for quality in structured workflows like podcasts and audiobooks.

Audio models

This section contains all audio models, including ones that may already be listed in other sections

Gemini 3.1 Flash Live Preview

Our high-quality, low-latency audio-to-audio (A2A) model designed for real-time dialogue and voice-first AI applications.

Gemini 3.1 Flash TTS Preview

Powerful, low-latency speech generation, with natural outputs, steerable prompts, and new expressive audio tags for precise narration control.

Gemini 2.5 Flash Live Preview

Our flagship Live API model for low-latency, bidirectional voice and video agents with native audio reasoning.

Gemini 2.5 Flash TTS Preview

Fast and controllable text-to-speech for low-latency, cost-efficient applications and real-time assistants.

Gemini 2.5 Pro TTS Preview

High-fidelity speech synthesis optimized for quality in structured workflows like podcasts and audiobooks.

Generative media models

This section contains all generative media models, including ones that may already be listed in other sections

Nano Banana 2

High-efficiency production-scale visual creation, combining the intelligence of the Gemini 3 series with lightning-fast generation speeds.

Nano Banana 2 Lite

Designed as the efficiency specialist of the image generation family, offering ultra-low latency and cost-effective image generation and editing.

Veo 3.1 Preview

State-of-the-art cinematic video generation with advanced creative controls and natively synchronized audio.

Nano Banana Pro

A professional design engine with a reasoning core for studio-quality 4K visuals, complex layouts, and precise text rendering.

Veo 3.1 Lite Preview

High-efficiency, low-cost, developer-first video generation, editing, and cinematic control from the Veo 3.1 family.

Gemini Omni Flash Preview

Fast, conversational video generation and editing. Turn text and images into video, and refine results through natural language.

Nano Banana

State-of-the-art native image generation and editing designed for fast, creative workflows.

Imagen 4 Deprecated

Text-to-image model yet, featuring fast and ultra-fast generation and exceptional clarity up to 2K resolution.

Music generation models

This section contains all music generation models, including ones that may already be listed in other sections

Lyria 3 Pro Preview

Our flagship music generation model, optimized for full-length songs with complex structural coherence.

Lyria 3 Clip Preview

Optimized for generating short musical clips, loops, and previews up to 30 seconds.

Lyria RealTime Experimental

High-fidelity music generation model providing granular creative control and real-time streaming capabilities.

Tool and agent models

Computer Use Preview

A specialized model that can "see" a digital screen and perform UI actions like clicking, typing, and navigating to automate complex browser tasks.

Gemini Deep Research Preview

An agentic model that autonomously plans and executes multi-step research across hundreds of sources to produce cited, interactive reports.

Gemini Deep Research Max Preview

Maximum comprehensiveness for automated context gathering and synthesis across hundreds of sources.

Antigravity Agent Preview

A general-purpose managed agent that autonomously plans, reasons, runs code, manages files, and browses the web inside a secure, isolated Linux sandbox.

Specialized task models

Gemini Embedding 2

Our first multimodal embedding model, mapping text, images, video, audio, and PDFs into a unified embedding space for advanced semantic search and RAG systems.

Gemini Embedding

High-dimensional vector representations for advanced semantic search, text classification, and RAG systems.

Advanced embodied reasoning model that understands physical spaces and plans multi-step tasks for robotic agents with new capabilities like instrument reading, improved spatial and physical reasoning.

Previous models

Gemini 2.0 Flash Shut down

Our second generation workhorse model, with next-gen features and improved capabilities, including superior speed, native tool use, and a 1M token context window.

Gemini 2.0 Flash-Lite Shut down

Our fastest second generation model, optimized for cost efficiency and low latency.

Gemini 3.1 Flash-Lite Preview Shut down

Our most cost-efficient multimodal model, offering the fastest performance for high-frequency, lightweight tasks.

Gemini 3 Pro Preview Shut down

Our state-of-the-art reasoning model, with advanced multimodal understanding.

Model version name patterns

Gemini models are available in either stable, preview, latest, or experimental versions.

Stable

Points to a specific stable model. Stable models usually don't change. Most production apps should use a specific stable model.

For example: gemini-3.5-flash.

Preview

Points to a preview model which may be used for production. Preview models will typically have billing enabled, might come with more restrictive rate limits and will be deprecated with at least 2 weeks notice.

For example: gemini-2.5-flash-preview-09-2025.

Latest

Points to the latest release for a specific model variation. This can be a stable, preview or experimental release. This alias will get hot-swapped with every new release of a specific model variation. A 2-week notice will be provided through email before the version behind latest is changed.

For example: gemini-flash-latest.

Experimental

Points to an experimental model which will typically be not be suitable for production use and come with more restrictive rate limits. We release experimental models to gather feedback and get our latest updates into the hands of developers quickly.

Experimental models are not stable and availability of model endpoints is subject to change.

Model deprecations

For information about model deprecations, visit the Gemini deprecations page.