Gemini 2.5 Flash Text-to-Speech

Our fastest engine for high-fidelity speech synthesis, offering low-latency and cost-efficient audio generation. Gemini 2.5 Flash TTS is best for real-time assistants, high-volume narration, and conversational use cases that require fine-grained control over voice style and pacing.

Documentation

Visit the Text-to-Speech guide for full coverage of features and capabilities.
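As a rough illustration of what a speech request might look like, the sketch below builds a JSON body that asks for audio output from a text prompt. It only constructs and prints the payload and does not contact any service. The helper name `build_tts_request`, the field names (`responseModalities`, `speechConfig`, `voiceConfig`, `prebuiltVoiceConfig`, `voiceName`), and the voice name `Kore` are assumptions not stated on this page; confirm them against the Text-to-Speech guide before relying on them.

```python
import json

# Model code taken from this page.
MODEL_CODE = "gemini-2.5-flash-preview-tts"


def build_tts_request(text: str, voice_name: str = "Kore") -> dict:
    """Build a request body asking for audio output (hypothetical helper).

    The field names and the default voice are assumptions; verify them
    against the Text-to-Speech guide.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "contents": [{"parts": [{"text": text}]}],
        "generationConfig": {
            # The model card lists Text as input and Audio as output.
            "responseModalities": ["AUDIO"],
            "speechConfig": {
                "voiceConfig": {
                    "prebuiltVoiceConfig": {"voiceName": voice_name}
                }
            },
        },
    }


if __name__ == "__main__":
    body = build_tts_request("Say cheerfully: have a wonderful day!")
    print(MODEL_CODE)
    print(json.dumps(body, indent=2))
```

Style and pacing instructions are written into the prompt text itself in this sketch, which is one plausible way to steer delivery; the guide is the authority on what the model actually honors.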

gemini-2.5-flash-preview-tts

Property	Description
Model code	`gemini-2.5-flash-preview-tts`
Supported data types	Inputs: Text. Output: Audio.
Token limits^[*]	Input token limit: 8,192. Output token limit: 16,384.
Capabilities	Audio generation: Supported. Caching: Not supported. Code execution: Not supported. File search: Not supported. Function calling: Not supported. Grounding with Google Maps: Not supported. Image generation: Not supported. Live API: Not supported. Search grounding: Not supported. Structured outputs: Not supported. Thinking: Not supported. URL context: Not supported.
Consumption options	Batch API: Supported. Flex inference: Not supported. Priority inference: Not supported.
Versions	Read the model version patterns for more details. `gemini-2.5-flash-preview-tts`
Latest update	December 2025
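
The limits and capability flags in the table can be encoded as a small client-side guard, so that an oversized prompt or an unsupported feature is caught before a request is sent. The sketch below is a minimal illustration: the names `ModelLimits`, `FLASH_TTS`, `UNSUPPORTED`, and `check_request` are hypothetical, and the caller is assumed to supply the token count, since this page does not say how tokens are counted.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ModelLimits:
    """Values copied from the property table (hypothetical container)."""
    model_code: str
    input_token_limit: int
    output_token_limit: int
    batch_api: bool


FLASH_TTS = ModelLimits(
    model_code="gemini-2.5-flash-preview-tts",
    input_token_limit=8192,
    output_token_limit=16384,
    batch_api=True,
)

# Capabilities and consumption options the table marks as not supported.
UNSUPPORTED = frozenset({
    "caching", "code execution", "file search", "function calling",
    "grounding with google maps", "image generation", "live api",
    "search grounding", "structured outputs", "thinking", "url context",
    "flex inference", "priority inference",
})


def check_request(input_tokens: int,
                  features: Iterable[str] = (),
                  limits: ModelLimits = FLASH_TTS) -> List[str]:
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    if input_tokens > limits.input_token_limit:
        problems.append(
            f"input has {input_tokens} tokens; limit is "
            f"{limits.input_token_limit}"
        )
    for feature in features:
        if feature.strip().lower() in UNSUPPORTED:
            problems.append(f"{feature!r} is not supported by {limits.model_code}")
    return problems


if __name__ == "__main__":
    for problem in check_request(9000, ["Function calling"]):
        print(problem)
```

A guard like this only mirrors the table as of the listed update date, so it would need to be revised whenever the model card changes.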