The Gemini 3.1 Flash TTS Preview model provides low-latency speech generation with natural-sounding output and steerable prompts. It introduces expressive audio tags for precise narration control, along with overall improvements to naturalness, controllability, and multilingual support.
Visit the Text-to-Speech guide for full coverage of features and capabilities.
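As a rough illustration of what a request to this model might look like, here is a minimal sketch using only the Python standard library. It assumes the request shape used by earlier Gemini TTS models carries over to this preview. The `generateContent` REST route, the `x-goog-api-key` header, the `responseModalities` and `speechConfig` fields, the `Kore` voice name, and the 24 kHz 16-bit mono PCM output format are all assumptions not stated on this page. The helper names (`build_tts_request`, `save_pcm_as_wav`, `synthesize`) are hypothetical. Check the Text-to-Speech guide before relying on any of it.

```python
import base64
import json
import os
import urllib.request
import wave

MODEL = "gemini-3.1-flash-tts-preview"  # model code from this page
# Assumption: the public Gemini API generateContent REST route applies here.
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL}:generateContent"
)


def build_tts_request(text, voice_name="Kore"):
    """Build a request body asking for audio output only.

    The field names and the default voice are assumptions based on
    earlier Gemini TTS models.
    """
    return {
        "contents": [{"parts": [{"text": text}]}],
        "generationConfig": {
            "responseModalities": ["AUDIO"],
            "speechConfig": {
                "voiceConfig": {
                    "prebuiltVoiceConfig": {"voiceName": voice_name}
                }
            },
        },
    }


def save_pcm_as_wav(path, pcm, sample_rate=24000):
    """Wrap raw PCM bytes in a WAV container.

    Assumption: the audio is 16-bit mono PCM at 24 kHz.
    """
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)


def synthesize(text, api_key):
    """POST the request and return the base64-decoded audio bytes."""
    request = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(build_tts_request(text)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Assumption: the audio arrives as inline base64 data in the first part.
    part = payload["candidates"][0]["content"]["parts"][0]
    return base64.b64decode(part["inlineData"]["data"])


# Only contact the API when a key is configured in the environment.
if __name__ == "__main__" and os.environ.get("GEMINI_API_KEY"):
    audio = synthesize(
        "Say cheerfully: Have a wonderful day!",
        os.environ["GEMINI_API_KEY"],
    )
    save_pcm_as_wav("out.wav", audio)
```

The request builder is kept separate from the network call so the payload can be inspected or reused, for example in a batch job, without sending anything.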
gemini-3.1-flash-tts-preview
| Property | Description |
|---|---|
| Model code | gemini-3.1-flash-tts-preview |
| Supported data types | Inputs: Text; Output: Audio |
| Token limits[*] | Input token limit: 8,192; Output token limit: 16,384 |
| Capabilities | Audio generation: Supported; Batch API: Supported; Caching: Not supported; Code execution: Not supported; File search: Not supported; Function calling: Not supported; Grounding with Google Maps: Not supported; Image generation: Not supported; Live API: Not supported; Search grounding: Not supported; Structured outputs: Not supported; Thinking: Not supported; URL context: Not supported |
| Versions | |
| Latest update | April 2026 |
| Knowledge cutoff | January 2025 |
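
The limits and capabilities in the table can be encoded as constants so that client code fails early instead of sending a request the model cannot serve. The following is a minimal sketch. The values are copied from the table, while the names (`is_supported`, `fits_input_limit`) are hypothetical helpers and not part of any SDK. Counting tokens is out of scope here, so the caller is assumed to supply a token count obtained elsewhere.

```python
MODEL_CODE = "gemini-3.1-flash-tts-preview"

# Token limits as listed in the table above.
INPUT_TOKEN_LIMIT = 8192
OUTPUT_TOKEN_LIMIT = 16384

# Capability names exactly as they appear in the table.
SUPPORTED = {"Audio generation", "Batch API"}
NOT_SUPPORTED = {
    "Caching",
    "Code execution",
    "File search",
    "Function calling",
    "Grounding with Google Maps",
    "Image generation",
    "Live API",
    "Search grounding",
    "Structured outputs",
    "Thinking",
    "URL context",
}


def is_supported(capability):
    """Return True or False for a listed capability.

    Raises KeyError for names that are not in the table, so a typo
    is not silently treated as "not supported".
    """
    if capability in SUPPORTED:
        return True
    if capability in NOT_SUPPORTED:
        return False
    raise KeyError(f"Unknown capability: {capability!r}")


def fits_input_limit(token_count):
    """Check a caller-supplied token count against the input limit."""
    return 0 <= token_count <= INPUT_TOKEN_LIMIT
```

For example, `is_supported("Batch API")` returns `True` and `is_supported("Function calling")` returns `False`, so a client could refuse to attach tool definitions before making a call.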