Gemini Robotics-ER 1.5

Gemini Robotics-ER 1.5 is a vision-language model (VLM) that brings Gemini's agentic capabilities to robotics. It's designed for advanced reasoning in the physical world, allowing robots to interpret complex visual data, perform spatial reasoning, and plan actions from natural language commands.

Documentation

Visit the Robotics page for full coverage of features and capabilities.

gemini-robotics-er-1.5-preview

Property	Description
Model code	`gemini-robotics-er-1.5-preview`
Supported data types	Inputs: text, images, video, audio. Output: text.
Token limits^[*]	Input token limit: 1,048,576. Output token limit: 65,536.
Capabilities	Audio generation: Not supported. Caching: Not supported. Code execution: Supported. Function calling: Supported. Grounding with Google Maps: Not supported. Image generation: Not supported. Live API: Not supported. Search grounding: Supported. Structured outputs: Supported. Thinking: Supported. URL context: Supported.
Consumption options	Batch API: Not supported. Flex inference: Not supported. Priority inference: Not supported.
Versions	Read the model version patterns for more details. Preview: `gemini-robotics-er-1.5-preview`
Latest update	September 2025
Knowledge cutoff	January 2025
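
As a rough illustration of how the model code and token limits in the table might be used, the sketch below assembles (but never sends) a text-only request. The `build_request` helper is hypothetical, and the base URL and JSON body shape are assumptions based on the general Gemini API REST `generateContent` format rather than anything stated on this page. Only the model code and the two limits are taken from the table.

```python
import json

# Values copied from the model table above.
MODEL_CODE = "gemini-robotics-er-1.5-preview"
INPUT_TOKEN_LIMIT = 1_048_576
OUTPUT_TOKEN_LIMIT = 65_536

# Assumption: the general Gemini API REST base path for generateContent.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def build_request(prompt: str, max_output_tokens: int = 1024) -> tuple[str, dict]:
    """Return (url, json_body) for a text-only generateContent call.

    Hypothetical helper: it only assembles the request and never sends it.
    """
    if max_output_tokens < 1:
        raise ValueError("max_output_tokens must be positive")
    # Clamp the requested budget to the documented output token limit.
    capped = min(max_output_tokens, OUTPUT_TOKEN_LIMIT)
    url = f"{BASE_URL}/models/{MODEL_CODE}:generateContent"
    body = {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {"maxOutputTokens": capped},
    }
    return url, body


if __name__ == "__main__":
    url, body = build_request("List the objects on the table as JSON.", 100_000)
    print(url)
    print(json.dumps(body, indent=2))
```

Clamping on the client side is only a convenience for catching oversized budgets early. The service enforces its own limits, and the input token limit is not checked here because counting tokens requires the model's tokenizer.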