Gemini Robotics-ER 1.5 is a vision-language model (VLM) that brings Gemini's agentic capabilities to robotics. It's designed for advanced reasoning in the physical world, allowing robots to interpret complex visual data, perform spatial reasoning, and plan actions from natural language commands.
Documentation
Visit the Robotics page for full coverage of features and capabilities.
gemini-robotics-er-1.5-preview
| Property | Description |
|---|---|
| Model code | gemini-robotics-er-1.5-preview |
| Supported data types | Inputs: text, images, video, audio. Output: text |
| Token limits[*] | Input token limit: 1,048,576. Output token limit: 65,536 |
| Capabilities | Audio generation: Not supported; Batch API: Not supported; Caching: Not supported; Code execution: Supported; Function calling: Supported; Grounding with Google Maps: Not supported; Image generation: Not supported; Live API: Not supported; Search grounding: Supported; Structured outputs: Supported; Thinking: Supported; URL context: Supported |
| Versions | |
| Latest update | September 2025 |
| Knowledge cutoff | January 2025 |
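
As a rough illustration of how the model code and output token limit from the table might appear in a request, the sketch below builds a JSON body for a text-plus-image prompt using only the Python standard library. It assumes the Gemini API's public REST pattern (`.../v1beta/models/{model}:generateContent`, with `inline_data` parts and a `generationConfig.maxOutputTokens` field); the helper `build_request`, its parameters, and the sample prompt are hypothetical names chosen for this example. Nothing is sent over the network, and the exact request shape should be checked against the Robotics page before use.

```python
import base64
import json

MODEL_CODE = "gemini-robotics-er-1.5-preview"  # model code from the table
MAX_OUTPUT_TOKENS = 65_536                     # output token limit from the table

# Assumption: the standard Gemini API REST endpoint pattern for generateContent.
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL_CODE}:generateContent"
)


def build_request(prompt: str, image_bytes: bytes,
                  mime_type: str = "image/png",
                  max_output_tokens: int = 1024) -> dict:
    """Build a request body with one inline image part and one text part.

    Hypothetical helper: it only assembles the payload and rejects
    output budgets outside the limit listed in the table.
    """
    if not 1 <= max_output_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError(
            f"max_output_tokens must be between 1 and {MAX_OUTPUT_TOKENS}"
        )
    return {
        "contents": [{
            "parts": [
                {"inline_data": {
                    "mime_type": mime_type,
                    # Binary image data is sent as base64-encoded text.
                    "data": base64.b64encode(image_bytes).decode("ascii"),
                }},
                {"text": prompt},
            ]
        }],
        "generationConfig": {"maxOutputTokens": max_output_tokens},
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real camera frame.
    body = build_request("List the objects on the table.", b"placeholder-image")
    print(ENDPOINT)
    print(json.dumps(body, indent=2))
```

Validating the output budget on the client keeps an oversized `maxOutputTokens` value from reaching the service; the input token limit is not checked here because counting input tokens requires the service's own tokenizer.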