Gemini Robotics-ER 1.5

Gemini Robotics-ER 1.5 is a vision-language model (VLM) that brings Gemini's agentic capabilities to robotics. It's designed for advanced reasoning in the physical world, allowing robots to interpret complex visual data, perform spatial reasoning, and plan actions from natural language commands.

Documentation

Visit the Robotics page for full coverage of features and capabilities.

gemini-robotics-er-1.5-preview

Property Description
Model code gemini-robotics-er-1.5-preview
Supported data types

Inputs

Text, images, video, audio

Output

Text

Token limits[*]

Input token limit

1,048,576

Output token limit

65,536

Capabilities

Audio generation

Not supported

Batch API

Not supported

Caching

Not supported

Code execution

Supported

Function calling

Supported

Grounding with Google Maps

Not supported

Image generation

Not supported

Live API

Not supported

Search grounding

Supported

Structured outputs

Supported

Thinking

Supported

URL context

Supported

Versions
Read the model version patterns for more details.
  • Preview: gemini-robotics-er-1.5-preview
Latest update September 2025
Knowledge cutoff January 2025