Gemini 2.0 Flash Thinking Mode

Gemini 2.0 Flash Thinking Mode is an experimental model that's trained to generate the "thinking process" the model goes through as part of its response. As a result, Thinking Mode is capable of stronger reasoning in its responses than the base Gemini 2.0 Flash model.

Use Thinking Mode

Thinking Mode is available as an experimental model in Google AI Studio, and for direct use in the Gemini API:

Gemini API

Specify the model code when you make a call to the Gemini API. For example:

from google import genai
client = genai.Client(api_key='GEMINI_API_KEY')  # assumes the google-genai SDK and your own API key
response = client.models.generate_content(
    model='gemini-2.0-flash-thinking-exp',
    contents='Explain the Pythagorean theorem to a 10-year-old.'
)

You can use either gemini-2.0-flash-thinking-exp or gemini-2.0-flash-thinking-exp-1219 as the model code.
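For example, a call that pins the dated snapshot rather than the rolling experimental alias might look like this minimal sketch (it reuses the client from the example above; the prompt is only an illustration):

# Pin the dated snapshot instead of the rolling experimental alias.
response = client.models.generate_content(
    model='gemini-2.0-flash-thinking-exp-1219',
    contents='Explain the Pythagorean theorem to a 10-year-old.'
)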

Google AI Studio

Select the Gemini 2.0 Flash Thinking Experimental model in the Model drop-down menu in the Settings pane.

Thoughts

The way the model's thoughts are returned depends on whether you're using the Gemini API directly or making a request through Google AI Studio.

Gemini API

The model's thinking process is returned as the first element of the content.parts list in the generated response. For example, the following code displays only the model's thinking process:

from IPython.display import Markdown

# Reuses the client created in the earlier example.
response = client.models.generate_content(
    model='gemini-2.0-flash-thinking-exp',
    contents='Solve 3*x^3-5*x=1'
)

# The thinking process is the first element of content.parts.
Markdown(response.candidates[0].content.parts[0].text)
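Because the thinking process is the first element of content.parts, you can separate the thoughts from the final answer by splitting the parts list. The following minimal sketch assumes the response contains at least two parts, with the answer in the parts that follow the thoughts:

parts = response.candidates[0].content.parts

# First part: the model's thinking process; remaining parts: the answer.
thoughts = parts[0].text
answer = ''.join(part.text for part in parts[1:])

print(f'Thoughts:\n{thoughts}\n')
print(f'Answer:\n{answer}')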

You can see more examples of how to use Thinking Mode with the Gemini API in our Colab notebook.

Google AI Studio

The model's thinking process is returned as a new section in the Thoughts panel in the response window.

Example Thoughts panel in Google AI Studio

By default, the Thoughts panel is collapsed. You can expand the panel by clicking the Thoughts header.

Unlike the returned response, the contents of the Thoughts panel are not editable in Google AI Studio.

Limitations

Thinking Mode is an experimental model and has the following limitations:

  • 32k token input limit (see the token-count sketch after this list)
  • Text and image input only
  • 8k token output limit
  • Text-only output
  • No built-in tool usage, such as Search or code execution
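Because of the 32k token input limit, it can help to check prompt length before sending a request. The following is a minimal sketch, assuming the SDK's count_tokens method accepts this model code; the prompt string is only an illustration:

from google import genai

client = genai.Client(api_key='GEMINI_API_KEY')
prompt = 'Explain the Pythagorean theorem to a 10-year-old.'

# Count tokens first so the request stays within the 32k input limit.
token_count = client.models.count_tokens(
    model='gemini-2.0-flash-thinking-exp',
    contents=prompt,
)
if token_count.total_tokens <= 32_000:
    response = client.models.generate_content(
        model='gemini-2.0-flash-thinking-exp',
        contents=prompt,
    )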

What's next?

Try Thinking Mode for yourself with our Colab notebook, or open Google AI Studio and try prompting the model directly.