Gemini thinking
Gemini 3 and 2.5 series models use a "thinking process" that significantly improves their reasoning and multi-step planning abilities, making them highly effective for complex tasks such as coding, advanced mathematics, and data analysis.
When you use a thinking model, Gemini reasons internally before producing a response. The Interactions API exposes this reasoning through thought steps: dedicated steps that appear chronologically in the steps array alongside function calls, user input, and model output.
Each thought step contains two fields:
| Field | Required | Description |
|---|---|---|
| signature | ✅ Yes | An encrypted representation of the model's internal reasoning state. Always present, even when the model performs minimal reasoning. |
| summary | ❌ No | An array of content (text and/or images) summarizing the reasoning. May be empty depending on the thinking_summaries configuration, whether the model reasoned enough, or the content type (e.g., image latents may have no text summary). |
Interacting with thinking
Starting an interaction with a thinking model is just like any other interaction request. Specify a model that supports thinking in the model field:
Python
from google import genai
client = genai.Client()
interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Explain the concept of Occam's Razor and provide a simple, everyday example."
)
print(interaction.steps[-1].content[0].text)
JavaScript
import { GoogleGenAI } from "@google/genai";
const client = new GoogleGenAI({});
const interaction = await client.interactions.create({
  model: "gemini-3-flash-preview",
  input: "Explain the concept of Occam's Razor and provide a simple, everyday example."
});
console.log(interaction.steps.at(-1).content[0].text);
REST
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Api-Revision: 2026-05-20" \
-H 'Content-Type: application/json' \
-d '{
"model": "gemini-3-flash-preview",
"input": "Explain the concept of Occam'\''s Razor and provide a simple example."
}'
Thought summaries
Thought summaries give you insight into the model's internal reasoning process.
By default, only the final output is returned. You can enable thought summaries with thinking_summaries:
Python
from google import genai
client = genai.Client()
interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input="What is the sum of the first 50 prime numbers?",
    generation_config={
        "thinking_summaries": "auto"
    }
)

for step in interaction.steps:
    if step.type == "thought":
        print("Thought summary:")
        for content_block in step.summary:
            if content_block.type == "text":
                print(content_block.text)
                print()
    elif step.type == "model_output":
        for content_block in step.content:
            if content_block.type == "text":
                print("Answer:")
                print(content_block.text)
                print()
JavaScript
import { GoogleGenAI } from "@google/genai";
const client = new GoogleGenAI({});
const interaction = await client.interactions.create({
  model: "gemini-3-flash-preview",
  input: "What is the sum of the first 50 prime numbers?",
  generation_config: {
    thinking_summaries: "auto"
  }
});

for (const step of interaction.steps) {
  if (step.type === "thought") {
    console.log("Thought summary:");
    for (const contentBlock of step.summary) {
      if (contentBlock.type === "text") console.log(contentBlock.text);
    }
  } else if (step.type === "model_output") {
    for (const contentBlock of step.content) {
      if (contentBlock.type === "text") {
        console.log("Answer:");
        console.log(contentBlock.text);
      }
    }
  }
}
REST
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Api-Revision: 2026-05-20" \
-H 'Content-Type: application/json' \
-d '{
"model": "gemini-3-flash-preview",
"input": "What is the sum of the first 50 prime numbers?",
"generation_config": {
"thinking_summaries": "auto"
}
}'
A thought block may contain only a signature and no summary in the following cases:
- Simple requests, where the model didn't reason enough to produce a summary
- Summaries explicitly disabled with thinking_summaries: "none"
- Certain thought content types (such as images) that may have no text summary

Your code should always handle thought blocks whose summary is empty or missing.
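As a defensive sketch of that check (field names follow the table above; the plain-dict shape is illustrative, since real SDK steps are objects), a small helper can return an empty string instead of failing when summary is absent:

```python
def extract_summary_text(step):
    """Return the concatenated summary text of a thought step, or "" if none."""
    if step.get("type") != "thought":
        return ""
    # summary may be missing, None, or an empty list; only text blocks carry text.
    blocks = step.get("summary") or []
    return "".join(b.get("text", "") for b in blocks if b.get("type") == "text")

# A thought block carrying only a signature yields an empty summary:
minimal = {"type": "thought", "signature": "EpoGC..."}
full = {"type": "thought", "signature": "EpoGC...",
        "summary": [{"type": "text", "text": "Weighing the clues..."}]}
```

The same guard (`step.get("summary") or []`) covers all three cases in the list above with one expression.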
Streaming thoughts
Use streaming to receive incremental thought summaries during generation. Thought blocks are delivered via server-sent events (SSE) and use two distinct delta types:
| Delta type | Contains | When sent |
|---|---|---|
| thought_summary | Text or image summary content | One or more deltas carrying incremental summaries |
| thought_signature | The encrypted signature | The final delta before step.stop |
Python
from google import genai
client = genai.Client()
prompt = """
Alice, Bob, and Carol each live in a different house on the same street: red, green, and blue.
Alice does not live in the red house.
Bob does not live in the green house.
Carol does not live in the red or green house.
Which house does each person live in?
"""

thoughts = ""
answer = ""

stream = client.interactions.create(
    model="gemini-3-flash-preview",
    input=prompt,
    generation_config={
        "thinking_summaries": "auto"
    },
    stream=True
)

for event in stream:
    if event.event_type == "step.delta":
        if event.delta.type == "thought_summary":
            if not thoughts:
                print("Thinking...")
            summary_text = event.delta.content.get('text', '') if hasattr(event.delta, 'content') else getattr(event.delta, 'text', '')
            print(f"[Thought] {summary_text}", end="")
            thoughts += summary_text
        elif event.delta.type == "text" and event.delta.text:
            if not answer:
                print("\nAnswer:")
            print(event.delta.text, end="")
            answer += event.delta.text
JavaScript
import { GoogleGenAI } from "@google/genai";
const client = new GoogleGenAI({});
const prompt = `Alice, Bob, and Carol each live in a different house on the same
street: red, green, and blue. Alice does not live in the red house.
Bob does not live in the green house.
Carol does not live in the red or green house.
Which house does each person live in?`;

let thoughts = "";
let answer = "";

const stream = await client.interactions.create({
  model: "gemini-3-flash-preview",
  input: prompt,
  generation_config: {
    thinking_summaries: "auto"
  },
  stream: true
});

for await (const event of stream) {
  if (event.event_type === "step.delta") {
    if (event.delta.type === "thought_summary") {
      if (!thoughts) console.log("Thinking...");
      const text = event.delta.content?.text || "";
      process.stdout.write(`[Thought] ${text}`);
      thoughts += text;
    } else if (event.delta.type === "text" && event.delta.text) {
      if (!answer) console.log("\nAnswer:");
      process.stdout.write(event.delta.text);
      answer += event.delta.text;
    }
  }
}
REST
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Api-Revision: 2026-05-20" \
-H 'Content-Type: application/json' \
--no-buffer \
-d '{
"model": "gemini-3-flash-preview",
"input": "Alice, Bob, and Carol each live in a different house on the same street: red, green, and blue. Alice does not live in the red house. Bob does not live in the green house. Carol does not live in the red or green house. Which house does each person live in?",
"generation_config": {
"thinking_summaries": "auto"
},
"stream": true
}'
Streamed responses use server-sent events (SSE) and are composed of steps and events. See the example below.
event: interaction.created
data: {"interaction":{"id":"v1_xxx","status":"in_progress","object":"interaction","model":"gemini-3-flash-preview"},"event_type":"interaction.created"}
event: step.start
data: {"index":0,"step":{"signature":"","summary":[{"text":"**Evaluating the clues**\n\nI'm considering...","type":"text"}],"type":"thought"},"event_type":"step.start"}
event: step.delta
data: {"index":0,"delta":{"signature":"EpoGCpcGAXLI2nx/...","type":"thought_signature"},"event_type":"step.delta"}
event: step.stop
data: {"index":0,"event_type":"step.stop"}
event: step.start
data: {"index":1,"step":{"content":[{"text":"Based on the clues provided, here","type":"text"}],"type":"model_output"},"event_type":"step.start"}
event: step.delta
data: {"index":1,"delta":{"text":" is the answer to your question...","type":"text"},"event_type":"step.delta"}
event: step.stop
data: {"index":1,"event_type":"step.stop"}
event: interaction.completed
data: {"interaction":{"id":"v1_xxx","status":"completed","usage":{"total_tokens":530,"total_input_tokens":62,"total_output_tokens":171,"total_thought_tokens":297}},"event_type":"interaction.completed"}
event: done
data: [DONE]
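For illustration, a trace like the one above can be consumed with a generic SSE parser. This minimal sketch (not part of the SDK) pairs each event: line with its data: payload and treats the terminal [DONE] sentinel specially:

```python
import json

def parse_sse(raw):
    """Pair each `event:` line with its parsed `data:` payload."""
    events, event_type = [], None
    for line in raw.splitlines():
        if line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data = line[len("data:"):].strip()
            # The terminal sentinel "[DONE]" is not JSON.
            payload = data if data == "[DONE]" else json.loads(data)
            events.append((event_type, payload))
    return events

sample = (
    'event: step.stop\n'
    'data: {"index":0,"event_type":"step.stop"}\n'
    'event: done\n'
    'data: [DONE]\n'
)
```

The SDK clients shown earlier do this parsing for you; a hand-rolled parser like this is only needed when consuming the REST stream directly.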
Controlling thinking
Gemini models use dynamic thinking by default, automatically adjusting reasoning effort to the complexity of the request. You can control this behavior with the thinking_level parameter.
| Model | Default thinking | Supported levels |
|---|---|---|
| gemini-3.1-pro-preview | On (high) | Low, medium, high |
| gemini-3-flash-preview | On (high) | Minimal, low, medium, high |
| gemini-3-pro-preview | On (high) | Low, high |
| gemini-2.5-pro | On | Low, medium, high |
| gemini-2.5-flash | On | Low, medium, high |
| gemini-2.5-flash-lite | Off | Low, medium, high |
Python
from google import genai
client = genai.Client()
interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Provide a list of 3 famous physicists and their key contributions",
    generation_config={
        "thinking_level": "low"
    }
)
print(interaction.steps[-1].content[0].text)
JavaScript
import { GoogleGenAI } from "@google/genai";
const client = new GoogleGenAI({});
const interaction = await client.interactions.create({
  model: "gemini-3-flash-preview",
  input: "Provide a list of 3 famous physicists and their key contributions",
  generation_config: {
    thinking_level: "low"
  }
});
console.log(interaction.steps.at(-1).content[0].text);
REST
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Api-Revision: 2026-05-20" \
-H 'Content-Type: application/json' \
-d '{
"model": "gemini-3-flash-preview",
"input": "Provide a list of 3 famous physicists and their key contributions",
"generation_config": {
"thinking_level": "low"
}
}'
Thought signatures
Thought signatures are encrypted representations of the model's internal reasoning. They are required to maintain reasoning continuity across multi-turn interactions.
Compared with the generateContent API, the Interactions API makes thought signatures much easier to handle.
Stateful mode (recommended)
By default, when you use the Interactions API in stateful mode (by setting store: true and passing previous_interaction_id on subsequent turns), the server automatically manages conversation state, including all thought blocks and signatures. In this mode you don't need to do anything with signatures; they are handled entirely server-side.
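In stateful mode the per-turn payloads stay small. A sketch of two consecutive turns as plain request dicts (nothing is sent here; "v1_xxx" stands in for the id returned by the first response):

```python
# Turn 1: ask the server to store the interaction state.
first_request = {
    "model": "gemini-3-flash-preview",
    "input": "Explain the concept of Occam's Razor.",
    "store": True,
}

# Turn 2: reference the stored state instead of resending history.
# No thought blocks or signatures appear in the payload; the server
# replays them from the stored interaction.
second_request = {
    "model": "gemini-3-flash-preview",
    "input": "Now give a counterexample.",
    "previous_interaction_id": "v1_xxx",
}
```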
Stateless mode
If you manage conversation state yourself (stateless mode), passing the full input and output history with each request:
- You must always resend every thought block received from the model, with its content exactly as received.
- You must not remove or modify thought blocks in the history, because they contain the signatures the model needs to continue its reasoning.
- When switching models within a session, you should still resend the previous model's thought blocks. The backend manages compatibility.
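The rules above can be sketched with a hypothetical helper (the dict shapes are illustrative, not the exact wire format): it carries every step from the previous turn, thought blocks untouched, into the next request's input:

```python
def build_next_input(history, previous_steps, new_user_message):
    """Extend the running history with last turn's steps and a new user turn."""
    next_input = list(history)
    for step in previous_steps:
        # Thought blocks are resent exactly as received: their signatures
        # carry the reasoning state the model needs to continue.
        next_input.append(step)
    next_input.append({"role": "user", "content": new_user_message})
    return next_input

previous_steps = [
    {"type": "thought", "signature": "EpoGC...", "summary": []},
    {"type": "model_output", "content": [{"type": "text", "text": "Blue."}]},
]
next_input = build_next_input([], previous_steps, "Are you sure?")
```

The key property is that the thought step survives the round-trip byte-for-byte; any transformation of the history must leave it alone.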
Pricing
When thinking is on, the response price is the sum of output tokens and thought tokens. You can get the total number of thought tokens generated from the total_thought_tokens field.
Python
# ...
print("Thoughts tokens:", interaction.usage.total_thought_tokens)
print("Output tokens:", interaction.usage.total_output_tokens)
JavaScript
// ...
console.log(`Thoughts tokens: ${interaction.usage.total_thought_tokens}`);
console.log(`Output tokens: ${interaction.usage.total_output_tokens}`);
Thinking models generate full thoughts to improve the quality of the final response, then output summaries to give you insight into the thinking process. Pricing is based on the full thought tokens the model needs to generate, even though the API only outputs summaries.
For more about tokens, see the token counting guide.
Best practices
Follow these guidelines to use thinking models efficiently.
- Review the reasoning: Analyze thought summaries to understand failures and improve your prompts.
- Manage the thinking budget: Prompt the model to think less when generating long outputs, to save tokens.
- Simple tasks: Use minimal thinking for fact retrieval or classification (e.g., "Where was DeepMind founded?").
- Medium tasks: Use default thinking for comparing concepts or creative reasoning (e.g., comparing electric and hybrid cars).
- Complex tasks: Use maximum thinking for advanced coding, math, or multi-step planning (e.g., solving AIME math problems).
Next steps
- Text generation: Basic text responses
- Function calling: Connecting to tools
- Gemini 3 guide: Model-specific features