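The Gemini Deep Research agent autonomously plans, executes, and synthesizes multi-step research tasks. Powered by Gemini 3 Pro, it navigates complex information landscapes using web search and your own data to produce detailed, cited reports.
Research tasks involve iterative searching and reading and can take several minutes to complete. You must use background execution (set background=true) to run the agent asynchronously and poll for results. For details, see Handling long-running tasks.
The following example shows how to start a research task in the background and poll for results.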
Python
import time
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    input="Research the history of Google TPUs.",
    agent='deep-research-pro-preview-12-2025',
    background=True
)
print(f"Research started: {interaction.id}")

# Poll every 10 seconds until the task completes or fails.
while True:
    interaction = client.interactions.get(interaction.id)
    if interaction.status == "completed":
        print(interaction.outputs[-1].text)
        break
    elif interaction.status == "failed":
        print(f"Research failed: {interaction.error}")
        break
    time.sleep(10)
JavaScript
import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    input: 'Research the history of Google TPUs.',
    agent: 'deep-research-pro-preview-12-2025',
    background: true
});
console.log(`Research started: ${interaction.id}`);

// Poll every 10 seconds until the task completes or fails.
while (true) {
    const result = await client.interactions.get(interaction.id);
    if (result.status === 'completed') {
        console.log(result.outputs[result.outputs.length - 1].text);
        break;
    } else if (result.status === 'failed') {
        console.log(`Research failed: ${result.error}`);
        break;
    }
    await new Promise(resolve => setTimeout(resolve, 10000));
}
REST
# 1. Start the research task
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"input": "Research the history of Google TPUs.",
"agent": "deep-research-pro-preview-12-2025",
"background": true
}'
# 2. Poll for results (Replace INTERACTION_ID)
# curl -X GET "https://generativelanguage.googleapis.com/v1beta/interactions/INTERACTION_ID" \
# -H "x-goog-api-key: $GEMINI_API_KEY"
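Research with your own data
Deep Research has access to a variety of tools. By default, the agent can access information on the public internet using the google_search and url_context tools, so you do not need to specify them. However, if you also want the agent to access your own data through the File Search tool, you need to add that tool as shown in the following example.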
Python
import time
from google import genai
client = genai.Client()
interaction = client.interactions.create(
input="Compare our 2025 fiscal year report against current public web news.",
agent="deep-research-pro-preview-12-2025",
background=True,
tools=[
{
"type": "file_search",
"file_search_store_names": ['fileSearchStores/my-store-name']
}
]
)
JavaScript
const interaction = await client.interactions.create({
input: 'Compare our 2025 fiscal year report against current public web news.',
agent: 'deep-research-pro-preview-12-2025',
background: true,
tools: [
{ type: 'file_search', file_search_store_names: ['fileSearchStores/my-store-name'] },
]
});
REST
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"input": "Compare our 2025 fiscal year report against current public web news.",
"agent": "deep-research-pro-preview-12-2025",
"background": true,
"tools": [
{"type": "file_search", "file_search_store_names": ["fileSearchStores/my-store-name"]}
]
}'
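When store names come from configuration, the tools entry shown above can be assembled programmatically. The following is a minimal sketch, not part of the SDK: build_file_search_tool is a hypothetical helper, and the check that each name starts with fileSearchStores/ is an assumption inferred from the example store name.

```python
def build_file_search_tool(store_names):
    """Return a file_search tool entry for the `tools` list.

    Hypothetical helper: the prefix check is an assumption based on the
    example name 'fileSearchStores/my-store-name'.
    """
    names = list(store_names)
    if not names:
        raise ValueError("at least one File Search store name is required")
    for name in names:
        if not name.startswith("fileSearchStores/"):
            raise ValueError(f"unexpected store name: {name!r}")
    return {"type": "file_search", "file_search_store_names": names}


# The resulting list has the same shape as the `tools` argument above.
tools = [build_file_search_tool(["fileSearchStores/my-store-name"])]
print(tools)
```

Validating names before the request is sent surfaces configuration mistakes early, instead of several minutes into a background task.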
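Steerability and formatting
You can steer the agent's output by providing specific formatting instructions in your prompt. This lets you structure the report into specific sections and subsections, include data tables, or adjust the tone for different audiences (for example, "technical," "executive," or "casual").
Define the desired output format explicitly in your input text.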
Python
prompt = """
Research the competitive landscape of EV batteries.
Format the output as a technical report with the following structure:
1. Executive Summary
2. Key Players (Must include a data table comparing capacity and chemistry)
3. Supply Chain Risks
"""
interaction = client.interactions.create(
input=prompt,
agent="deep-research-pro-preview-12-2025",
background=True
)
JavaScript
const prompt = `
Research the competitive landscape of EV batteries.
Format the output as a technical report with the following structure:
1. Executive Summary
2. Key Players (Must include a data table comparing capacity and chemistry)
3. Supply Chain Risks
`;
const interaction = await client.interactions.create({
input: prompt,
agent: 'deep-research-pro-preview-12-2025',
background: true,
});
REST
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"input": "Research the competitive landscape of EV batteries.\n\nFormat the output as a technical report with the following structure: \n1. Executive Summary\n2. Key Players (Must include a data table comparing capacity and chemistry)\n3. Supply Chain Risks",
"agent": "deep-research-pro-preview-12-2025",
"background": true
}'
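Multimodal inputs
Deep Research supports multimodal inputs, including images, PDFs, audio, and video. The agent can analyze this rich content and then conduct contextually relevant web research based on the provided inputs. For example, you can provide a photograph and ask the agent to identify the subjects in it, research their behavior, or find related information.
The following example demonstrates an image analysis request that uses an image URL.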
Python
import time
from google import genai
client = genai.Client()
prompt = '''Analyze the interspecies dynamics and behavioral risks present
in the provided image of the African watering hole. Specifically, investigate
the symbiotic relationship between the avian species and the pachyderms
shown, and conduct a risk assessment for the reticulated giraffes based on
their drinking posture relative to the specific predator visible in the
foreground.'''
interaction = client.interactions.create(
input=[
{"type": "text", "text": prompt},
{
"type": "image",
"uri": "https://storage.googleapis.com/generativeai-downloads/images/generated_elephants_giraffes_zebras_sunset.jpg"
}
],
agent="deep-research-pro-preview-12-2025",
background=True
)
print(f"Research started: {interaction.id}")
while True:
    interaction = client.interactions.get(interaction.id)
    if interaction.status == "completed":
        print(interaction.outputs[-1].text)
        break
    elif interaction.status == "failed":
        print(f"Research failed: {interaction.error}")
        break
    time.sleep(10)
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const prompt = `Analyze the interspecies dynamics and behavioral risks present
in the provided image of the African watering hole. Specifically, investigate
the symbiotic relationship between the avian species and the pachyderms
shown, and conduct a risk assessment for the reticulated giraffes based on
their drinking posture relative to the specific predator visible in the
foreground.`;
const interaction = await client.interactions.create({
input: [
{ type: 'text', text: prompt },
{
type: 'image',
uri: 'https://storage.googleapis.com/generativeai-downloads/images/generated_elephants_giraffes_zebras_sunset.jpg'
}
],
agent: 'deep-research-pro-preview-12-2025',
background: true
});
console.log(`Research started: ${interaction.id}`);
while (true) {
    const result = await client.interactions.get(interaction.id);
    if (result.status === 'completed') {
        console.log(result.outputs[result.outputs.length - 1].text);
        break;
    } else if (result.status === 'failed') {
        console.log(`Research failed: ${result.error}`);
        break;
    }
    await new Promise(resolve => setTimeout(resolve, 10000));
}
REST
# 1. Start the research task with image input
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"input": [
{"type": "text", "text": "Analyze the interspecies dynamics and behavioral risks present in the provided image of the African watering hole. Specifically, investigate the symbiotic relationship between the avian species and the pachyderms shown, and conduct a risk assessment for the reticulated giraffes based on their drinking posture relative to the specific predator visible in the foreground."},
{"type": "image", "uri": "https://storage.googleapis.com/generativeai-downloads/images/generated_elephants_giraffes_zebras_sunset.jpg"}
],
"agent": "deep-research-pro-preview-12-2025",
"background": true
}'
# 2. Poll for results (Replace INTERACTION_ID)
# curl -X GET "https://generativelanguage.googleapis.com/v1beta/interactions/INTERACTION_ID" \
# -H "x-goog-api-key: $GEMINI_API_KEY"
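Handling long-running tasks
Deep Research is a multi-step process that involves planning, searching, reading, and writing. This cycle typically exceeds the standard timeout limits of synchronous API calls.
Agents must be run with background=True. The API immediately returns a partial Interaction object. You can use its id property to retrieve the interaction for polling. The interaction status transitions from in_progress to completed or failed.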
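The status transitions described above suggest a polling loop with an overall deadline, so that a stuck task does not block the caller indefinitely. The following is a minimal sketch under stated assumptions: wait_for_interaction is a hypothetical helper, fetch is any callable that returns an object with a status attribute, and the 60-minute default mirrors the maximum research time listed under Limitations. The responses are simulated so that the example runs without network access.

```python
import time
from types import SimpleNamespace


def wait_for_interaction(fetch, interval=10.0, timeout=3600.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll `fetch()` until the interaction completes or fails.

    `fetch`, `sleep`, and `clock` are injected so the loop can be tested
    offline; all names here are illustrative, not SDK functions.
    """
    deadline = clock() + timeout
    while True:
        interaction = fetch()
        if interaction.status in ("completed", "failed"):
            return interaction
        # Give up if the next wait would run past the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"still {interaction.status} after {timeout}s")
        sleep(interval)


# Simulated responses stand in for client.interactions.get(...).
_responses = iter([
    SimpleNamespace(status="in_progress"),
    SimpleNamespace(status="in_progress"),
    SimpleNamespace(status="completed"),
])
result = wait_for_interaction(lambda: next(_responses), interval=0.0)
print(result.status)  # prints "completed"
```

In real use, fetch would be something like lambda: client.interactions.get(interaction.id), and the caller would check result.status before reading the outputs.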
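Streaming
Deep Research supports streaming so that you can follow research progress in real time. You must set both stream=True and background=True.
The following example shows how to start a research task and process the stream.
Most importantly, it demonstrates how to track the interaction_id from the interaction.start event. You need this ID to resume the stream if the network is interrupted. The code also introduces an event_id variable that lets you resume from the specific point where the connection dropped.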
Python
stream = client.interactions.create(
input="Research the history of Google TPUs.",
agent="deep-research-pro-preview-12-2025",
background=True,
stream=True,
agent_config={
"type": "deep-research",
"thinking_summaries": "auto"
}
)
interaction_id = None
last_event_id = None
for chunk in stream:
    # Capture the interaction ID so the stream can be resumed later.
    if chunk.event_type == "interaction.start":
        interaction_id = chunk.interaction.id
        print(f"Interaction started: {interaction_id}")
    # Track the most recent event ID for potential reconnection.
    if chunk.event_id:
        last_event_id = chunk.event_id
    if chunk.event_type == "content.delta":
        if chunk.delta.type == "text":
            print(chunk.delta.text, end="", flush=True)
        elif chunk.delta.type == "thought_summary":
            print(f"Thought: {chunk.delta.content.text}", flush=True)
    elif chunk.event_type == "interaction.complete":
        print("\nResearch Complete")
JavaScript
const stream = await client.interactions.create({
input: 'Research the history of Google TPUs.',
agent: 'deep-research-pro-preview-12-2025',
background: true,
stream: true,
agent_config: {
type: 'deep-research',
thinking_summaries: 'auto'
}
});
let interactionId;
let lastEventId;
for await (const chunk of stream) {
    // 1. Capture Interaction ID
    if (chunk.event_type === 'interaction.start') {
        interactionId = chunk.interaction.id;
        console.log(`Interaction started: ${interactionId}`);
    }
    // 2. Track IDs for potential reconnection
    if (chunk.event_id) lastEventId = chunk.event_id;
    // 3. Handle Content
    if (chunk.event_type === 'content.delta') {
        if (chunk.delta.type === 'text') {
            process.stdout.write(chunk.delta.text);
        } else if (chunk.delta.type === 'thought_summary') {
            console.log(`Thought: ${chunk.delta.content.text}`);
        }
    } else if (chunk.event_type === 'interaction.complete') {
        console.log('\nResearch Complete');
    }
}
REST
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions?alt=sse" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"input": "Research the history of Google TPUs.",
"agent": "deep-research-pro-preview-12-2025",
"background": true,
"stream": true,
"agent_config": {
"type": "deep-research",
"thinking_summaries": "auto"
}
}'
# Note: Look for the 'interaction.start' event to get the interaction ID.
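Reconnecting to the stream
Network interruptions can occur during long-running research tasks. To handle them gracefully, your application should catch connection errors and resume the stream using client.interactions.get().
You must provide two values to resume:
- Interaction ID: Obtained from the interaction.start event in the initial stream.
- Last Event ID: The ID of the last successfully processed event. This tells the server to resume sending events after that specific point. If it is not provided, you receive the stream from the beginning.
The following example demonstrates a resilient pattern: attempt to stream the initial create request, and fall back to a get loop if the connection drops.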
Python
import time
from google import genai
client = genai.Client()
# Configuration
agent_name = 'deep-research-pro-preview-12-2025'
prompt = 'Compare golang SDK test frameworks'
# State tracking
last_event_id = None
interaction_id = None
is_complete = False
def process_stream(event_stream):
    """Helper to process events from any stream source."""
    global last_event_id, interaction_id, is_complete
    for event in event_stream:
        # Capture Interaction ID
        if event.event_type == "interaction.start":
            interaction_id = event.interaction.id
            print(f"Interaction started: {interaction_id}")
        # Capture Event ID
        if event.event_id:
            last_event_id = event.event_id
        # Print content
        if event.event_type == "content.delta":
            if event.delta.type == "text":
                print(event.delta.text, end="", flush=True)
            elif event.delta.type == "thought_summary":
                print(f"Thought: {event.delta.content.text}", flush=True)
        # Check completion
        if event.event_type in ['interaction.complete', 'error']:
            is_complete = True

# 1. Attempt initial streaming request
try:
    print("Starting Research...")
    initial_stream = client.interactions.create(
        input=prompt,
        agent=agent_name,
        background=True,
        stream=True,
        agent_config={
            "type": "deep-research",
            "thinking_summaries": "auto"
        }
    )
    process_stream(initial_stream)
except Exception as e:
    print(f"\nInitial connection dropped: {e}")

# 2. Reconnection Loop
# If the code reaches here and is_complete is False, we resume using .get()
while not is_complete and interaction_id:
    print(f"\nConnection lost. Resuming from event {last_event_id}...")
    time.sleep(2)
    try:
        resume_stream = client.interactions.get(
            id=interaction_id,
            stream=True,
            last_event_id=last_event_id
        )
        process_stream(resume_stream)
    except Exception as e:
        print(f"Reconnection failed, retrying... ({e})")
JavaScript
import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

let lastEventId;
let interactionId;
let isComplete = false;

// Helper to handle the event logic
const handleStream = async (stream) => {
    for await (const chunk of stream) {
        if (chunk.event_type === 'interaction.start') {
            interactionId = chunk.interaction.id;
        }
        if (chunk.event_id) lastEventId = chunk.event_id;
        if (chunk.event_type === 'content.delta') {
            if (chunk.delta.type === 'text') {
                process.stdout.write(chunk.delta.text);
            } else if (chunk.delta.type === 'thought_summary') {
                console.log(`Thought: ${chunk.delta.content.text}`);
            }
        } else if (chunk.event_type === 'interaction.complete') {
            isComplete = true;
        }
    }
};

// 1. Start the task with streaming
try {
    const stream = await client.interactions.create({
        input: 'Compare golang SDK test frameworks',
        agent: 'deep-research-pro-preview-12-2025',
        background: true,
        stream: true,
        agent_config: {
            type: 'deep-research',
            thinking_summaries: 'auto'
        }
    });
    await handleStream(stream);
} catch (e) {
    console.log('\nInitial stream interrupted.');
}

// 2. Reconnect Loop
while (!isComplete && interactionId) {
    console.log(`\nReconnecting to interaction ${interactionId} from event ${lastEventId}...`);
    try {
        const stream = await client.interactions.get(interactionId, {
            stream: true,
            last_event_id: lastEventId
        });
        await handleStream(stream);
    } catch (e) {
        console.log('Reconnection failed, retrying in 2s...');
        await new Promise(resolve => setTimeout(resolve, 2000));
    }
}
REST
# 1. Start the research task (Initial Stream)
# Watch for event: interaction.start to get the INTERACTION_ID
# Watch for "event_id" fields to get the LAST_EVENT_ID
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions?alt=sse" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"input": "Compare golang SDK test frameworks",
"agent": "deep-research-pro-preview-12-2025",
"background": true,
"stream": true,
"agent_config": {
"type": "deep-research",
"thinking_summaries": "auto"
}
}'
# ... Connection interrupted ...
# 2. Reconnect (Resume Stream)
# Pass the INTERACTION_ID and the LAST_EVENT_ID you saved.
curl -X GET "https://generativelanguage.googleapis.com/v1beta/interactions/INTERACTION_ID?stream=true&last_event_id=LAST_EVENT_ID&alt=sse" \
-H "x-goog-api-key: $GEMINI_API_KEY"
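The two values needed to resume can be kept in a small state object that is updated as events arrive. The sketch below is illustrative only: StreamState, consume, and flaky_stream are hypothetical names, events are modelled as plain dictionaries rather than SDK objects, and flaky_stream simulates a server and a dropped connection so that the resume logic can be exercised offline.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StreamState:
    """Tracks what is needed to resume a dropped stream."""
    interaction_id: Optional[str] = None
    last_event_id: Optional[str] = None
    is_complete: bool = False
    text: List[str] = field(default_factory=list)

    def consume(self, events):
        """Update state from an iterable of event dicts (illustrative shape)."""
        for event in events:
            if event["event_type"] == "interaction.start":
                self.interaction_id = event["interaction_id"]
            if event.get("event_id"):
                self.last_event_id = event["event_id"]
            if event["event_type"] == "content.delta":
                self.text.append(event["text"])
            if event["event_type"] in ("interaction.complete", "error"):
                self.is_complete = True


# Simulated event log held by a pretend server.
EVENTS = [
    {"event_type": "interaction.start", "event_id": "1", "interaction_id": "abc"},
    {"event_type": "content.delta", "event_id": "2", "text": "Hello, "},
    {"event_type": "content.delta", "event_id": "3", "text": "world"},
    {"event_type": "interaction.complete", "event_id": "4"},
]


def flaky_stream(last_event_id=None, fail_after=None):
    """Yield events after `last_event_id`; optionally drop the connection."""
    start = 0
    if last_event_id is not None:
        start = next(i for i, e in enumerate(EVENTS)
                     if e["event_id"] == last_event_id) + 1
    for sent, event in enumerate(EVENTS[start:]):
        if fail_after is not None and sent == fail_after:
            raise ConnectionError("simulated network drop")
        yield event


state = StreamState()
try:
    state.consume(flaky_stream(fail_after=2))  # drops after two events
except ConnectionError:
    pass
# Resume from the last processed event until the stream completes.
while not state.is_complete and state.interaction_id:
    state.consume(flaky_stream(last_event_id=state.last_event_id))
print("".join(state.text))  # prints "Hello, world"
```

Because last_event_id is recorded only after an event has been received, resuming from it neither repeats nor skips content in this simulation.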
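Follow-up questions and interactions
After the agent returns the final report, you can continue the conversation using previous_interaction_id. This lets you ask for clarification, summarization, or elaboration on specific parts of the research without restarting the entire task.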
Python
import time
from google import genai
client = genai.Client()
interaction = client.interactions.create(
input="Can you elaborate on the second point in the report?",
model="gemini-3-pro-preview",
previous_interaction_id="COMPLETED_INTERACTION_ID"
)
print(interaction.outputs[-1].text)
JavaScript
const interaction = await client.interactions.create({
input: 'Can you elaborate on the second point in the report?',
agent: 'deep-research-pro-preview-12-2025',
previous_interaction_id: 'COMPLETED_INTERACTION_ID'
});
console.log(interaction.outputs[interaction.outputs.length - 1].text);
REST
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"input": "Can you elaborate on the second point in the report?",
"agent": "deep-research-pro-preview-12-2025",
"previous_interaction_id": "COMPLETED_INTERACTION_ID"
}'
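When to use the Gemini Deep Research agent
Deep Research is an agent, not just a model. It is best suited to workloads that call for an "analyst-in-a-box" approach rather than low-latency chat.
| Feature | Standard Gemini models | Gemini Deep Research agent |
|---|---|---|
| Latency | Seconds | Minutes (async/background) |
| Process | Generate -> Output | Plan -> Search -> Read -> Iterate -> Output |
| Output | Conversational text, code, short summaries | Detailed reports, long-form analysis, comparison tables |
| Best for | Chatbots, extraction, creative writing | Market analysis, due diligence, literature reviews, competitive landscaping |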
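Availability and pricing
You can access the Gemini Deep Research agent through the Interactions API in Google AI Studio and the Gemini API.
Pricing follows a pay-as-you-go model based on the underlying Gemini 3 Pro model and the specific tools the agent uses. Unlike a standard chat request, where one request leads to one output, a Deep Research task is an agentic workflow: a single request triggers an autonomous loop of planning, searching, reading, and reasoning.
Estimated costs
Costs vary with the depth of research required. The agent autonomously determines how much reading and searching is needed to answer your prompt.
- Standard research task: For a typical query that requires moderate analysis, the agent might use about 80 search queries, about 250k input tokens (roughly 50-70% cached), and about 60k output tokens.
  - Estimated total: about $2.00 to $3.00 per task
- Complex research task: For deep competitive landscape analysis or extensive due diligence, the agent might use up to about 160 search queries, about 900k input tokens (roughly 50-70% cached), and about 80k output tokens.
  - Estimated total: about $3.00 to $5.00 per task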
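The usage figures above can be turned into a rough per-task estimate once per-unit rates are known. The sketch below is a hedged illustration: estimate_cost is a hypothetical helper, billing searches per 1,000 queries is an assumption, and the rates in the usage example are placeholders chosen for easy arithmetic, not actual Gemini prices. Substitute the currently published rates before relying on the result.

```python
def estimate_cost(input_tokens, cached_fraction, output_tokens, searches,
                  input_rate, cached_rate, output_rate, search_rate):
    """Rough cost of one research task.

    Token rates are per 1M tokens; search_rate is per 1,000 queries.
    All rates are caller-supplied assumptions, not published prices.
    """
    cached = input_tokens * cached_fraction
    uncached = input_tokens - cached
    return (uncached / 1e6 * input_rate
            + cached / 1e6 * cached_rate
            + output_tokens / 1e6 * output_rate
            + searches / 1000 * search_rate)


# Standard task from the list above, with PLACEHOLDER rates (not real prices).
cost = estimate_cost(input_tokens=250_000, cached_fraction=0.6,
                     output_tokens=60_000, searches=80,
                     input_rate=1.0, cached_rate=0.5,
                     output_rate=10.0, search_rate=10.0)
print(f"Estimated cost with placeholder rates: {cost:.3f}")
```

Separating cached from uncached input matters because a large share of input tokens is cached, and cached tokens are typically billed at a different rate.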
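Safety considerations
Giving an agent access to the web and your private files requires careful consideration of the security risks.
- Prompt injection using files: The agent reads the contents of the files you provide. Make sure uploaded documents (PDFs, text files) come from trusted sources. A malicious file could contain hidden text designed to manipulate the agent's output.
- Web content risks: The agent searches the public web. Although we have implemented robust safety filters, the agent might still encounter and process malicious web pages. We recommend reviewing the citations provided in the response to verify the sources.
- Data exfiltration: Be cautious when asking the agent to summarize sensitive internal data if you also allow it to browse the web.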
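Best practices
- Prompt for unknowns: Instruct the agent on how to handle missing data. For example, add "If specific figures for 2025 are not available, explicitly state that they are projections or unavailable rather than estimating" to your prompt.
- Provide context: Ground the agent's research by supplying background information or constraints directly in the input prompt.
- Multimodal inputs: The Deep Research agent supports multimodal inputs. Use them cautiously, because they increase costs and the risk of overflowing the context window.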
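The first two practices can be applied mechanically by composing the prompt from parts. The following is a minimal sketch: build_research_prompt and its parameters are hypothetical, the context string in the usage example is invented for illustration, and the default missing-data instruction reuses the example sentence from the list above.

```python
DEFAULT_MISSING_DATA_POLICY = (
    "If specific figures for 2025 are not available, explicitly state that "
    "they are projections or unavailable rather than estimating."
)


def build_research_prompt(question, context=None, sections=None,
                          missing_data_policy=DEFAULT_MISSING_DATA_POLICY):
    """Compose a research prompt from a question plus optional guidance."""
    parts = [question.strip()]
    if context:
        parts.append("Background and constraints:\n" + context.strip())
    if sections:
        numbered = "\n".join(f"{i}. {name}" for i, name in enumerate(sections, 1))
        parts.append(
            "Format the output as a report with the following structure:\n"
            + numbered
        )
    if missing_data_policy:
        parts.append(missing_data_policy)
    return "\n\n".join(parts)


prompt = build_research_prompt(
    "Research the competitive landscape of EV batteries.",
    context="Focus on manufacturers with announced capacity in Europe.",
    sections=["Executive Summary", "Key Players", "Supply Chain Risks"],
)
print(prompt)
```

The resulting string can be passed as the input argument of client.interactions.create, in the same way as the prompt in the formatting example.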
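Limitations
- Beta status: The Interactions API is in public beta. Features and schemas may change.
- Custom tools: You cannot currently provide custom function calling tools or remote MCP (Model Context Protocol) servers to the Deep Research agent.
- Structured output and plan approval: The Deep Research agent does not currently support human-approved planning or structured outputs.
- Maximum research time: The Deep Research agent has a maximum research time of 60 minutes. Most tasks should complete within 20 minutes.
- Store requirement: Running the agent with background=True requires store=True.
- Google Search: Google Search is enabled by default, and specific restrictions apply to the grounded results.
- Audio inputs: Audio inputs are not supported.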
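Several of these limitations can be checked on the client before a request is sent. The sketch below is illustrative: validate_request is a hypothetical helper that inspects a plain dictionary of request parameters, treating store as enabled unless it is explicitly set to False is an assumption, and accepting only file_search, google_search, and url_context as tool types is an assumption drawn from the tools named in this guide.

```python
ALLOWED_TOOL_TYPES = {"file_search", "google_search", "url_context"}  # assumption


def validate_request(params):
    """Return a list of problems found in a Deep Research request dict."""
    problems = []
    if not params.get("background"):
        problems.append("agents must run with background=True")
    # store is treated as enabled unless explicitly set to False (assumption).
    if params.get("background") and params.get("store", True) is False:
        problems.append("background=True requires store=True")
    for tool in params.get("tools", []):
        if tool.get("type") not in ALLOWED_TOOL_TYPES:
            problems.append(f"unsupported tool type: {tool.get('type')}")
    return problems


# "function" is a made-up tool type standing in for a custom tool.
print(validate_request({
    "input": "Research the history of Google TPUs.",
    "agent": "deep-research-pro-preview-12-2025",
    "background": True,
    "store": False,
    "tools": [{"type": "function"}],
}))
```

Returning a list of problems instead of raising on the first one lets a caller report every issue at once.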
Next steps
- Learn more about the Interactions API.
- Read about the Gemini 3 Pro model that powers this agent.
- Learn how to use your own data with the File Search tool.