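The Gemini Deep Research Agent autonomously plans, executes, and synthesizes multi-step research tasks. Powered by Gemini 3 Pro, it navigates complex information landscapes using web search and your own data, and produces detailed reports with citations.
Research tasks involve iterative searching and reading and can take several minutes to complete. You must use background execution (set background=true) to run the agent asynchronously and poll for results. See Handling long-running tasks for details.
The following example shows how to start a research task in the background and poll for results.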
Python
import time
from google import genai

client = genai.Client()

# Start the research task in the background
interaction = client.interactions.create(
    input="Research the history of Google TPUs.",
    agent='deep-research-pro-preview-12-2025',
    background=True
)

print(f"Research started: {interaction.id}")

# Poll every 10 seconds until the task completes or fails
while True:
    interaction = client.interactions.get(interaction.id)
    if interaction.status == "completed":
        print(interaction.outputs[-1].text)
        break
    elif interaction.status == "failed":
        print(f"Research failed: {interaction.error}")
        break
    time.sleep(10)
JavaScript
import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

// Start the research task in the background
const interaction = await client.interactions.create({
    input: 'Research the history of Google TPUs.',
    agent: 'deep-research-pro-preview-12-2025',
    background: true
});

console.log(`Research started: ${interaction.id}`);

// Poll every 10 seconds until the task completes or fails
while (true) {
    const result = await client.interactions.get(interaction.id);
    if (result.status === 'completed') {
        console.log(result.outputs[result.outputs.length - 1].text);
        break;
    } else if (result.status === 'failed') {
        console.log(`Research failed: ${result.error}`);
        break;
    }
    await new Promise(resolve => setTimeout(resolve, 10000));
}
REST
# 1. Start the research task
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
    -H "Content-Type: application/json" \
    -H "x-goog-api-key: $GEMINI_API_KEY" \
    -d '{
        "input": "Research the history of Google TPUs.",
        "agent": "deep-research-pro-preview-12-2025",
        "background": true
    }'

# 2. Poll for results (Replace INTERACTION_ID)
# curl -X GET "https://generativelanguage.googleapis.com/v1beta/interactions/INTERACTION_ID" \
#     -H "x-goog-api-key: $GEMINI_API_KEY"
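Research with your own data
Deep Research has access to a variety of tools. By default, the agent can reach information on the public internet using the google_search and url_context tools, and you do not need to specify them. To also give the agent access to your own data through the File Search tool, add that tool as shown in the following example.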
Python
import time
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    input="Compare our 2025 fiscal year report against current public web news.",
    agent="deep-research-pro-preview-12-2025",
    background=True,
    tools=[
        {
            "type": "file_search",
            "file_search_store_names": ['fileSearchStores/my-store-name']
        }
    ]
)
JavaScript
import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    input: 'Compare our 2025 fiscal year report against current public web news.',
    agent: 'deep-research-pro-preview-12-2025',
    background: true,
    tools: [
        { type: 'file_search', file_search_store_names: ['fileSearchStores/my-store-name'] },
    ]
});
REST
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
    -H "Content-Type: application/json" \
    -H "x-goog-api-key: $GEMINI_API_KEY" \
    -d '{
        "input": "Compare our 2025 fiscal year report against current public web news.",
        "agent": "deep-research-pro-preview-12-2025",
        "background": true,
        "tools": [
            {"type": "file_search", "file_search_store_names": ["fileSearchStores/my-store-name"]}
        ]
    }'
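If you build the tools list in several places, a small helper can keep the File Search entry consistent. The following is a minimal sketch: file_search_tool is a hypothetical helper name, not part of the SDK, and it assumes that the file_search_store_names list may hold more than one store name, which the examples above do not confirm.

```python
def file_search_tool(store_names):
    """Return a file_search tool entry shaped like the one in the examples above.

    Hypothetical helper: accepts one store name or an iterable of names.
    """
    names = [store_names] if isinstance(store_names, str) else list(store_names)
    if not names:
        raise ValueError("at least one File Search store name is required")
    return {"type": "file_search", "file_search_store_names": names}


# The result can be passed as tools=[...] to client.interactions.create().
tools = [file_search_tool("fileSearchStores/my-store-name")]
print(tools)
```

The helper only builds the dictionary; it does not check that the store exists.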
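Steerability and formatting
You can steer the agent's output by giving specific formatting instructions in the prompt. This lets you organize the report into specific sections and subsections, include data tables, or adjust the tone for different audiences (for example, "technical", "executive", or "casual").
Define the desired output format explicitly in the input text.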
Python
from google import genai

client = genai.Client()

prompt = """
Research the competitive landscape of EV batteries.

Format the output as a technical report with the following structure:
1. Executive Summary
2. Key Players (Must include a data table comparing capacity and chemistry)
3. Supply Chain Risks
"""

interaction = client.interactions.create(
    input=prompt,
    agent="deep-research-pro-preview-12-2025",
    background=True
)
JavaScript
import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const prompt = `
Research the competitive landscape of EV batteries.

Format the output as a technical report with the following structure:
1. Executive Summary
2. Key Players (Must include a data table comparing capacity and chemistry)
3. Supply Chain Risks
`;

const interaction = await client.interactions.create({
    input: prompt,
    agent: 'deep-research-pro-preview-12-2025',
    background: true,
});
REST
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
    -H "Content-Type: application/json" \
    -H "x-goog-api-key: $GEMINI_API_KEY" \
    -d '{
        "input": "Research the competitive landscape of EV batteries.\n\nFormat the output as a technical report with the following structure: \n1. Executive Summary\n2. Key Players (Must include a data table comparing capacity and chemistry)\n3. Supply Chain Risks",
        "agent": "deep-research-pro-preview-12-2025",
        "background": true
    }'
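When the same report layout is reused across topics, the prompt can be assembled from its parts. The following is a minimal sketch: build_report_prompt and its parameters are hypothetical names, and the wording of the generated instructions is only one possible phrasing.

```python
def build_report_prompt(topic, sections, tone=None):
    """Compose a research prompt with an explicit, numbered report structure.

    Hypothetical helper: `tone` is an optional audience label such as
    "technical" or "executive".
    """
    lines = [f"Research {topic}."]
    if tone:
        lines.append(f"Write for a {tone} audience.")
    lines.append("Format the output as a report with the following structure:")
    lines.extend(f"{i}. {title}" for i, title in enumerate(sections, start=1))
    return "\n".join(lines)


prompt = build_report_prompt(
    "the competitive landscape of EV batteries",
    [
        "Executive Summary",
        "Key Players (Must include a data table comparing capacity and chemistry)",
        "Supply Chain Risks",
    ],
    tone="technical",
)
print(prompt)
```

The resulting string can be passed as the input argument of client.interactions.create().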
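Handling long-running tasks
Deep Research is a multi-step process of planning, searching, reading, and writing. This cycle typically exceeds the standard timeout limits of synchronous API calls.
Agents must be run with background=True. The API immediately returns a partial Interaction object. Use its id property to retrieve the interaction when polling. The interaction status transitions from in_progress to completed or failed.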
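The polling loop can be wrapped in a helper that also enforces a deadline. The following is a minimal sketch: wait_for_interaction is a hypothetical helper name, client is assumed to be a genai.Client() as in the other examples, and the 3600-second default mirrors the 60-minute maximum research time listed under Limitations.

```python
import time


def wait_for_interaction(client, interaction_id, poll_seconds=10.0,
                         timeout_seconds=3600.0, sleep=time.sleep,
                         clock=time.monotonic):
    """Poll until the interaction is completed or failed, or the deadline passes.

    Hypothetical helper. `sleep` and `clock` are injectable so the loop can be
    exercised without real waiting.
    """
    deadline = clock() + timeout_seconds
    while True:
        interaction = client.interactions.get(interaction_id)
        if interaction.status in ("completed", "failed"):
            return interaction
        if clock() >= deadline:
            raise TimeoutError(
                f"Interaction {interaction_id} is still {interaction.status!r} "
                f"after {timeout_seconds} seconds"
            )
        sleep(poll_seconds)
```

The caller still has to inspect the returned status, because the helper returns for both completed and failed interactions.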
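Streaming
Deep Research supports streaming so that you can follow research progress in real time. You must set both stream=True and background=True.
The following example shows how to start a research task and process the stream.
Most importantly, it shows how to track the interaction_id from the interaction.start event. You need this ID to resume the stream if the network is interrupted. The code also records the most recent event_id, which lets you resume from the specific point where the connection dropped.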
Python
from google import genai

client = genai.Client()

stream = client.interactions.create(
    input="Research the history of Google TPUs.",
    agent="deep-research-pro-preview-12-2025",
    background=True,
    stream=True,
    agent_config={
        "type": "deep-research",
        "thinking_summaries": "auto"
    }
)

interaction_id = None
last_event_id = None

for chunk in stream:
    # 1. Capture Interaction ID
    if chunk.event_type == "interaction.start":
        interaction_id = chunk.interaction.id
        print(f"Interaction started: {interaction_id}")

    # 2. Track IDs for potential reconnection
    if chunk.event_id:
        last_event_id = chunk.event_id

    # 3. Handle Content
    if chunk.event_type == "content.delta":
        if chunk.delta.type == "text":
            print(chunk.delta.text, end="", flush=True)
        elif chunk.delta.type == "thought_summary":
            print(f"Thought: {chunk.delta.content.text}", flush=True)
    elif chunk.event_type == "interaction.complete":
        print("\nResearch Complete")
JavaScript
const stream = await client.interactions.create({
    input: 'Research the history of Google TPUs.',
    agent: 'deep-research-pro-preview-12-2025',
    background: true,
    stream: true,
    agent_config: {
        type: 'deep-research',
        thinking_summaries: 'auto'
    }
});

let interactionId;
let lastEventId;

for await (const chunk of stream) {
    // 1. Capture Interaction ID
    if (chunk.event_type === 'interaction.start') {
        interactionId = chunk.interaction.id;
        console.log(`Interaction started: ${interactionId}`);
    }

    // 2. Track IDs for potential reconnection
    if (chunk.event_id) lastEventId = chunk.event_id;

    // 3. Handle Content
    if (chunk.event_type === 'content.delta') {
        if (chunk.delta.type === 'text') {
            process.stdout.write(chunk.delta.text);
        } else if (chunk.delta.type === 'thought_summary') {
            console.log(`Thought: ${chunk.delta.content.text}`);
        }
    } else if (chunk.event_type === 'interaction.complete') {
        console.log('\nResearch Complete');
    }
}
REST
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions?alt=sse" \
    -H "Content-Type: application/json" \
    -H "x-goog-api-key: $GEMINI_API_KEY" \
    -d '{
        "input": "Research the history of Google TPUs.",
        "agent": "deep-research-pro-preview-12-2025",
        "background": true,
        "stream": true,
        "agent_config": {
            "type": "deep-research",
            "thinking_summaries": "auto"
        }
    }'
# Note: Look for the 'interaction.start' event to get the interaction ID.
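Reconnecting to the stream
Network interruptions can happen during long-running research tasks. To handle them gracefully, your application should catch connection errors and resume the stream using client.interactions.get().
You must provide two values to resume:
- Interaction ID: obtained from the interaction.start event in the initial stream.
- Last Event ID: the ID of the last successfully processed event. This tells the server to resume sending events after that point. If it is not provided, you receive the stream from the beginning.
The following example shows a resilient pattern: attempt to stream the initial create request, and fall back to a get loop if the connection drops.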
Python
import time
from google import genai

client = genai.Client()

# Configuration
agent_name = 'deep-research-pro-preview-12-2025'
prompt = 'Compare golang SDK test frameworks'

# State tracking
last_event_id = None
interaction_id = None
is_complete = False

def process_stream(event_stream):
    """Helper to process events from any stream source."""
    global last_event_id, interaction_id, is_complete
    for event in event_stream:
        # Capture Interaction ID
        if event.event_type == "interaction.start":
            interaction_id = event.interaction.id
            print(f"Interaction started: {interaction_id}")

        # Capture Event ID
        if event.event_id:
            last_event_id = event.event_id

        # Print content
        if event.event_type == "content.delta":
            if event.delta.type == "text":
                print(event.delta.text, end="", flush=True)
            elif event.delta.type == "thought_summary":
                print(f"Thought: {event.delta.content.text}", flush=True)

        # Check completion
        if event.event_type in ['interaction.complete', 'error']:
            is_complete = True

# 1. Attempt initial streaming request
try:
    print("Starting Research...")
    initial_stream = client.interactions.create(
        input=prompt,
        agent=agent_name,
        background=True,
        stream=True,
        agent_config={
            "type": "deep-research",
            "thinking_summaries": "auto"
        }
    )
    process_stream(initial_stream)
except Exception as e:
    print(f"\nInitial connection dropped: {e}")

# 2. Reconnection Loop
# If the code reaches here and is_complete is False, we resume using .get()
while not is_complete and interaction_id:
    print(f"\nConnection lost. Resuming from event {last_event_id}...")
    time.sleep(2)

    try:
        resume_stream = client.interactions.get(
            id=interaction_id,
            stream=True,
            last_event_id=last_event_id
        )
        process_stream(resume_stream)
    except Exception as e:
        print(f"Reconnection failed, retrying... ({e})")
JavaScript
import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

let lastEventId;
let interactionId;
let isComplete = false;

// Helper to handle the event logic
const handleStream = async (stream) => {
    for await (const chunk of stream) {
        if (chunk.event_type === 'interaction.start') {
            interactionId = chunk.interaction.id;
        }
        if (chunk.event_id) lastEventId = chunk.event_id;

        if (chunk.event_type === 'content.delta') {
            if (chunk.delta.type === 'text') {
                process.stdout.write(chunk.delta.text);
            } else if (chunk.delta.type === 'thought_summary') {
                console.log(`Thought: ${chunk.delta.content.text}`);
            }
        } else if (chunk.event_type === 'interaction.complete' || chunk.event_type === 'error') {
            // Stop on completion or on an error event, matching the Python example
            isComplete = true;
        }
    }
};

// 1. Start the task with streaming
try {
    const stream = await client.interactions.create({
        input: 'Compare golang SDK test frameworks',
        agent: 'deep-research-pro-preview-12-2025',
        background: true,
        stream: true,
        agent_config: {
            type: 'deep-research',
            thinking_summaries: 'auto'
        }
    });
    await handleStream(stream);
} catch (e) {
    console.log('\nInitial stream interrupted.');
}

// 2. Reconnect Loop
while (!isComplete && interactionId) {
    console.log(`\nReconnecting to interaction ${interactionId} from event ${lastEventId}...`);
    try {
        const stream = await client.interactions.get(interactionId, {
            stream: true,
            last_event_id: lastEventId
        });
        await handleStream(stream);
    } catch (e) {
        console.log('Reconnection failed, retrying in 2s...');
        await new Promise(resolve => setTimeout(resolve, 2000));
    }
}
REST
# 1. Start the research task (Initial Stream)
# Watch for event: interaction.start to get the INTERACTION_ID
# Watch for "event_id" fields to get the LAST_EVENT_ID
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions?alt=sse" \
    -H "Content-Type: application/json" \
    -H "x-goog-api-key: $GEMINI_API_KEY" \
    -d '{
        "input": "Compare golang SDK test frameworks",
        "agent": "deep-research-pro-preview-12-2025",
        "background": true,
        "stream": true,
        "agent_config": {
            "type": "deep-research",
            "thinking_summaries": "auto"
        }
    }'

# ... Connection interrupted ...

# 2. Reconnect (Resume Stream)
# Pass the INTERACTION_ID and the LAST_EVENT_ID you saved.
curl -X GET "https://generativelanguage.googleapis.com/v1beta/interactions/INTERACTION_ID?stream=true&last_event_id=LAST_EVENT_ID&alt=sse" \
    -H "x-goog-api-key: $GEMINI_API_KEY"
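The two values needed to resume (the interaction ID and the last event ID) can also be kept in a small object instead of module-level globals. The following is a minimal sketch: StreamState and resume_kwargs are hypothetical names, and the event fields are assumed to match those used in the Python example in this section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StreamState:
    """Tracks what is needed to resume a stream (hypothetical helper)."""
    interaction_id: Optional[str] = None
    last_event_id: Optional[str] = None
    is_complete: bool = False

    def update(self, event):
        """Record IDs and completion from one streamed event."""
        if event.event_type == "interaction.start":
            self.interaction_id = event.interaction.id
        if getattr(event, "event_id", None):
            self.last_event_id = event.event_id
        if event.event_type in ("interaction.complete", "error"):
            self.is_complete = True

    def resume_kwargs(self):
        """Keyword arguments for client.interactions.get() when resuming."""
        if self.interaction_id is None:
            raise RuntimeError("no interaction.start event seen; nothing to resume")
        kwargs = {"id": self.interaction_id, "stream": True}
        # Without last_event_id the server replays the stream from the beginning.
        if self.last_event_id is not None:
            kwargs["last_event_id"] = self.last_event_id
        return kwargs
```

A reconnect loop would call state.update(event) for every event and resume with client.interactions.get(**state.resume_kwargs()) while state.is_complete is False.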
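Follow-up questions and interactions
After the agent returns the final report, you can continue the conversation using previous_interaction_id. This lets you ask for clarification, a summary, or more detail on specific parts of the research without restarting the whole task.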
Python
import time
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    input="Can you elaborate on the second point in the report?",
    model="gemini-3-pro-preview",
    previous_interaction_id="COMPLETED_INTERACTION_ID"
)

print(interaction.outputs[-1].text)
JavaScript
const interaction = await client.interactions.create({
    input: 'Can you elaborate on the second point in the report?',
    agent: 'deep-research-pro-preview-12-2025',
    previous_interaction_id: 'COMPLETED_INTERACTION_ID'
});

// JavaScript arrays do not support negative indices, so read the last output explicitly
console.log(interaction.outputs[interaction.outputs.length - 1].text);
REST
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
    -H "Content-Type: application/json" \
    -H "x-goog-api-key: $GEMINI_API_KEY" \
    -d '{
        "input": "Can you elaborate on the second point in the report?",
        "agent": "deep-research-pro-preview-12-2025",
        "previous_interaction_id": "COMPLETED_INTERACTION_ID"
    }'
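When to use the Gemini Deep Research Agent
Deep Research is an agent, not just a model. It is best suited to workloads that call for an "analyst-in-a-box" approach rather than low-latency chat.
| Feature | Standard Gemini models | Gemini Deep Research Agent |
|---|---|---|
| Latency | Seconds | Minutes (asynchronous/background) |
| Process | Generate -> Output | Plan -> Search -> Read -> Iterate -> Output |
| Output | Conversational text, code, short summaries | Detailed reports, long-form analysis, comparison tables |
| Best for | Chatbots, extraction, creative writing | Market analysis, due diligence, literature reviews, competitive landscaping |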
Availability and pricing
- Availability: accessible through the Interactions API in Google AI Studio and the Gemini API.
- Pricing: see the pricing page for specific rates and details.
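Safety considerations
Giving an agent access to the web and to your private files requires careful consideration of the security risks.
- Prompt injection through files: The agent reads the contents of the files you provide. Make sure uploaded documents (PDFs, text files) come from trusted sources. A malicious file may contain hidden text designed to manipulate the agent's output.
- Web content risks: The agent searches the public web. Although robust safety filters are in place, the agent may still encounter and process malicious web pages. Review the citations provided in the response to verify sources.
- Data exfiltration: Be cautious when asking the agent to summarize sensitive internal data if you also allow it to browse the web.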
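Best practices
- Prompt for unknowns: Tell the agent how to handle missing data. For example, add "If specific figures for 2025 are not available, state explicitly whether they are projections or unavailable rather than estimating" to your prompt.
- Provide context: Ground the agent's research by supplying background information or constraints directly in the input prompt.
- Multimodal inputs: The Deep Research Agent supports multimodal inputs. Use them cautiously, because they increase cost and the risk of overflowing the context window.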
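Limitations
- Beta status: The Interactions API is in public beta. Features and schemas may change.
- Custom tools: You cannot currently provide custom function calling tools or remote MCP (Model Context Protocol) servers to the Deep Research agent.
- Structured output and plan approval: The Deep Research Agent does not currently support human-approved planning or structured outputs.
- Max research time: The Deep Research agent has a maximum research time of 60 minutes. Most tasks should complete within 20 minutes.
- Store requirement: Running the agent with background=True requires store=True.
- Google Search: Google Search is enabled by default, and specific restrictions apply to the grounded results.
- Audio inputs: Audio inputs are not supported.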
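The store requirement can be checked before a request is sent. The following is a minimal sketch: background_request is a hypothetical helper name, and passing store as an explicit keyword next to background is an assumption, since the examples on this page never set it.

```python
def background_request(input_text, agent, store=True, **extra):
    """Build keyword arguments for a background agent run (hypothetical helper).

    Raises ValueError when store is not True, because background=True
    requires store=True.
    """
    if store is not True:
        raise ValueError("background=True requires store=True")
    return {
        "input": input_text,
        "agent": agent,
        "background": True,
        "store": True,  # assumption: passed explicitly alongside background
        **extra,
    }


request = background_request(
    "Research the history of Google TPUs.",
    "deep-research-pro-preview-12-2025",
)
print(request)
```

The resulting dictionary could be unpacked into client.interactions.create(**request).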
Next steps
- Learn more about the Interactions API.
- Read about the Gemini 3 Pro model that powers this agent.
- Learn how to use your own data with the File Search tool.