Gemini Deep Research Agent

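The Gemini Deep Research Agent autonomously plans, executes, and synthesizes multi-step research tasks. Powered by Gemini 3 Pro, it navigates complex information landscapes using web search and your own data, and produces detailed reports with citations.

Research tasks involve iterative searching and reading and can take several minutes to complete. You must use background execution (set background=true) to run the agent asynchronously and poll for results. For details, see Handling long-running tasks.

The following example shows how to start a research task in the background and poll for results.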

Python

import time
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    input="Research the history of Google TPUs.",
    agent='deep-research-pro-preview-12-2025',
    background=True
)

print(f"Research started: {interaction.id}")

while True:
    interaction = client.interactions.get(interaction.id)
    if interaction.status == "completed":
        print(interaction.outputs[-1].text)
        break
    elif interaction.status == "failed":
        print(f"Research failed: {interaction.error}")
        break
    time.sleep(10)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    input: 'Research the history of Google TPUs.',
    agent: 'deep-research-pro-preview-12-2025',
    background: true
});

console.log(`Research started: ${interaction.id}`);

while (true) {
    const result = await client.interactions.get(interaction.id);
    if (result.status === 'completed') {
        console.log(result.outputs[result.outputs.length - 1].text);
        break;
    } else if (result.status === 'failed') {
        console.log(`Research failed: ${result.error}`);
        break;
    }
    await new Promise(resolve => setTimeout(resolve, 10000));
}

REST

# 1. Start the research task
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Research the history of Google TPUs.",
    "agent": "deep-research-pro-preview-12-2025",
    "background": true
}'

# 2. Poll for results (Replace INTERACTION_ID)
# curl -X GET "https://generativelanguage.googleapis.com/v1beta/interactions/INTERACTION_ID" \
# -H "x-goog-api-key: $GEMINI_API_KEY"

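Research with your own data

Deep Research has access to a variety of tools. By default, the agent uses the google_search and url_context tools to access information on the public internet, and you don't need to specify them. To also give the agent access to your own data through the File Search tool, add it as shown in the following example.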

Python

import time
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    input="Compare our 2025 fiscal year report against current public web news.",
    agent="deep-research-pro-preview-12-2025",
    background=True,
    tools=[
        {
            "type": "file_search",
            "file_search_store_names": ['fileSearchStores/my-store-name']
        }
    ]
)

JavaScript

const interaction = await client.interactions.create({
    input: 'Compare our 2025 fiscal year report against current public web news.',
    agent: 'deep-research-pro-preview-12-2025',
    background: true,
    tools: [
        { type: 'file_search', file_search_store_names: ['fileSearchStores/my-store-name'] },
    ]
});

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Compare our 2025 fiscal year report against current public web news.",
    "agent": "deep-research-pro-preview-12-2025",
    "background": true,
    "tools": [
        {"type": "file_search", "file_search_store_names": ["fileSearchStores/my-store-name"]}
    ]
}'
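
The only difference between a web-only request and one that also searches your own data is the tools entry. The following is a minimal sketch of that choice. build_research_request is a hypothetical helper, not part of the SDK, and it only assembles the keyword arguments already shown in the examples above.

```python
from typing import Optional, Sequence

# Agent ID used throughout the examples above.
AGENT_NAME = "deep-research-pro-preview-12-2025"


def build_research_request(prompt: str,
                           store_names: Optional[Sequence[str]] = None) -> dict:
    """Return keyword arguments for client.interactions.create().

    Hypothetical helper: it is not part of the google-genai SDK.
    """
    request = {"input": prompt, "agent": AGENT_NAME, "background": True}
    if store_names:
        # Web access is on by default and needs no tool entry, so a tool is
        # added only when there is private data to search.
        request["tools"] = [{
            "type": "file_search",
            "file_search_store_names": list(store_names),
        }]
    return request


web_only = build_research_request("Research the history of Google TPUs.")
with_files = build_research_request(
    "Compare our 2025 fiscal year report against current public web news.",
    ["fileSearchStores/my-store-name"],
)
```

Either dictionary can then be passed on as client.interactions.create(**with_files).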

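Steerability and formatting

You can steer the agent's output by providing specific formatting instructions in your prompt. This lets you structure the report into specific sections and subsections, include data tables, or adjust the tone for different audiences (for example, "technical", "executive", or "casual").

Define the desired output format explicitly in your input text.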

Python

prompt = """
Research the competitive landscape of EV batteries.

Format the output as a technical report with the following structure:
1. Executive Summary
2. Key Players (Must include a data table comparing capacity and chemistry)
3. Supply Chain Risks
"""

interaction = client.interactions.create(
    input=prompt,
    agent="deep-research-pro-preview-12-2025",
    background=True
)

JavaScript

const prompt = `
Research the competitive landscape of EV batteries.

Format the output as a technical report with the following structure:
1. Executive Summary
2. Key Players (Must include a data table comparing capacity and chemistry)
3. Supply Chain Risks
`;

const interaction = await client.interactions.create({
    input: prompt,
    agent: 'deep-research-pro-preview-12-2025',
    background: true,
});

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Research the competitive landscape of EV batteries.\n\nFormat the output as a technical report with the following structure: \n1. Executive Summary\n2. Key Players (Must include a data table comparing capacity and chemistry)\n3. Supply Chain Risks",
    "agent": "deep-research-pro-preview-12-2025",
    "background": true
}'
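
When the same report layout is reused across topics, the prompt can be assembled programmatically. The following is a minimal sketch. build_report_prompt is a hypothetical helper, not part of the SDK, and with its default audience it reproduces the prompt text used in the examples above.

```python
def build_report_prompt(topic: str, sections: list, audience: str = "technical") -> str:
    """Build a research prompt that asks for a numbered report structure."""
    # Pick the article so that "an executive report" reads correctly.
    article = "an" if audience[:1].lower() in "aeiou" else "a"
    lines = [
        f"Research {topic}.",
        "",
        f"Format the output as {article} {audience} report with the following structure:",
    ]
    # Number the sections starting at 1, one per line.
    lines += [f"{number}. {title}" for number, title in enumerate(sections, start=1)]
    return "\n".join(lines)


prompt = build_report_prompt(
    "the competitive landscape of EV batteries",
    [
        "Executive Summary",
        "Key Players (Must include a data table comparing capacity and chemistry)",
        "Supply Chain Risks",
    ],
)
```

The resulting string is passed as the input argument, exactly like the hand-written prompt.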

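Handling long-running tasks

Deep Research is a multi-step process involving planning, searching, reading, and writing. This cycle typically exceeds the standard timeout limits of synchronous API calls.

Agents are required to use background=True. The API returns a partial Interaction object immediately. Use its id property to retrieve the interaction when polling. The interaction status transitions from in_progress to completed or failed.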
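
The status transitions above can be wrapped in a polling helper with a client-side deadline. The following is a minimal sketch. wait_for_interaction is a hypothetical helper, not part of the SDK. It accepts any callable that takes an interaction ID, such as client.interactions.get. The default deadline of 3600 seconds is an assumption that mirrors the 60-minute maximum research time listed under Limitations.

```python
import time


def wait_for_interaction(get_interaction, interaction_id, poll_seconds=10.0,
                         timeout_seconds=3600.0, sleep=time.sleep,
                         clock=time.monotonic):
    """Poll until the interaction is 'completed' or 'failed'.

    get_interaction is any callable that takes an interaction ID and returns
    an object with a .status attribute. Raises TimeoutError if the deadline
    passes while the interaction is still running.
    """
    deadline = clock() + timeout_seconds
    while True:
        interaction = get_interaction(interaction_id)
        if interaction.status in ("completed", "failed"):
            return interaction  # terminal state: hand the result to the caller
        if clock() >= deadline:
            raise TimeoutError(
                f"Interaction {interaction_id} is still {interaction.status}"
            )
        sleep(poll_seconds)
```

A call such as wait_for_interaction(client.interactions.get, interaction.id) returns the final Interaction object, and the caller then checks its status to tell a completed report from a failure. The sleep and clock parameters exist only so the loop can be exercised without real delays.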

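Streaming

Deep Research supports streaming so that you can follow research progress in real time. You must set both stream=True and background=True.

The following example shows how to start a research task and process the stream. Crucially, it demonstrates how to track the interaction_id from the interaction.start event. You need this ID to resume the stream if a network interruption occurs. The code also records each event's event_id, which lets you resume from the specific point where the connection dropped.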

Python

stream = client.interactions.create(
    input="Research the history of Google TPUs.",
    agent="deep-research-pro-preview-12-2025",
    background=True,
    stream=True,
    agent_config={
        "type": "deep-research",
        "thinking_summaries": "auto"
    }
)

interaction_id = None
last_event_id = None

for chunk in stream:
    if chunk.event_type == "interaction.start":
        interaction_id = chunk.interaction.id
        print(f"Interaction started: {interaction_id}")

    if chunk.event_id:
        last_event_id = chunk.event_id

    if chunk.event_type == "content.delta":
        if chunk.delta.type == "text":
            print(chunk.delta.text, end="", flush=True)
        elif chunk.delta.type == "thought_summary":
            print(f"Thought: {chunk.delta.content.text}", flush=True)

    elif chunk.event_type == "interaction.complete":
        print("\nResearch Complete")

JavaScript

const stream = await client.interactions.create({
    input: 'Research the history of Google TPUs.',
    agent: 'deep-research-pro-preview-12-2025',
    background: true,
    stream: true,
    agent_config: {
        type: 'deep-research',
        thinking_summaries: 'auto'
    }
});

let interactionId;
let lastEventId;

for await (const chunk of stream) {
    // 1. Capture Interaction ID
    if (chunk.event_type === 'interaction.start') {
        interactionId = chunk.interaction.id;
        console.log(`Interaction started: ${interactionId}`);
    }

    // 2. Track IDs for potential reconnection
    if (chunk.event_id) lastEventId = chunk.event_id;

    // 3. Handle Content
    if (chunk.event_type === 'content.delta') {
        if (chunk.delta.type === 'text') {
            process.stdout.write(chunk.delta.text);
        } else if (chunk.delta.type === 'thought_summary') {
            console.log(`Thought: ${chunk.delta.content.text}`);
        }
    } else if (chunk.event_type === 'interaction.complete') {
        console.log('\nResearch Complete');
    }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions?alt=sse" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Research the history of Google TPUs.",
    "agent": "deep-research-pro-preview-12-2025",
    "background": true,
    "stream": true,
    "agent_config": {
        "type": "deep-research",
        "thinking_summaries": "auto"
    }
}'
# Note: Look for the 'interaction.start' event to get the interaction ID.
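
If you want to keep the report text rather than print it as it arrives, the same event handling can be folded into a small state object. The following is a minimal sketch. StreamState and apply_event are hypothetical names, not part of the SDK. The event fields they read (event_type, event_id, interaction.id, delta.type, delta.text, delta.content.text) are the ones used in the examples above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StreamState:
    """Everything worth remembering from a research stream."""
    interaction_id: Optional[str] = None
    last_event_id: Optional[str] = None
    text_parts: list = field(default_factory=list)
    thoughts: list = field(default_factory=list)
    complete: bool = False

    @property
    def report(self) -> str:
        # Text deltas arrive in order, so joining them rebuilds the report.
        return "".join(self.text_parts)


def apply_event(state: StreamState, event) -> StreamState:
    """Fold one stream event into the state and return it."""
    if event.event_type == "interaction.start":
        state.interaction_id = event.interaction.id
    if getattr(event, "event_id", None):
        state.last_event_id = event.event_id
    if event.event_type == "content.delta":
        if event.delta.type == "text":
            state.text_parts.append(event.delta.text)
        elif event.delta.type == "thought_summary":
            state.thoughts.append(event.delta.content.text)
    elif event.event_type == "interaction.complete":
        state.complete = True
    return state
```

In the streaming loop this becomes state = StreamState() followed by apply_event(state, chunk) for each chunk, after which state.report holds the accumulated text.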

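Reconnecting to the stream

Network interruptions can occur during long-running research tasks. To handle them gracefully, your application should catch connection errors and resume the stream using client.interactions.get().

You must provide two values to resume:

  1. Interaction ID: Obtained from the interaction.start event in the initial stream.
  2. Last event ID: The ID of the last successfully processed event. This tells the server to resume sending events after that specific point. If you omit it, you receive the stream from the beginning.

The following example demonstrates a resilient pattern: it attempts to stream the initial create request and falls back to a get loop if the connection drops.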

Python

import time
from google import genai

client = genai.Client()

# Configuration
agent_name = 'deep-research-pro-preview-12-2025'
prompt = 'Compare golang SDK test frameworks'

# State tracking
last_event_id = None
interaction_id = None
is_complete = False

def process_stream(event_stream):
    """Helper to process events from any stream source."""
    global last_event_id, interaction_id, is_complete
    for event in event_stream:
        # Capture Interaction ID
        if event.event_type == "interaction.start":
            interaction_id = event.interaction.id
            print(f"Interaction started: {interaction_id}")

        # Capture Event ID
        if event.event_id:
            last_event_id = event.event_id

        # Print content
        if event.event_type == "content.delta":
            if event.delta.type == "text":
                print(event.delta.text, end="", flush=True)
            elif event.delta.type == "thought_summary":
                print(f"Thought: {event.delta.content.text}", flush=True)

        # Check completion
        if event.event_type in ['interaction.complete', 'error']:
            is_complete = True

# 1. Attempt initial streaming request
try:
    print("Starting Research...")
    initial_stream = client.interactions.create(
        input=prompt,
        agent=agent_name,
        background=True,
        stream=True,
        agent_config={
            "type": "deep-research",
            "thinking_summaries": "auto"
        }
    )
    process_stream(initial_stream)
except Exception as e:
    print(f"\nInitial connection dropped: {e}")

# 2. Reconnection Loop
# If the code reaches here and is_complete is False, we resume using .get()
while not is_complete and interaction_id:
    print(f"\nConnection lost. Resuming from event {last_event_id}...")
    time.sleep(2) 

    try:
        resume_stream = client.interactions.get(
            id=interaction_id,
            stream=True,
            last_event_id=last_event_id
        )
        process_stream(resume_stream)
    except Exception as e:
        print(f"Reconnection failed, retrying... ({e})")

JavaScript

let lastEventId;
let interactionId;
let isComplete = false;

// Helper to handle the event logic
const handleStream = async (stream) => {
    for await (const chunk of stream) {
        if (chunk.event_type === 'interaction.start') {
            interactionId = chunk.interaction.id;
        }
        if (chunk.event_id) lastEventId = chunk.event_id;

        if (chunk.event_type === 'content.delta') {
            if (chunk.delta.type === 'text') {
                process.stdout.write(chunk.delta.text);
            } else if (chunk.delta.type === 'thought_summary') {
                console.log(`Thought: ${chunk.delta.content.text}`);
            }
        } else if (chunk.event_type === 'interaction.complete') {
            isComplete = true;
        }
    }
};

// 1. Start the task with streaming
try {
    const stream = await client.interactions.create({
        input: 'Compare golang SDK test frameworks',
        agent: 'deep-research-pro-preview-12-2025',
        background: true,
        stream: true,
        agent_config: {
            type: 'deep-research',
            thinking_summaries: 'auto'
        }
    });
    await handleStream(stream);
} catch (e) {
    console.log('\nInitial stream interrupted.');
}

// 2. Reconnect Loop
while (!isComplete && interactionId) {
    console.log(`\nReconnecting to interaction ${interactionId} from event ${lastEventId}...`);
    try {
        const stream = await client.interactions.get(interactionId, {
            stream: true,
            last_event_id: lastEventId
        });
        await handleStream(stream);
    } catch (e) {
        console.log('Reconnection failed, retrying in 2s...');
        await new Promise(resolve => setTimeout(resolve, 2000));
    }
}

REST

# 1. Start the research task (Initial Stream)
# Watch for event: interaction.start to get the INTERACTION_ID
# Watch for "event_id" fields to get the LAST_EVENT_ID
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions?alt=sse" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Compare golang SDK test frameworks",
    "agent": "deep-research-pro-preview-12-2025",
    "background": true,
    "stream": true,
    "agent_config": {
        "type": "deep-research",
        "thinking_summaries": "auto"
    }
}'

# ... Connection interrupted ...

# 2. Reconnect (Resume Stream)
# Pass the INTERACTION_ID and the LAST_EVENT_ID you saved.
curl -X GET "https://generativelanguage.googleapis.com/v1beta/interactions/INTERACTION_ID?stream=true&last_event_id=LAST_EVENT_ID&alt=sse" \
-H "x-goog-api-key: $GEMINI_API_KEY"
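
The reconnect loops above retry indefinitely at a fixed interval. A bounded variant with exponential backoff might look like the following minimal sketch. stream_with_resume and its parameters are hypothetical, not part of the SDK. The exception types in retry_on are an assumption, since the examples above catch every exception, so adjust them to whatever your client raises on a dropped connection.

```python
import time


def stream_with_resume(start_stream, resume_stream, on_event, max_retries=5,
                       base_delay=2.0, sleep=time.sleep,
                       retry_on=(ConnectionError, TimeoutError)):
    """Consume a research stream, resuming after dropped connections.

    start_stream():                      returns the initial event iterator.
    resume_stream(interaction_id, last): returns an iterator resumed after `last`.
    on_event(event):                     called once for every received event.
    Returns True once 'interaction.complete' arrives. Returns False on an
    'error' event, when retries run out, or when no interaction ID was seen.
    """
    interaction_id = None
    last_event_id = None
    failures = 0
    open_stream = start_stream
    while True:
        try:
            for event in open_stream():
                failures = 0  # progress was made, so reset the backoff
                if event.event_type == "interaction.start":
                    interaction_id = event.interaction.id
                if getattr(event, "event_id", None):
                    last_event_id = event.event_id
                on_event(event)
                if event.event_type == "interaction.complete":
                    return True
                if event.event_type == "error":
                    return False
        except retry_on:
            pass  # fall through to the retry logic below
        if interaction_id is None:
            return False  # nothing to resume without an interaction ID
        failures += 1
        if failures > max_retries:
            return False
        sleep(base_delay * 2 ** (failures - 1))  # 2s, 4s, 8s, ...
        open_stream = lambda: resume_stream(interaction_id, last_event_id)
```

With the SDK, start_stream would wrap the streaming client.interactions.create(...) call, and resume_stream would wrap client.interactions.get(id=..., stream=True, last_event_id=...), both as shown in the Python example above. A stream that ends without a completion event is treated like a dropped connection and resumed.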

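Follow-up questions and interactions

After the agent returns the final report, you can continue the conversation by using previous_interaction_id. This lets you ask for clarification, summarization, or elaboration on specific sections of the research without restarting the entire task.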

Python

import time
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    input="Can you elaborate on the second point in the report?",
    model="gemini-3-pro-preview",
    previous_interaction_id="COMPLETED_INTERACTION_ID"
)

print(interaction.outputs[-1].text)

JavaScript

const interaction = await client.interactions.create({
    input: 'Can you elaborate on the second point in the report?',
    agent: 'deep-research-pro-preview-12-2025',
    previous_interaction_id: 'COMPLETED_INTERACTION_ID'
});
console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Can you elaborate on the second point in the report?",
    "agent": "deep-research-pro-preview-12-2025",
    "previous_interaction_id": "COMPLETED_INTERACTION_ID"
}'

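When to use the Gemini Deep Research Agent

Deep Research is an agent, not just a model. It is best suited to workloads that call for an "analyst-in-a-box" approach rather than low-latency chat.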

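Feature | Standard Gemini models | Gemini Deep Research Agent
Latency | | Minutes (async/background)
Process | Generate -> Output | Plan -> Search -> Read -> Iterate -> Output
Output | Conversational text, code, short summaries | Detailed reports, long-form analysis, comparative tables
Best for | Chatbots, extraction, creative writing | Market analysis, due diligence, literature reviews, competitive landscaping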

Availability and pricing

  • Availability: Accessible through the Interactions API in Google AI Studio and the Gemini API.
  • Pricing: See the pricing page for specific rates and details.

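Safety considerations

Giving an agent access to the web and to your private files requires careful consideration of the security risks.

  • Prompt injection using files: The agent reads the contents of the files you provide. Make sure that uploaded documents (PDFs, text files) come from trusted sources. A malicious file could contain hidden text designed to manipulate the agent's output.
  • Web content risks: The agent searches the public web. Although we implement robust safety filters, the agent may still encounter and process malicious web pages. Review the citations provided in the response to verify the sources.
  • Data exfiltration: Be cautious when asking the agent to summarize sensitive internal data if you also allow it to browse the web.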

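Best practices

  • Prompt for unknowns: Instruct the agent how to handle missing data. For example, add "If specific figures for 2025 are not available, explicitly state whether they are projections or unavailable rather than estimating" to your prompt.
  • Provide context: Ground the agent's research by supplying background information or constraints directly in the input prompt.
  • Multimodal inputs: Deep Research Agent supports multimodal inputs. Use them cautiously, because they increase costs and the risk of context window overflow.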
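
The first two practices can be applied mechanically when prompts are generated in code. The following is a minimal sketch. compose_prompt is a hypothetical helper, and the "Background", "Task", and "Constraints" labels are an assumed layout, not a format the agent requires.

```python
# Instruction for missing data, taken from the best practice above.
MISSING_DATA_RULE = (
    "If specific figures for 2025 are not available, explicitly state whether "
    "they are projections or unavailable rather than estimating."
)


def compose_prompt(task: str, context: str = "", constraints=()) -> str:
    """Combine optional background, the task, and constraints into one prompt."""
    parts = []
    if context:
        parts.append(f"Background:\n{context}")
    parts.append(f"Task:\n{task}")
    # The missing-data rule is always appended after the caller's constraints.
    rules = list(constraints) + [MISSING_DATA_RULE]
    parts.append("Constraints:\n" + "\n".join(f"- {rule}" for rule in rules))
    return "\n\n".join(parts)


prompt = compose_prompt(
    "Research the competitive landscape of EV batteries.",
    context="The report is for a supplier that only sells in Europe.",
    constraints=["Cite a source for every figure."],
)
```

The returned string is passed as the input argument when you create the interaction.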

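Limitations

  • Beta status: The Interactions API is in public beta. Features and schemas may change.
  • Custom tools: You cannot currently provide custom function calling tools or remote MCP (Model Context Protocol) servers to the Deep Research agent.
  • Structured output and plan approval: The Deep Research Agent does not currently support human-approved planning or structured outputs.
  • Maximum research time: The Deep Research agent has a maximum research time of 60 minutes. Most tasks should complete within 20 minutes.
  • Store requirement: Running the agent with background=True requires store=True.
  • Google Search: Google Search is enabled by default, and specific restrictions apply to the grounded results.
  • Audio inputs: Audio inputs are not supported.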
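
Some of these limits can be checked on the client before a request is sent. The following is a minimal sketch. check_request is a hypothetical helper, not part of the SDK. It assumes that store is enabled unless it is explicitly set to False, and that the only accepted tool types are the three named on this page.

```python
# Tool types mentioned on this page; anything else is treated as unsupported.
SUPPORTED_TOOL_TYPES = {"google_search", "url_context", "file_search"}


def check_request(request: dict) -> list:
    """Return a list of problems found in create() keyword arguments."""
    problems = []
    if not request.get("background"):
        problems.append("agents must run with background=True")
    if request.get("store") is False:
        problems.append("background=True requires store=True")
    for tool in request.get("tools", []):
        tool_type = tool.get("type")
        if tool_type not in SUPPORTED_TOOL_TYPES:
            problems.append(f"unsupported tool type: {tool_type!r}")
    return problems
```

An empty list means that none of the checked limits is violated. It does not guarantee that the API accepts the request.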

Next steps