Gemini Deep Research Agent

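The Gemini Deep Research agent autonomously plans, executes, and synthesizes multi-step research tasks. Powered by Gemini 3 Pro, it navigates complex information landscapes using web search and your own data, and produces detailed reports with full citations.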

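Research tasks involve iterative searching and reading and can take several minutes to complete. You must use background execution (set background=true) to run the agent asynchronously and poll for results. See "Handling long-running tasks" for details.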

The following example shows how to start a research task in the background and poll for results.

Python

import time
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    input="Research the history of Google TPUs.",
    agent='deep-research-pro-preview-12-2025',
    background=True
)

print(f"Research started: {interaction.id}")

while True:
    interaction = client.interactions.get(interaction.id)
    if interaction.status == "completed":
        print(interaction.outputs[-1].text)
        break
    elif interaction.status == "failed":
        print(f"Research failed: {interaction.error}")
        break
    time.sleep(10)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    input: 'Research the history of Google TPUs.',
    agent: 'deep-research-pro-preview-12-2025',
    background: true
});

console.log(`Research started: ${interaction.id}`);

while (true) {
    const result = await client.interactions.get(interaction.id);
    if (result.status === 'completed') {
        console.log(result.outputs[result.outputs.length - 1].text);
        break;
    } else if (result.status === 'failed') {
        console.log(`Research failed: ${result.error}`);
        break;
    }
    await new Promise(resolve => setTimeout(resolve, 10000));
}

REST

# 1. Start the research task
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Research the history of Google TPUs.",
    "agent": "deep-research-pro-preview-12-2025",
    "background": true
}'

# 2. Poll for results (Replace INTERACTION_ID)
# curl -X GET "https://generativelanguage.googleapis.com/v1beta/interactions/INTERACTION_ID" \
# -H "x-goog-api-key: $GEMINI_API_KEY"

Research with your own data

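Deep Research has access to a variety of tools. By default, the agent can reach information on the public internet using the google_search and url_context tools, so you do not need to specify them. However, to give the agent access to your own data through the File Search tool, you must add that tool as shown in the following example.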

Python

import time
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    input="Compare our 2025 fiscal year report against current public web news.",
    agent="deep-research-pro-preview-12-2025",
    background=True,
    tools=[
        {
            "type": "file_search",
            "file_search_store_names": ['fileSearchStores/my-store-name']
        }
    ]
)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    input: 'Compare our 2025 fiscal year report against current public web news.',
    agent: 'deep-research-pro-preview-12-2025',
    background: true,
    tools: [
        { type: 'file_search', file_search_store_names: ['fileSearchStores/my-store-name'] },
    ]
});

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Compare our 2025 fiscal year report against current public web news.",
    "agent": "deep-research-pro-preview-12-2025",
    "background": true,
    "tools": [
        {"type": "file_search", "file_search_store_names": ["fileSearchStores/my-store-name"]}
    ]
}'
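
The request shape used in the samples above can also be assembled programmatically, which avoids hand-editing JSON. The following is a minimal sketch, not part of the SDK: the helper names file_search_tool and research_request are hypothetical, and only the field names and values already shown in this section are used.

```python
import json

DEFAULT_AGENT = "deep-research-pro-preview-12-2025"


def file_search_tool(store_names):
    """Build the File Search tool entry from one or more store names."""
    names = [name for name in store_names if name]
    if not names:
        raise ValueError("provide at least one File Search store name")
    return {"type": "file_search", "file_search_store_names": names}


def research_request(prompt, store_names=(), agent=DEFAULT_AGENT):
    """Build the arguments for a background research request.

    The default web tools are left out on purpose: per this guide they are
    enabled without being listed. File Search is added only when stores are given.
    """
    request = {"input": prompt, "agent": agent, "background": True}
    if store_names:
        request["tools"] = [file_search_tool(store_names)]
    return request


request = research_request(
    "Compare our 2025 fiscal year report against current public web news.",
    store_names=["fileSearchStores/my-store-name"],
)
# json.dumps always emits valid JSON, so the same dict can feed the REST body.
print(json.dumps(request, indent=4))
```

With the Python SDK, the same dictionary could be passed as client.interactions.create(**request).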

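Steerability and formatting

You can steer the agent's output by providing specific formatting instructions in your prompt. For example, you can structure the report into specific sections and subsections, include data tables, or adjust the tone for different audiences (such as "technical", "executive", or "casual").

State the desired output format explicitly in your input text.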

Python

from google import genai

client = genai.Client()

prompt = """
Research the competitive landscape of EV batteries.

Format the output as a technical report with the following structure:
1. Executive Summary
2. Key Players (Must include a data table comparing capacity and chemistry)
3. Supply Chain Risks
"""

interaction = client.interactions.create(
    input=prompt,
    agent="deep-research-pro-preview-12-2025",
    background=True
)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const prompt = `
Research the competitive landscape of EV batteries.

Format the output as a technical report with the following structure:
1. Executive Summary
2. Key Players (Must include a data table comparing capacity and chemistry)
3. Supply Chain Risks
`;

const interaction = await client.interactions.create({
    input: prompt,
    agent: 'deep-research-pro-preview-12-2025',
    background: true,
});

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Research the competitive landscape of EV batteries.\n\nFormat the output as a technical report with the following structure: \n1. Executive Summary\n2. Key Players (Must include a data table comparing capacity and chemistry)\n3. Supply Chain Risks",
    "agent": "deep-research-pro-preview-12-2025",
    "background": true
}'
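
When the report structure varies between requests, the formatting instructions can be generated rather than typed by hand. The sketch below is one possible approach under stated assumptions: build_report_prompt and its parameters are hypothetical names, and the wording of the instruction simply mirrors the prompt used in the samples above.

```python
def build_report_prompt(topic, sections, report_kind="a technical report"):
    """Compose a research prompt that spells out the desired report structure."""
    if not sections:
        raise ValueError("at least one section is required")
    lines = [
        topic.strip(),
        "",
        f"Format the output as {report_kind} with the following structure:",
    ]
    # Number the sections so the agent sees an explicit outline.
    lines.extend(f"{i}. {title}" for i, title in enumerate(sections, start=1))
    return "\n".join(lines)


prompt = build_report_prompt(
    "Research the competitive landscape of EV batteries.",
    [
        "Executive Summary",
        "Key Players (Must include a data table comparing capacity and chemistry)",
        "Supply Chain Risks",
    ],
)
print(prompt)
```

The resulting string is passed as the input argument, exactly like the hand-written prompt.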

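Handling long-running tasks

Deep Research is a multi-step process involving planning, searching, reading, and writing. This cycle typically exceeds the standard timeout limits of synchronous API calls.

Agents must use background=True. The API immediately returns a partial Interaction object. Use its id property to retrieve the interaction when polling. The interaction status transitions from in_progress to either completed or failed.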
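
The polling pattern described here can be wrapped in a helper that also enforces a client-side deadline. This is a minimal sketch, not part of the SDK: wait_for_interaction and its parameters are hypothetical, and the 3600-second default is an assumption chosen to mirror the 60-minute research limit listed under Limitations. It relies only on client.interactions.get() and the status values named in this guide.

```python
import time

TERMINAL_STATUSES = {"completed", "failed"}  # statuses named in this guide


def wait_for_interaction(client, interaction_id, poll_seconds=10.0,
                         timeout_seconds=3600.0, sleep=time.sleep,
                         clock=time.monotonic):
    """Poll until the interaction is terminal or the deadline passes.

    `client` is any object exposing interactions.get(id), such as
    genai.Client(). The timeout is a local guard, not an API parameter.
    `sleep` and `clock` are injectable so the loop can be tested offline.
    """
    deadline = clock() + timeout_seconds
    while True:
        interaction = client.interactions.get(interaction_id)
        if interaction.status in TERMINAL_STATUSES:
            return interaction
        if clock() >= deadline:
            raise TimeoutError(
                f"Interaction {interaction_id} still {interaction.status!r} "
                f"after {timeout_seconds} s"
            )
        sleep(poll_seconds)
```

A caller would start the task with background=True, pass the returned id to wait_for_interaction, and then check whether the final status is completed or failed before reading the outputs.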

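Streaming

Deep Research supports streaming so that you can receive real-time updates on research progress. You must set both stream=True and background=True.

The following example shows how to start a research task and process the stream. Crucially, it demonstrates how to capture the interaction_id from the interaction.start event. You need this ID to resume the stream if the network is interrupted. The code also introduces an event_id variable (last_event_id), which lets you resume from the specific point where the connection dropped.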

Python

from google import genai

client = genai.Client()

stream = client.interactions.create(
    input="Research the history of Google TPUs.",
    agent="deep-research-pro-preview-12-2025",
    background=True,
    stream=True,
    agent_config={
        "type": "deep-research",
        "thinking_summaries": "auto"
    }
)

interaction_id = None
last_event_id = None

for chunk in stream:
    if chunk.event_type == "interaction.start":
        interaction_id = chunk.interaction.id
        print(f"Interaction started: {interaction_id}")

    if chunk.event_id:
        last_event_id = chunk.event_id

    if chunk.event_type == "content.delta":
        if chunk.delta.type == "text":
            print(chunk.delta.text, end="", flush=True)
        elif chunk.delta.type == "thought_summary":
            print(f"Thought: {chunk.delta.content.text}", flush=True)

    elif chunk.event_type == "interaction.complete":
        print("\nResearch Complete")

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const stream = await client.interactions.create({
    input: 'Research the history of Google TPUs.',
    agent: 'deep-research-pro-preview-12-2025',
    background: true,
    stream: true,
    agent_config: {
        type: 'deep-research',
        thinking_summaries: 'auto'
    }
});

let interactionId;
let lastEventId;

for await (const chunk of stream) {
    // 1. Capture Interaction ID
    if (chunk.event_type === 'interaction.start') {
        interactionId = chunk.interaction.id;
        console.log(`Interaction started: ${interactionId}`);
    }

    // 2. Track IDs for potential reconnection
    if (chunk.event_id) lastEventId = chunk.event_id;

    // 3. Handle Content
    if (chunk.event_type === 'content.delta') {
        if (chunk.delta.type === 'text') {
            process.stdout.write(chunk.delta.text);
        } else if (chunk.delta.type === 'thought_summary') {
            console.log(`Thought: ${chunk.delta.content.text}`);
        }
    } else if (chunk.event_type === 'interaction.complete') {
        console.log('\nResearch Complete');
    }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions?alt=sse" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Research the history of Google TPUs.",
    "agent": "deep-research-pro-preview-12-2025",
    "background": true,
    "stream": true,
    "agent_config": {
        "type": "deep-research",
        "thinking_summaries": "auto"
    }
}'
# Note: Look for the 'interaction.start' event to get the interaction ID.
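
The event handling in the samples above prints as it goes. If the application needs the accumulated report text and the resume bookmarks afterwards, the same logic can be written as a small state reducer. This is a sketch under stated assumptions: StreamState, apply_event, and consume are hypothetical names, and the event fields (event_type, event_id, interaction.id, delta.type, delta.text, delta.content.text) are taken from the samples in this section rather than from a schema reference.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StreamState:
    """Everything needed to display progress and to resume after a drop."""
    interaction_id: Optional[str] = None
    last_event_id: Optional[str] = None
    text_parts: List[str] = field(default_factory=list)
    thoughts: List[str] = field(default_factory=list)
    is_complete: bool = False

    @property
    def text(self) -> str:
        return "".join(self.text_parts)


def apply_event(state, event):
    """Fold one streamed event into `state` and return it."""
    if event.event_type == "interaction.start":
        state.interaction_id = event.interaction.id
    # Not every event is guaranteed to carry an ID, so read it defensively.
    event_id = getattr(event, "event_id", None)
    if event_id:
        state.last_event_id = event_id
    if event.event_type == "content.delta":
        if event.delta.type == "text":
            state.text_parts.append(event.delta.text)
        elif event.delta.type == "thought_summary":
            state.thoughts.append(event.delta.content.text)
    elif event.event_type in ("interaction.complete", "error"):
        state.is_complete = True
    return state


def consume(stream, state=None):
    """Apply every event from `stream`; reuse `state` when resuming."""
    state = state if state is not None else StreamState()
    for event in stream:
        apply_event(state, event)
    return state
```

Keeping the state in one object means a resumed stream can be fed into consume with the same state, so text received before and after a disconnect ends up in one report.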

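Reconnecting to stream

Network interruptions can occur during long-running research tasks. To handle them gracefully, your application should catch connection errors and resume the stream using client.interactions.get().

To resume, you must provide two values:

  1. Interaction ID: obtained from the interaction.start event in the initial stream.
  2. Last event ID: the ID of the last successfully processed event. This tells the server to resume sending events after that specific point. If you omit it, the server replays the stream from the beginning.

The following example demonstrates a resilient pattern: it attempts to stream the initial create request and falls back to a get loop if the connection drops.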

Python

import time
from google import genai

client = genai.Client()

# Configuration
agent_name = 'deep-research-pro-preview-12-2025'
prompt = 'Compare golang SDK test frameworks'

# State tracking
last_event_id = None
interaction_id = None
is_complete = False

def process_stream(event_stream):
    """Helper to process events from any stream source."""
    global last_event_id, interaction_id, is_complete
    for event in event_stream:
        # Capture Interaction ID
        if event.event_type == "interaction.start":
            interaction_id = event.interaction.id
            print(f"Interaction started: {interaction_id}")

        # Capture Event ID
        if event.event_id:
            last_event_id = event.event_id

        # Print content
        if event.event_type == "content.delta":
            if event.delta.type == "text":
                print(event.delta.text, end="", flush=True)
            elif event.delta.type == "thought_summary":
                print(f"Thought: {event.delta.content.text}", flush=True)

        # Check completion
        if event.event_type in ['interaction.complete', 'error']:
            is_complete = True

# 1. Attempt initial streaming request
try:
    print("Starting Research...")
    initial_stream = client.interactions.create(
        input=prompt,
        agent=agent_name,
        background=True,
        stream=True,
        agent_config={
            "type": "deep-research",
            "thinking_summaries": "auto"
        }
    )
    process_stream(initial_stream)
except Exception as e:
    print(f"\nInitial connection dropped: {e}")

# 2. Reconnection Loop
# If the code reaches here and is_complete is False, we resume using .get()
while not is_complete and interaction_id:
    print(f"\nConnection lost. Resuming from event {last_event_id}...")
    time.sleep(2) 

    try:
        resume_stream = client.interactions.get(
            id=interaction_id,
            stream=True,
            last_event_id=last_event_id
        )
        process_stream(resume_stream)
    except Exception as e:
        print(f"Reconnection failed, retrying... ({e})")

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

let lastEventId;
let interactionId;
let isComplete = false;

// Helper to handle the event logic
const handleStream = async (stream) => {
    for await (const chunk of stream) {
        if (chunk.event_type === 'interaction.start') {
            interactionId = chunk.interaction.id;
        }
        if (chunk.event_id) lastEventId = chunk.event_id;

        if (chunk.event_type === 'content.delta') {
            if (chunk.delta.type === 'text') {
                process.stdout.write(chunk.delta.text);
            } else if (chunk.delta.type === 'thought_summary') {
                console.log(`Thought: ${chunk.delta.content.text}`);
            }
        } else if (chunk.event_type === 'interaction.complete' || chunk.event_type === 'error') {
            // Treat 'error' as terminal too, matching the Python sample.
            isComplete = true;
        }
    }
};

// 1. Start the task with streaming
try {
    const stream = await client.interactions.create({
        input: 'Compare golang SDK test frameworks',
        agent: 'deep-research-pro-preview-12-2025',
        background: true,
        stream: true,
        agent_config: {
            type: 'deep-research',
            thinking_summaries: 'auto'
        }
    });
    await handleStream(stream);
} catch (e) {
    console.log('\nInitial stream interrupted.');
}

// 2. Reconnect Loop
while (!isComplete && interactionId) {
    console.log(`\nReconnecting to interaction ${interactionId} from event ${lastEventId}...`);
    try {
        const stream = await client.interactions.get(interactionId, {
            stream: true,
            last_event_id: lastEventId
        });
        await handleStream(stream);
    } catch (e) {
        console.log('Reconnection failed, retrying in 2s...');
        await new Promise(resolve => setTimeout(resolve, 2000));
    }
}

REST

# 1. Start the research task (Initial Stream)
# Watch for event: interaction.start to get the INTERACTION_ID
# Watch for "event_id" fields to get the LAST_EVENT_ID
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions?alt=sse" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Compare golang SDK test frameworks",
    "agent": "deep-research-pro-preview-12-2025",
    "background": true,
    "stream": true,
    "agent_config": {
        "type": "deep-research",
        "thinking_summaries": "auto"
    }
}'

# ... Connection interrupted ...

# 2. Reconnect (Resume Stream)
# Pass the INTERACTION_ID and the LAST_EVENT_ID you saved.
curl -X GET "https://generativelanguage.googleapis.com/v1beta/interactions/INTERACTION_ID?stream=true&last_event_id=LAST_EVENT_ID&alt=sse" \
-H "x-goog-api-key: $GEMINI_API_KEY"
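
For clients that call the REST endpoint directly, the resume URL can be built from the two saved values. The helper below is a small sketch: build_resume_url is a hypothetical name, and the query parameters (stream, last_event_id, alt=sse) are exactly those shown in the curl command above. When no event ID has been saved yet, the parameter is omitted, in which case the stream is replayed from the beginning as described earlier.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://generativelanguage.googleapis.com/v1beta/interactions"


def build_resume_url(interaction_id, last_event_id=None):
    """Return the GET URL that resumes a stream for `interaction_id`."""
    params = {"stream": "true"}
    if last_event_id:
        # Ask the server to continue after the last event we processed.
        params["last_event_id"] = last_event_id
    params["alt"] = "sse"
    # Percent-encode the ID so it cannot alter the path or query string.
    return f"{BASE_URL}/{quote(interaction_id, safe='')}?{urlencode(params)}"


print(build_resume_url("INTERACTION_ID", "LAST_EVENT_ID"))
```

The request still needs the x-goog-api-key header, as in the curl example.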

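Follow-up questions and interactions

After the agent returns the final report, you can continue the conversation by passing previous_interaction_id. This lets you ask for clarification, a summary, or more detail on specific parts of the research without restarting the entire task.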

Python

import time
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    input="Can you elaborate on the second point in the report?",
    model="gemini-3-pro-preview",
    previous_interaction_id="COMPLETED_INTERACTION_ID"
)

print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    input: 'Can you elaborate on the second point in the report?',
    agent: 'deep-research-pro-preview-12-2025',
    previous_interaction_id: 'COMPLETED_INTERACTION_ID'
});
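// JavaScript arrays do not support negative indexes, so index from the length.
console.log(interaction.outputs[interaction.outputs.length - 1].text);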

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Can you elaborate on the second point in the report?",
    "agent": "deep-research-pro-preview-12-2025",
    "previous_interaction_id": "COMPLETED_INTERACTION_ID"
}'
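
A follow-up call can be wrapped so that the caller only supplies the completed interaction ID and the question. This is a minimal sketch with assumptions labeled: ask_follow_up is a hypothetical helper, and it passes model="gemini-3-pro-preview" as the Python sample above does (the JavaScript and REST samples pass agent instead). It also assumes that the last entry in outputs carries the answer text, as in the samples.

```python
def ask_follow_up(client, completed_interaction_id, question,
                  model="gemini-3-pro-preview"):
    """Ask a question about a finished report and return the answer text.

    `client` is any object exposing interactions.create(...), such as
    genai.Client(). Returns None if the response carries no outputs.
    """
    interaction = client.interactions.create(
        input=question,
        model=model,
        previous_interaction_id=completed_interaction_id,
    )
    outputs = interaction.outputs or []
    return outputs[-1].text if outputs else None
```

For example, ask_follow_up(genai.Client(), "COMPLETED_INTERACTION_ID", "Can you elaborate on the second point in the report?") would issue the same request as the Python sample.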

When to use Gemini Deep Research Agent

Deep Research is an agent, not just a model. It is best suited for workloads that call for an on-demand analyst rather than low-latency chat.

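Feature | Standard Gemini models | Gemini Deep Research Agent
Latency | Seconds | Minutes (async/background)
Process | Generate -> Output | Plan -> Search -> Read -> Iterate -> Output
Output | Conversational text, code, short summaries | Detailed reports, long-form analysis, comparison tables
Best for | Chatbots, extraction, creative writing | Market analysis, due diligence, literature reviews, competitive landscaping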

Availability and pricing

  • Availability: Accessible through the Interactions API in Google AI Studio and the Gemini API.
  • Pricing: See the pricing page for specific rates and details.

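Safety considerations

Giving an agent access to the web and to your private files requires careful consideration of the security risks.

  • Prompt injection through files: The agent reads the contents of the files you provide. Make sure uploaded documents (PDFs, text files) come from trusted sources. A malicious file could contain hidden text designed to manipulate the agent's output.
  • Web content risks: The agent searches the public web. Although robust safety filters are in place, the agent may still encounter and process malicious web pages. Review the sources listed in the citations of the response to verify the information.
  • Data exfiltration: Be cautious when asking the agent to summarize sensitive internal data if you also allow it to browse the web.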

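Best practices

  • Prompt for unknowns: Tell the agent how to handle missing data. For example, add "If specific figures for 2025 are not available, explicitly state they are projections or unavailable rather than estimating" to your prompt.
  • Provide context: Ground the agent's research by supplying background information or constraints directly in the input prompt.
  • Multimodal inputs: The Deep Research agent supports multimodal inputs. Use them cautiously, because they increase cost and risk overflowing the context window.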
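
The first two practices above can be applied mechanically by wrapping every research question before it is sent. The following is a sketch only: with_guidance is a hypothetical helper, the "Background and constraints:" label is an arbitrary choice for this example, and the default policy sentence is the one quoted in the list above.

```python
UNKNOWNS_POLICY = (
    "If specific figures for 2025 are not available, explicitly state they are "
    "projections or unavailable rather than estimating."
)


def with_guidance(question, context=None, unknowns_policy=UNKNOWNS_POLICY):
    """Return `question` with grounding context first and a missing-data rule last."""
    parts = []
    if context:
        parts.append(f"Background and constraints:\n{context.strip()}")
    parts.append(question.strip())
    if unknowns_policy:
        parts.append(unknowns_policy.strip())
    # Blank lines keep the three parts visually distinct in the prompt.
    return "\n\n".join(parts)


print(with_guidance(
    "Research the competitive landscape of EV batteries.",
    context="Focus on passenger vehicles. Exclude grid storage.",
))
```

The returned string is used as the input argument of the research request.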

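Limitations

  • Beta status: The Interactions API is in public beta. Features and schemas may change.
  • Custom tools: You cannot currently provide custom function calling tools or remote MCP (Model Context Protocol) servers to the Deep Research agent.
  • Structured output and plan approval: The Deep Research agent does not currently support human-approved planning or structured outputs.
  • Maximum research time: The Deep Research agent has a maximum research time of 60 minutes. Most tasks should complete within 20 minutes.
  • Store requirement: Running the agent with background=True requires store=True.
  • Google Search: Google Search is enabled by default, and specific restrictions apply to the grounded results.
  • Audio inputs: Audio inputs are not supported.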
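
Some of these limitations can be caught locally before a request is sent. The checker below is a rough sketch, not an official validator: check_request is a hypothetical name, the set of allowed tool types is limited to the three tools named in this guide, and it assumes that store only violates the requirement when it is explicitly set to False (the default value is not stated here).

```python
# Tool types mentioned in this guide; anything else is treated as a custom tool.
BUILT_IN_TOOL_TYPES = {"google_search", "url_context", "file_search"}


def check_request(request):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if not request.get("background"):
        problems.append("agents must run with background=True")
    # Assumption: only an explicit store=False conflicts with background=True.
    if request.get("background") and request.get("store") is False:
        problems.append("background=True requires store=True")
    for tool in request.get("tools", []):
        tool_type = tool.get("type")
        if tool_type not in BUILT_IN_TOOL_TYPES:
            problems.append(f"unsupported tool type: {tool_type!r}")
    return problems


print(check_request({
    "input": "Research the history of Google TPUs.",
    "agent": "deep-research-pro-preview-12-2025",
    "background": True,
    "store": False,
}))
```

Because the API is in beta, a check like this should be treated as a convenience that may lag behind the service, not as a substitute for handling errors returned by the API.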

Next steps