Gemini Deep Research Agent

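The Gemini Deep Research Agent autonomously plans, executes, and synthesizes multi-step research tasks. Powered by Gemini 3 Pro, it navigates complex information landscapes using web search and your own data to produce detailed, fully cited reports.

Research tasks involve iterative searching and reading and can take several minutes to complete. You must use background execution (set background=true) to run the agent asynchronously and poll for results. See "Handling long-running tasks" for details.

The following example shows how to start a research task in the background and poll for results.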

Python

import time
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    input="Research the history of Google TPUs.",
    agent='deep-research-pro-preview-12-2025',
    background=True
)

print(f"Research started: {interaction.id}")

while True:
    interaction = client.interactions.get(interaction.id)
    if interaction.status == "completed":
        print(interaction.outputs[-1].text)
        break
    elif interaction.status == "failed":
        print(f"Research failed: {interaction.error}")
        break
    time.sleep(10)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    input: 'Research the history of Google TPUs.',
    agent: 'deep-research-pro-preview-12-2025',
    background: true
});

console.log(`Research started: ${interaction.id}`);

while (true) {
    const result = await client.interactions.get(interaction.id);
    if (result.status === 'completed') {
        console.log(result.outputs[result.outputs.length - 1].text);
        break;
    } else if (result.status === 'failed') {
        console.log(`Research failed: ${result.error}`);
        break;
    }
    await new Promise(resolve => setTimeout(resolve, 10000));
}

REST

# 1. Start the research task
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Research the history of Google TPUs.",
    "agent": "deep-research-pro-preview-12-2025",
    "background": true
}'

# 2. Poll for results (Replace INTERACTION_ID)
# curl -X GET "https://generativelanguage.googleapis.com/v1beta/interactions/INTERACTION_ID" \
# -H "x-goog-api-key: $GEMINI_API_KEY"

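Research with your own data

Deep Research has access to a variety of tools. By default, the agent can reach information on the public internet using the google_search and url_context tools. You don't need to specify these tools yourself. However, if you also want to give the agent access to your own data through the File Search tool, you need to add it as shown in the following example.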

Python

import time
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    input="Compare our 2025 fiscal year report against current public web news.",
    agent="deep-research-pro-preview-12-2025",
    background=True,
    tools=[
        {
            "type": "file_search",
            "file_search_store_names": ['fileSearchStores/my-store-name']
        }
    ]
)

JavaScript

const interaction = await client.interactions.create({
    input: 'Compare our 2025 fiscal year report against current public web news.',
    agent: 'deep-research-pro-preview-12-2025',
    background: true,
    tools: [
        { type: 'file_search', file_search_store_names: ['fileSearchStores/my-store-name'] },
    ]
});

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Compare our 2025 fiscal year report against current public web news.",
    "agent": "deep-research-pro-preview-12-2025",
    "background": true,
    "tools": [
        {"type": "file_search", "file_search_store_names": ["fileSearchStores/my-store-name"]}
    ]
}'
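When the store names come from configuration rather than being hard-coded, the tools list can be assembled programmatically. The following is a minimal sketch: build_file_search_tools is a hypothetical helper, not part of the SDK, and it only reproduces the dictionary shape used in the examples above.

```python
def build_file_search_tools(store_names):
    """Return a `tools` list with a single file_search entry.

    Hypothetical helper: the dictionary shape mirrors the file_search
    examples in this section and nothing more.
    """
    # Drop empty entries so a blank config value does not reach the API.
    names = [name for name in store_names if name]
    if not names:
        raise ValueError("at least one File Search store name is required")
    return [{"type": "file_search", "file_search_store_names": names}]


tools = build_file_search_tools(["fileSearchStores/my-store-name"])
print(tools)
```

The returned list can be passed as the tools argument of client.interactions.create.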

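Steerability and formatting

You can steer the agent's output by providing specific formatting instructions in your prompt. This lets you structure the report into specific sections and subsections, include data tables, or adjust the tone for different audiences (for example, "technical", "executive", or "casual").

Define the desired output format explicitly in your input text.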

Python

prompt = """
Research the competitive landscape of EV batteries.

Format the output as a technical report with the following structure:
1. Executive Summary
2. Key Players (Must include a data table comparing capacity and chemistry)
3. Supply Chain Risks
"""

interaction = client.interactions.create(
    input=prompt,
    agent="deep-research-pro-preview-12-2025",
    background=True
)

JavaScript

const prompt = `
Research the competitive landscape of EV batteries.

Format the output as a technical report with the following structure:
1. Executive Summary
2. Key Players (Must include a data table comparing capacity and chemistry)
3. Supply Chain Risks
`;

const interaction = await client.interactions.create({
    input: prompt,
    agent: 'deep-research-pro-preview-12-2025',
    background: true,
});

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Research the competitive landscape of EV batteries.\n\nFormat the output as a technical report with the following structure: \n1. Executive Summary\n2. Key Players (Must include a data table comparing capacity and chemistry)\n3. Supply Chain Risks",
    "agent": "deep-research-pro-preview-12-2025",
    "background": true
}'
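If you generate many reports with different outlines, the formatting instructions can be composed from a list of section titles. The sketch below assumes nothing about the API: build_report_prompt is a hypothetical helper that only produces a prompt string in the same shape as the examples above.

```python
def build_report_prompt(topic, sections, tone="technical"):
    """Compose a research prompt with an explicit, numbered report outline."""
    if not sections:
        raise ValueError("sections must not be empty")
    # Number the sections starting at 1, one per line.
    outline = "\n".join(
        f"{number}. {title}" for number, title in enumerate(sections, start=1)
    )
    return (
        f"Research {topic}.\n\n"
        f"Format the output as a {tone} report with the following structure:\n"
        f"{outline}"
    )


prompt = build_report_prompt(
    "the competitive landscape of EV batteries",
    [
        "Executive Summary",
        "Key Players (Must include a data table comparing capacity and chemistry)",
        "Supply Chain Risks",
    ],
)
print(prompt)
```

The resulting string is passed as the input argument, exactly as in the Python example for this section.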

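Multimodal inputs

Deep Research supports multimodal inputs, including images, PDFs, audio, and video, so the agent can analyze rich content and then conduct web research grounded in the inputs you provide. For example, you can supply a photograph and ask the agent to identify its subjects, research their behavior, or find related information.

The following example demonstrates an image analysis request that uses an image URL.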

Python

import time
from google import genai

client = genai.Client()

prompt = '''Analyze the interspecies dynamics and behavioral risks present
in the provided image of the African watering hole. Specifically, investigate
the symbiotic relationship between the avian species and the pachyderms
shown, and conduct a risk assessment for the reticulated giraffes based on
their drinking posture relative to the specific predator visible in the
foreground.'''

interaction = client.interactions.create(
    input=[
        {"type": "text", "text": prompt},
        {
            "type": "image",
            "uri": "https://storage.googleapis.com/generativeai-downloads/images/generated_elephants_giraffes_zebras_sunset.jpg"
        }
    ],
    agent="deep-research-pro-preview-12-2025",
    background=True
)

print(f"Research started: {interaction.id}")

while True:
    interaction = client.interactions.get(interaction.id)
    if interaction.status == "completed":
        print(interaction.outputs[-1].text)
        break
    elif interaction.status == "failed":
        print(f"Research failed: {interaction.error}")
        break
    time.sleep(10)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const prompt = `Analyze the interspecies dynamics and behavioral risks present
in the provided image of the African watering hole. Specifically, investigate
the symbiotic relationship between the avian species and the pachyderms
shown, and conduct a risk assessment for the reticulated giraffes based on
their drinking posture relative to the specific predator visible in the
foreground.`;

const interaction = await client.interactions.create({
    input: [
        { type: 'text', text: prompt },
        {
            type: 'image',
            uri: 'https://storage.googleapis.com/generativeai-downloads/images/generated_elephants_giraffes_zebras_sunset.jpg'
        }
    ],
    agent: 'deep-research-pro-preview-12-2025',
    background: true
});

console.log(`Research started: ${interaction.id}`);

while (true) {
    const result = await client.interactions.get(interaction.id);
    if (result.status === 'completed') {
        console.log(result.outputs[result.outputs.length - 1].text);
        break;
    } else if (result.status === 'failed') {
        console.log(`Research failed: ${result.error}`);
        break;
    }
    await new Promise(resolve => setTimeout(resolve, 10000));
}

REST

# 1. Start the research task with image input
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": [
        {"type": "text", "text": "Analyze the interspecies dynamics and behavioral risks present in the provided image of the African watering hole. Specifically, investigate the symbiotic relationship between the avian species and the pachyderms shown, and conduct a risk assessment for the reticulated giraffes based on their drinking posture relative to the specific predator visible in the foreground."},
        {"type": "image", "uri": "https://storage.googleapis.com/generativeai-downloads/images/generated_elephants_giraffes_zebras_sunset.jpg"}
    ],
    "agent": "deep-research-pro-preview-12-2025",
    "background": true
}'

# 2. Poll for results (Replace INTERACTION_ID)
# curl -X GET "https://generativelanguage.googleapis.com/v1beta/interactions/INTERACTION_ID" \
# -H "x-goog-api-key: $GEMINI_API_KEY"
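The input list in the examples above is one text part followed by an image part. A small, hypothetical helper (build_multimodal_input is not an SDK function) can assemble that list for several image URLs. Only the image part shape shown in this section is used; part shapes for other media types are not illustrated here and are deliberately left out.

```python
def build_multimodal_input(prompt, image_uris):
    """Build an `input` list: one text part, then one image part per URI."""
    parts = [{"type": "text", "text": prompt}]
    for uri in image_uris:
        parts.append({"type": "image", "uri": uri})
    return parts


parts = build_multimodal_input(
    "Identify the animals in this image and research their behavior.",
    [
        "https://storage.googleapis.com/generativeai-downloads/images/"
        "generated_elephants_giraffes_zebras_sunset.jpg"
    ],
)
print(len(parts))
```

Each additional image adds to the context the agent has to process, so keep the list short.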

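Handling long-running tasks

Deep Research is a multi-step process involving planning, searching, reading, and writing. This cycle typically exceeds the standard timeout limits of synchronous API calls.

Agents are required to use background=True. The API immediately returns a partial Interaction object. You can use its id property to retrieve the interaction when polling. The interaction status transitions from in_progress to completed or failed.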
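The polling loops in the earlier examples run until the status changes, with no upper bound. A minimal sketch of a bounded variant follows. wait_for_interaction is a hypothetical helper; it accepts any callable that returns an object with a status attribute (for example client.interactions.get), and its default timeout of 3600 seconds is chosen to match the 60-minute research limit listed under Limitations. The sleep and clock parameters exist only so the loop can be exercised without real waiting.

```python
import time
from types import SimpleNamespace


def wait_for_interaction(get_interaction, interaction_id,
                         poll_seconds=10.0, timeout_seconds=3600.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll until the interaction is completed or failed, or the deadline passes."""
    deadline = clock() + timeout_seconds
    while True:
        interaction = get_interaction(interaction_id)
        if interaction.status in ("completed", "failed"):
            return interaction
        if clock() >= deadline:
            raise TimeoutError(
                f"interaction {interaction_id} still {interaction.status!r} "
                f"after {timeout_seconds} seconds"
            )
        sleep(poll_seconds)


# Demo with a fake getter that reports in_progress twice, then completed.
statuses = iter(["in_progress", "in_progress", "completed"])
fake_get = lambda _id: SimpleNamespace(status=next(statuses))
result = wait_for_interaction(fake_get, "demo-id", sleep=lambda seconds: None)
print(result.status)  # completed
```

With a real client, the call would look like wait_for_interaction(client.interactions.get, interaction.id), after which the caller checks result.status before reading result.outputs.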

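Streaming

Deep Research supports streaming so you can receive real-time updates on research progress. You must set both stream=True and background=True.

The following example shows how to start a research task and process the stream. Crucially, it demonstrates how to track the interaction_id from the interaction.start event. You need this ID to resume the stream if a network interruption occurs. The code also records each event's event_id in a last_event_id variable, so you can resume from the specific point where the connection dropped.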

Python

stream = client.interactions.create(
    input="Research the history of Google TPUs.",
    agent="deep-research-pro-preview-12-2025",
    background=True,
    stream=True,
    agent_config={
        "type": "deep-research",
        "thinking_summaries": "auto"
    }
)

interaction_id = None
last_event_id = None

for chunk in stream:
    if chunk.event_type == "interaction.start":
        interaction_id = chunk.interaction.id
        print(f"Interaction started: {interaction_id}")

    if chunk.event_id:
        last_event_id = chunk.event_id

    if chunk.event_type == "content.delta":
        if chunk.delta.type == "text":
            print(chunk.delta.text, end="", flush=True)
        elif chunk.delta.type == "thought_summary":
            print(f"Thought: {chunk.delta.content.text}", flush=True)

    elif chunk.event_type == "interaction.complete":
        print("\nResearch Complete")

JavaScript

const stream = await client.interactions.create({
    input: 'Research the history of Google TPUs.',
    agent: 'deep-research-pro-preview-12-2025',
    background: true,
    stream: true,
    agent_config: {
        type: 'deep-research',
        thinking_summaries: 'auto'
    }
});

let interactionId;
let lastEventId;

for await (const chunk of stream) {
    // 1. Capture Interaction ID
    if (chunk.event_type === 'interaction.start') {
        interactionId = chunk.interaction.id;
        console.log(`Interaction started: ${interactionId}`);
    }

    // 2. Track IDs for potential reconnection
    if (chunk.event_id) lastEventId = chunk.event_id;

    // 3. Handle Content
    if (chunk.event_type === 'content.delta') {
        if (chunk.delta.type === 'text') {
            process.stdout.write(chunk.delta.text);
        } else if (chunk.delta.type === 'thought_summary') {
            console.log(`Thought: ${chunk.delta.content.text}`);
        }
    } else if (chunk.event_type === 'interaction.complete') {
        console.log('\nResearch Complete');
    }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions?alt=sse" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Research the history of Google TPUs.",
    "agent": "deep-research-pro-preview-12-2025",
    "background": true,
    "stream": true,
    "agent_config": {
        "type": "deep-research",
        "thinking_summaries": "auto"
    }
}'
# Note: Look for the 'interaction.start' event to get the interaction ID.
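The event handling above can also be collected into a small state object, which keeps the report text and the two IDs needed for resuming in one place. This is a sketch only: StreamState is a hypothetical class, and the fake events in the demo are plain objects that carry the same attribute names used by the examples above (event_type, event_id, interaction.id, delta.type, delta.text, delta.content.text).

```python
from dataclasses import dataclass, field
from types import SimpleNamespace
from typing import List, Optional


@dataclass
class StreamState:
    """Accumulates what a client needs to show progress and to resume."""
    interaction_id: Optional[str] = None
    last_event_id: Optional[str] = None
    report_parts: List[str] = field(default_factory=list)
    thoughts: List[str] = field(default_factory=list)
    complete: bool = False

    def handle(self, event):
        if event.event_type == "interaction.start":
            self.interaction_id = event.interaction.id
        # Remember the most recent event ID for a possible reconnection.
        if getattr(event, "event_id", None):
            self.last_event_id = event.event_id
        if event.event_type == "content.delta":
            if event.delta.type == "text":
                self.report_parts.append(event.delta.text)
            elif event.delta.type == "thought_summary":
                self.thoughts.append(event.delta.content.text)
        elif event.event_type == "interaction.complete":
            self.complete = True

    @property
    def report(self):
        return "".join(self.report_parts)


# Demo with fake events shaped like the ones handled above.
NS = SimpleNamespace
events = [
    NS(event_type="interaction.start", event_id="e1", interaction=NS(id="abc")),
    NS(event_type="content.delta", event_id="e2",
       delta=NS(type="thought_summary", content=NS(text="Planning searches"))),
    NS(event_type="content.delta", event_id="e3", delta=NS(type="text", text="Hello, ")),
    NS(event_type="content.delta", event_id="e4", delta=NS(type="text", text="world.")),
    NS(event_type="interaction.complete", event_id="e5"),
]
state = StreamState()
for event in events:
    state.handle(event)
print(state.interaction_id, state.last_event_id, state.complete)
print(state.report)
```

With a real stream, the loop body becomes state.handle(chunk), and state.interaction_id and state.last_event_id hold the values required to reconnect.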

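Reconnecting to a stream

Network interruptions can occur during long-running research tasks. To handle them gracefully, your application should catch connection errors and resume the stream using client.interactions.get().

To resume, you must provide two values:

  1. Interaction ID: Obtained from the interaction.start event in the initial stream.
  2. Last event ID: The ID of the last successfully processed event. This tells the server to resume sending events after that specific point. If you omit it, the server returns the stream from the beginning.

The following example demonstrates a resilient pattern: it attempts to stream the initial create request, and falls back to a get loop if the connection drops.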

Python

import time
from google import genai

client = genai.Client()

# Configuration
agent_name = 'deep-research-pro-preview-12-2025'
prompt = 'Compare golang SDK test frameworks'

# State tracking
last_event_id = None
interaction_id = None
is_complete = False

def process_stream(event_stream):
    """Helper to process events from any stream source."""
    global last_event_id, interaction_id, is_complete
    for event in event_stream:
        # Capture Interaction ID
        if event.event_type == "interaction.start":
            interaction_id = event.interaction.id
            print(f"Interaction started: {interaction_id}")

        # Capture Event ID
        if event.event_id:
            last_event_id = event.event_id

        # Print content
        if event.event_type == "content.delta":
            if event.delta.type == "text":
                print(event.delta.text, end="", flush=True)
            elif event.delta.type == "thought_summary":
                print(f"Thought: {event.delta.content.text}", flush=True)

        # Check completion
        if event.event_type in ['interaction.complete', 'error']:
            is_complete = True

# 1. Attempt initial streaming request
try:
    print("Starting Research...")
    initial_stream = client.interactions.create(
        input=prompt,
        agent=agent_name,
        background=True,
        stream=True,
        agent_config={
            "type": "deep-research",
            "thinking_summaries": "auto"
        }
    )
    process_stream(initial_stream)
except Exception as e:
    print(f"\nInitial connection dropped: {e}")

# 2. Reconnection Loop
# If the code reaches here and is_complete is False, we resume using .get()
while not is_complete and interaction_id:
    print(f"\nConnection lost. Resuming from event {last_event_id}...")
    time.sleep(2) 

    try:
        resume_stream = client.interactions.get(
            id=interaction_id,
            stream=True,
            last_event_id=last_event_id
        )
        process_stream(resume_stream)
    except Exception as e:
        print(f"Reconnection failed, retrying... ({e})")

JavaScript

let lastEventId;
let interactionId;
let isComplete = false;

// Helper to handle the event logic
const handleStream = async (stream) => {
    for await (const chunk of stream) {
        if (chunk.event_type === 'interaction.start') {
            interactionId = chunk.interaction.id;
        }
        if (chunk.event_id) lastEventId = chunk.event_id;

        if (chunk.event_type === 'content.delta') {
            if (chunk.delta.type === 'text') {
                process.stdout.write(chunk.delta.text);
            } else if (chunk.delta.type === 'thought_summary') {
                console.log(`Thought: ${chunk.delta.content.text}`);
            }
        } else if (chunk.event_type === 'interaction.complete' || chunk.event_type === 'error') {
            isComplete = true;
        }
    }
};

// 1. Start the task with streaming
try {
    const stream = await client.interactions.create({
        input: 'Compare golang SDK test frameworks',
        agent: 'deep-research-pro-preview-12-2025',
        background: true,
        stream: true,
        agent_config: {
            type: 'deep-research',
            thinking_summaries: 'auto'
        }
    });
    await handleStream(stream);
} catch (e) {
    console.log('\nInitial stream interrupted.');
}

// 2. Reconnect Loop
while (!isComplete && interactionId) {
    console.log(`\nReconnecting to interaction ${interactionId} from event ${lastEventId}...`);
    try {
        const stream = await client.interactions.get(interactionId, {
            stream: true,
            last_event_id: lastEventId
        });
        await handleStream(stream);
    } catch (e) {
        console.log('Reconnection failed, retrying in 2s...');
        await new Promise(resolve => setTimeout(resolve, 2000));
    }
}

REST

# 1. Start the research task (Initial Stream)
# Watch for event: interaction.start to get the INTERACTION_ID
# Watch for "event_id" fields to get the LAST_EVENT_ID
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions?alt=sse" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Compare golang SDK test frameworks",
    "agent": "deep-research-pro-preview-12-2025",
    "background": true,
    "stream": true,
    "agent_config": {
        "type": "deep-research",
        "thinking_summaries": "auto"
    }
}'

# ... Connection interrupted ...

# 2. Reconnect (Resume Stream)
# Pass the INTERACTION_ID and the LAST_EVENT_ID you saved.
curl -X GET "https://generativelanguage.googleapis.com/v1beta/interactions/INTERACTION_ID?stream=true&last_event_id=LAST_EVENT_ID&alt=sse" \
-H "x-goog-api-key: $GEMINI_API_KEY"
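The reconnection loops above retry indefinitely at a fixed interval. A production client usually wants a retry cap and growing delays. The following is a minimal sketch of that idea, not part of the SDK: resume_with_backoff is a hypothetical helper, and both attempt_resume (one reconnect-and-process attempt) and is_complete (a completion check) are supplied by the caller. The retry count and delays are illustrative defaults.

```python
import time


def resume_with_backoff(attempt_resume, is_complete, max_attempts=5,
                        base_delay=2.0, max_delay=30.0, sleep=time.sleep):
    """Retry attempt_resume() until is_complete() is true or attempts run out.

    Returns True if the interaction finished, False otherwise.
    """
    for attempt in range(max_attempts):
        if is_complete():
            return True
        try:
            attempt_resume()
        except Exception as exc:  # connection errors vary by transport
            delay = min(max_delay, base_delay * 2 ** attempt)
            print(f"Attempt {attempt + 1} failed ({exc}); retrying in {delay}s")
            sleep(delay)
    return is_complete()


# Demo: a fake resume function that drops twice, then finishes.
state = {"done": False, "calls": 0}


def flaky_resume():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("simulated drop")
    state["done"] = True


finished = resume_with_backoff(flaky_resume, lambda: state["done"],
                               sleep=lambda seconds: None)
print(finished, state["calls"])
```

In the Python example above, attempt_resume would wrap the client.interactions.get(id=interaction_id, stream=True, last_event_id=last_event_id) call together with process_stream, and is_complete would return the is_complete flag.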

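Follow-up questions and interactions

After the agent returns the final report, you can continue the conversation using previous_interaction_id. This lets you ask for clarification, a summary, or elaboration on specific parts of the research without restarting the entire task.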

Python

import time
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    input="Can you elaborate on the second point in the report?",
    model="gemini-3-pro-preview",
    previous_interaction_id="COMPLETED_INTERACTION_ID"
)

print(interaction.outputs[-1].text)

JavaScript

const interaction = await client.interactions.create({
    input: 'Can you elaborate on the second point in the report?',
    agent: 'deep-research-pro-preview-12-2025',
    previous_interaction_id: 'COMPLETED_INTERACTION_ID'
});
console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Can you elaborate on the second point in the report?",
    "agent": "deep-research-pro-preview-12-2025",
    "previous_interaction_id": "COMPLETED_INTERACTION_ID"
}'

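When to use the Gemini Deep Research Agent

Deep Research is an agent, not just a model. It is best suited to workloads that call for an "analyst-in-a-box" approach rather than low-latency chat.

Feature | Standard Gemini models | Gemini Deep Research Agent
Latency | Seconds | Minutes (async/background)
Process | Generate -> Output | Plan -> Search -> Read -> Iterate -> Output
Output | Conversational text, code, short summaries | Detailed reports, long-form analysis, comparison tables
Best for | Chatbots, extraction, creative writing | Market analysis, due diligence, literature reviews, competitive landscaping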

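Availability and pricing

You can access the Gemini Deep Research Agent through the Interactions API in Google AI Studio and the Gemini API.

Pricing follows a pay-as-you-go model based on the underlying Gemini 3 Pro model and the specific tools the agent uses. A standard chat request produces a single output, whereas a Deep Research task is an agentic workflow: a single request triggers an autonomous loop of planning, searching, reading, and reasoning.

Estimated costs

Costs vary with the depth of research required. The agent decides autonomously how much reading and searching is needed to answer your prompt.

  • Standard research task: For a typical query requiring moderate analysis, the agent might use about 80 search queries, about 250k input tokens (roughly 50% to 70% cached), and about 60k output tokens.
    • Estimated total: about $2.00 to $3.00 per task
  • Complex research task: For deep competitive landscape analysis or extensive due diligence, the agent might use up to about 160 search queries, about 900k input tokens (roughly 50% to 70% cached), and about 80k output tokens.
    • Estimated total: about $3.00 to $5.00 per task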
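The usage figures above can be turned into a rough per-task estimate once you know the current rates. The sketch below is arithmetic only and rests on assumptions: it supposes that searches are billed per query and that cached input tokens are billed at their own rate, and every value in PLACEHOLDER_RATES is a made-up placeholder, not a real price. Substitute the rates from the official pricing page before relying on the result.

```python
def estimate_task_cost(searches, input_tokens, cached_fraction,
                       output_tokens, rates):
    """Estimate the cost of one research task from usage and caller-supplied rates."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached_tokens = input_tokens * cached_fraction
    fresh_tokens = input_tokens - cached_tokens
    per_million = 1_000_000
    return (
        searches * rates["per_search"]
        + fresh_tokens / per_million * rates["input_per_million"]
        + cached_tokens / per_million * rates["cached_input_per_million"]
        + output_tokens / per_million * rates["output_per_million"]
    )


# PLACEHOLDER values for illustration only; these are NOT real prices.
PLACEHOLDER_RATES = {
    "per_search": 0.01,
    "input_per_million": 1.00,
    "cached_input_per_million": 0.10,
    "output_per_million": 10.00,
}

# Usage figures for the "standard research task" described above,
# taking 60% as a midpoint of the 50% to 70% cached range.
estimate = estimate_task_cost(80, 250_000, 0.6, 60_000, PLACEHOLDER_RATES)
print(f"Estimated cost with placeholder rates: {estimate:.2f}")
```

Because the agent decides how much to search and read, treat any such figure as an order-of-magnitude estimate rather than a quote.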

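Safety considerations

Giving an agent access to the web and to your private files requires careful consideration of the safety risks.

  • Prompt injection using files: The agent reads the contents of the files you provide. Make sure uploaded documents (PDFs, text files) come from trusted sources. A malicious file could contain hidden text designed to manipulate the agent's output.
  • Web content risks: The agent searches the public web. Although robust safety filters are in place, the agent may still encounter and process malicious web pages. Review the citations provided in the response to verify the sources.
  • Exfiltration: Be cautious when asking the agent to summarize sensitive internal data if you also allow it to browse the web.

Best practices

  • Prompt for unknowns: Instruct the agent on how to handle missing data. For example, add "If specific figures for 2025 are not available, explicitly state they are projections or unavailable rather than estimating" to your prompt.
  • Provide context: Ground the agent's research by supplying background information or constraints directly in the input prompt.
  • Multimodal inputs: The Deep Research agent supports multimodal inputs. Use them cautiously, because they increase costs and risk overflowing the context window.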

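Limitations

  • Beta status: The Interactions API is in public beta. Features and schemas may change.
  • Custom tools: You currently cannot provide custom function calling tools or remote MCP (Model Context Protocol) servers to the Deep Research agent.
  • Structured output and plan approval: The Deep Research agent currently doesn't support human-approved planning or structured outputs.
  • Maximum research time: The Deep Research agent has a maximum research time of 60 minutes. Most tasks should complete within 20 minutes.
  • Store requirement: Running the agent with background=True requires store=True.
  • Google Search: Google Search is enabled by default, and specific restrictions apply to the grounded results.
  • Audio inputs: Audio inputs are not supported.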
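Some of these limits can be checked on the client before a request is sent. The following sketch is a hypothetical pre-flight check, not an SDK feature. It assumes that store defaults to true when omitted (the examples on this page never set it), that the built-in tool types are the three named on this page, and that audio parts and custom function tools use the type values "audio" and "function"; adjust these assumptions to the actual API schema.

```python
# Tool types named on this page; anything else is treated as a custom tool.
BUILT_IN_TOOL_TYPES = {"google_search", "url_context", "file_search"}


def check_deep_research_request(params):
    """Return a list of human-readable problems found in create() parameters."""
    problems = []
    if params.get("stream") and not params.get("background"):
        problems.append("stream=True also requires background=True")
    # Assumption: an omitted store means the default (true) applies.
    if params.get("background") and params.get("store") is False:
        problems.append("background=True requires store=True")
    for tool in params.get("tools", []):
        if tool.get("type") not in BUILT_IN_TOOL_TYPES:
            problems.append(f"unsupported tool type: {tool.get('type')!r}")
    parts = params.get("input")
    if isinstance(parts, list):
        if any(part.get("type") == "audio" for part in parts):
            problems.append("audio inputs are not supported")
    return problems


request = {
    "input": [
        {"type": "text", "text": "Summarize this recording."},
        {"type": "audio", "uri": "https://example.com/clip.mp3"},
    ],
    "agent": "deep-research-pro-preview-12-2025",
    "background": True,
    "store": False,
    "tools": [{"type": "function", "name": "lookup"}],
}
for problem in check_deep_research_request(request):
    print(problem)
```

A check like this only catches obvious mistakes early; the API remains the authority on what is accepted.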

What's next