Interactions API

Interactions API 提供統一的介面,可與 Gemini 模型和代理程式互動。這項功能可簡化狀態管理、工具協調和長時間執行的工作。如要全面瞭解 API 結構定義,請參閱 API 參考資料

以下範例說明如何使用文字提示呼叫 Interactions API。

Python

from google import genai

client = genai.Client()

interaction =  client.interactions.create(
    model="gemini-3-pro-preview",
    input="Tell me a short joke about programming."
)

print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction =  await client.interactions.create({
    model: 'gemini-3-pro-preview',
    input: 'Tell me a short joke about programming.',
});

console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-pro-preview",
    "input": "Tell me a short joke about programming."
}'

基本互動

您可以透過現有 SDK 使用 Interactions API。與模型互動最簡單的方式就是提供文字提示。input 可以是字串、包含內容物件的清單,或是包含角色和內容物件的回合清單。

Python

from google import genai

client = genai.Client()

interaction =  client.interactions.create(
    model="gemini-2.5-flash",
    input="Tell me a short joke about programming."
)

print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction =  await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'Tell me a short joke about programming.',
});

console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "Tell me a short joke about programming."
}'

對話

您可以透過兩種方式建立多輪對話:

  • 有狀態,可參照先前的互動
  • 無狀態,方法是提供完整對話記錄

有狀態的對話

將先前互動的 id 傳遞至 previous_interaction_id 參數,繼續對話。

Python

from google import genai

client = genai.Client()

# 1. First turn
interaction1 = client.interactions.create(
    model="gemini-2.5-flash",
    input="Hi, my name is Phil."
)
print(f"Model: {interaction1.outputs[-1].text}")

# 2. Second turn (passing previous_interaction_id)
interaction2 = client.interactions.create(
    model="gemini-2.5-flash",
    input="What is my name?",
    previous_interaction_id=interaction1.id
)
print(f"Model: {interaction2.outputs[-1].text}")

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

// 1. First turn
const interaction1 = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'Hi, my name is Phil.'
});
console.log(`Model: ${interaction1.outputs[interaction1.outputs.length - 1].text}`);

// 2. Second turn (passing previous_interaction_id)
const interaction2 = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'What is my name?',
    previous_interaction_id: interaction1.id
});
console.log(`Model: ${interaction2.outputs[interaction2.outputs.length - 1].text}`);

REST

# 1. First turn
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "Hi, my name is Phil."
}'

# 2. Second turn (Replace INTERACTION_ID with the ID from the previous interaction)
# curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
# -H "Content-Type: application/json" \
# -H "x-goog-api-key: $GEMINI_API_KEY" \
# -d '{
#     "model": "gemini-2.5-flash",
#     "input": "What is my name?",
#     "previous_interaction_id": "INTERACTION_ID"
# }'

擷取先前的有狀態互動

使用互動 id 擷取先前的對話回合。

Python

previous_interaction = client.interactions.get("<YOUR_INTERACTION_ID>")

print(previous_interaction)

JavaScript

const previous_interaction = await client.interactions.get("<YOUR_INTERACTION_ID>");
console.log(previous_interaction);

REST

curl -X GET "https://generativelanguage.googleapis.com/v1beta/interactions/<YOUR_INTERACTION_ID>" \
-H "x-goog-api-key: $GEMINI_API_KEY"

無狀態對話

您可以在用戶端手動管理對話記錄。

Python

from google import genai

client = genai.Client()

conversation_history = [
    {
        "role": "user",
        "content": "What are the three largest cities in Spain?"
    }
]

interaction1 = client.interactions.create(
    model="gemini-2.5-flash",
    input=conversation_history
)

print(f"Model: {interaction1.outputs[-1].text}")

conversation_history.append({"role": "model", "content": interaction1.outputs})
conversation_history.append({
    "role": "user", 
    "content": "What is the most famous landmark in the second one?"
})

interaction2 = client.interactions.create(
    model="gemini-2.5-flash",
    input=conversation_history
)

print(f"Model: {interaction2.outputs[-1].text}")

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const conversationHistory = [
    {
        role: 'user',
        content: "What are the three largest cities in Spain?"
    }
];

const interaction1 = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: conversationHistory
});

console.log(`Model: ${interaction1.outputs[interaction1.outputs.length - 1].text}`);

conversationHistory.push({ role: 'model', content: interaction1.outputs });
conversationHistory.push({
    role: 'user',
    content: "What is the most famous landmark in the second one?"
});

const interaction2 = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: conversationHistory
});

console.log(`Model: ${interaction2.outputs[interaction2.outputs.length - 1].text}`);

REST

 curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
 -H "Content-Type: application/json" \
 -H "x-goog-api-key: $GEMINI_API_KEY" \
 -d '{
    "model": "gemini-2.5-flash",
    "input": [
        {
            "role": "user",
            "content": "What are the three largest cities in Spain?"
        },
        {
            "role": "model",
            "content": "The three largest cities in Spain are Madrid, Barcelona, and Valencia."
        },
        {
            "role": "user",
            "content": "What is the most famous landmark in the second one?"
        }
    ]
}'

多模態功能

您可以將 Interactions API 用於多模態用途,例如圖片理解或影片生成。

多模態理解

您可以內嵌 base64 編碼資料,或使用 Files API 處理較大的檔案,提供多模態資料。

圖像解讀

Python

import base64
from pathlib import Path
from google import genai

client = genai.Client()

# Read and encode the image
with open(Path(__file__).parent / "car.png", "rb") as f:
    base64_image = base64.b64encode(f.read()).decode('utf-8')

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input=[
        {"type": "text", "text": "Describe the image."},
        {"type": "image", "data": base64_image, "mime_type": "image/png"}
    ]
)

print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';
import * as fs from 'fs';

const client = new GoogleGenAI({});

const base64Image = fs.readFileSync('car.png', { encoding: 'base64' });

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: [
        { type: 'text', text: 'Describe the image.' },
        { type: 'image', data: base64Image, mime_type: 'image/png' }
    ]
});

console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": [
        {"type": "text", "text": "Describe the image."},
        {"type": "image", "data": "'"$(base64 -w0 car.png)"'", "mime_type": "image/png"}
    ]
}'

音訊理解

Python

import base64
from pathlib import Path
from google import genai

client = genai.Client()

# Read and encode the audio
with open(Path(__file__).parent / "speech.wav", "rb") as f:
    base64_audio = base64.b64encode(f.read()).decode('utf-8')

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input=[
        {"type": "text", "text": "What does this audio say?"},
        {"type": "audio", "data": base64_audio, "mime_type": "audio/wav"}
    ]
)

print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';
import * as fs from 'fs';

const client = new GoogleGenAI({});

const base64Audio = fs.readFileSync('speech.wav', { encoding: 'base64' });

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: [
        { type: 'text', text: 'What does this audio say?' },
        { type: 'audio', data: base64Audio, mime_type: 'audio/wav' }
    ]
});

console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": [
        {"type": "text", "text": "What does this audio say?"},
        {"type": "audio", "data": "'"$(base64 -w0 speech.wav)"'", "mime_type": "audio/wav"}
    ]
}'

影片解讀

Python

import base64
from pathlib import Path
from google import genai

client = genai.Client()

# Read and encode the video
with open(Path(__file__).parent / "video.mp4", "rb") as f:
    base64_video = base64.b64encode(f.read()).decode('utf-8')

print("Analyzing video...")
interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input=[
        {"type": "text", "text": "What is happening in this video? Provide a timestamped summary."},
        {"type": "video", "data": base64_video, "mime_type": "video/mp4" }
    ]
)

print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';
import * as fs from 'fs';

const client = new GoogleGenAI({});

const base64Video = fs.readFileSync('video.mp4', { encoding: 'base64' });

console.log('Analyzing video...');
const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: [
        { type: 'text', text: 'What is happening in this video? Provide a timestamped summary.' },
        { type: 'video', data: base64Video, mime_type: 'video/mp4'}
    ]
});

console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": [
        {"type": "text", "text": "What is happening in this video?"},
        {"type": "video", "mime_type": "video/mp4", "data": "'"$(base64 -w0 video.mp4)"'"}
    ]
}'

解讀文件 (PDF)

Python

import base64
from google import genai

client = genai.Client()

with open("sample.pdf", "rb") as f:
    base64_pdf = base64.b64encode(f.read()).decode('utf-8')

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input=[
        {"type": "text", "text": "What is this document about?"},
        {"type": "document", "data": base64_pdf, "mime_type": "application/pdf"}
    ]
)
print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';
import * as fs from 'fs';
const client = new GoogleGenAI({});

const base64Pdf = fs.readFileSync('sample.pdf', { encoding: 'base64' });

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: [
        { type: 'text', text: 'What is this document about?' },
        { type: 'document', data: base64Pdf, mime_type: 'application/pdf' }
    ],
});
console.log(interaction.outputs[0].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": [
        {"type": "text", "text": "What is this document about?"},
        {"type": "document", "data": "'"$(base64 -w0 sample.pdf)"'", "mime_type": "application/pdf"}
    ]
}'

多模態生成

您可以使用 Interactions API 生成多模態輸出內容。

圖像生成

Python

import base64
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3-pro-image-preview",
    input="Generate an image of a futuristic city.",
    response_modalities=["IMAGE"]
)

for output in interaction.outputs:
    if output.type == "image":
        print(f"Generated image with mime_type: {output.mime_type}")
        # Save the image
        with open("generated_city.png", "wb") as f:
            f.write(base64.b64decode(output.data))

JavaScript

import { GoogleGenAI } from '@google/genai';
import * as fs from 'fs';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-3-pro-image-preview',
    input: 'Generate an image of a futuristic city.',
    response_modalities: ['IMAGE']
});

for (const output of interaction.outputs) {
    if (output.type === 'image') {
        console.log(`Generated image with mime_type: ${output.mime_type}`);
        // Save the image
        fs.writeFileSync('generated_city.png', Buffer.from(output.data, 'base64'));
    }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-pro-image-preview",
    "input": "Generate an image of a futuristic city.",
    "response_modalities": ["IMAGE"]
}'

代理能力

Interactions API 專為建構及與代理互動而設計,支援函式呼叫、內建工具、結構化輸出內容和 Model Context Protocol (MCP)。

代理

您可以針對複雜工作使用 deep-research-pro-preview-12-2025 等專用代理程式。如要進一步瞭解 Gemini Deep Research 代理程式,請參閱「Deep Research」指南。

Python

import time
from google import genai

client = genai.Client()

# 1. Start the Deep Research Agent
initial_interaction = client.interactions.create(
    input="Research the history of the Google TPUs with a focus on 2025 and 2026.",
    agent="deep-research-pro-preview-12-2025",
    background=True
)

print(f"Research started. Interaction ID: {initial_interaction.id}")

# 2. Poll for results
while True:
    interaction = client.interactions.get(initial_interaction.id)
    print(f"Status: {interaction.status}")

    if interaction.status == "completed":
        print("\nFinal Report:\n", interaction.outputs[-1].text)
        break
    elif interaction.status in ["failed", "cancelled"]:
        print(f"Failed with status: {interaction.status}")
        break

    time.sleep(10)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

// 1. Start the Deep Research Agent
const initialInteraction = await client.interactions.create({
    input: 'Research the history of the Google TPUs with a focus on 2025 and 2026.',
    agent: 'deep-research-pro-preview-12-2025',
    background: true
});

console.log(`Research started. Interaction ID: ${initialInteraction.id}`);

// 2. Poll for results
while (true) {
    const interaction = await client.interactions.get(initialInteraction.id);
    console.log(`Status: ${interaction.status}`);

    if (interaction.status === 'completed') {
        console.log('\nFinal Report:\n', interaction.outputs[interaction.outputs.length - 1].text);
        break;
    } else if (['failed', 'cancelled'].includes(interaction.status)) {
        console.log(`Failed with status: ${interaction.status}`);
        break;
    }

    await new Promise(resolve => setTimeout(resolve, 10000));
}

REST

# 1. Start the Deep Research Agent
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Research the history of the Google TPUs with a focus on 2025 and 2026.",
    "agent": "deep-research-pro-preview-12-2025",
    "background": true
}'

# 2. Poll for results (Replace INTERACTION_ID with the ID from the previous interaction)
# curl -X GET "https://generativelanguage.googleapis.com/v1beta/interactions/INTERACTION_ID" \
# -H "x-goog-api-key: $GEMINI_API_KEY"

工具和函式呼叫

本節說明如何使用函式呼叫定義自訂工具,以及如何在 Interactions API 中使用 Google 內建工具。

函式呼叫

Python

from google import genai

client = genai.Client()

# 1. Define the tool
def get_weather(location: str):
    """Gets the weather for a given location."""
    return f"The weather in {location} is sunny."

weather_tool = {
    "type": "function",
    "name": "get_weather",
    "description": "Gets the weather for a given location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}
        },
        "required": ["location"]
    }
}

# 2. Send the request with tools
interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="What is the weather in Paris?",
    tools=[weather_tool]
)

# 3. Handle the tool call
for output in interaction.outputs:
    if output.type == "function_call":
        print(f"Tool Call: {output.name}({output.arguments})")
        # Execute tool
        result = get_weather(**output.arguments)

        # Send result back
        interaction = client.interactions.create(
            model="gemini-2.5-flash",
            previous_interaction_id=interaction.id,
            input=[{
                "type": "function_result",
                "name": output.name,
                "call_id": output.id,
                "result": result
            }]
        )
        print(f"Response: {interaction.outputs[-1].text}")

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

// 1. Define the tool
const weatherTool = {
    type: 'function',
    name: 'get_weather',
    description: 'Gets the weather for a given location.',
    parameters: {
        type: 'object',
        properties: {
            location: { type: 'string', description: 'The city and state, e.g. San Francisco, CA' }
        },
        required: ['location']
    }
};

// 2. Send the request with tools
let interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'What is the weather in Paris?',
    tools: [weatherTool]
});

// 3. Handle the tool call
for (const output of interaction.outputs) {
    if (output.type === 'function_call') {
        console.log(`Tool Call: ${output.name}(${JSON.stringify(output.arguments)})`);

        // Execute tool (Mocked)
        const result = `The weather in ${output.arguments.location} is sunny.`;

        // Send result back
        interaction = await client.interactions.create({
            model: 'gemini-2.5-flash',
            previous_interaction_id: interaction.id,
            input: [{
                type: 'function_result',
                name: output.name,
                call_id: output.id,
                result: result
            }]
        });
        console.log(`Response: ${interaction.outputs[interaction.outputs.length - 1].text}`);
    }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "What is the weather in Paris?",
    "tools": [{
        "type": "function",
        "name": "get_weather",
        "description": "Gets the weather for a given location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}
            },
            "required": ["location"]
        }
    }]
}'

# Handle the tool call and send result back (Replace INTERACTION_ID and CALL_ID)
# curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
# -H "Content-Type: application/json" \
# -H "x-goog-api-key: $GEMINI_API_KEY" \
# -d '{
#     "model": "gemini-2.5-flash",
#     "previous_interaction_id": "INTERACTION_ID",
#     "input": [{
#         "type": "function_result",
#         "name": "get_weather",
#         "call_id": "FUNCTION_CALL_ID",
#         "result": "The weather in Paris is sunny."
#     }]
# }'
使用用戶端狀態呼叫函式

如果不想使用伺服器端狀態,可以在用戶端管理所有狀態。

Python

from google import genai
client = genai.Client()

functions = [
    {
        "type": "function",
        "name": "schedule_meeting",
        "description": "Schedules a meeting with specified attendees at a given time and date.",
        "parameters": {
            "type": "object",
            "properties": {
                "attendees": {"type": "array", "items": {"type": "string"}},
                "date": {"type": "string", "description": "Date of the meeting (e.g., 2024-07-29)"},
                "time": {"type": "string", "description": "Time of the meeting (e.g., 15:00)"},
                "topic": {"type": "string", "description": "The subject of the meeting."},
            },
            "required": ["attendees", "date", "time", "topic"],
        },
    }
]

history = [{"role": "user","content": [{"type": "text", "text": "Schedule a meeting for 2025-11-01 at 10 am with Peter and Amir about the Next Gen API."}]}]

# 1. Model decides to call the function
interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input=history,
    tools=functions
)

# add model interaction back to history
history.append({"role": "model", "content": interaction.outputs})

for output in interaction.outputs:
    if output.type == "function_call":
        print(f"Function call: {output.name} with arguments {output.arguments}")

        # 2. Execute the function and get a result
        # In a real app, you would call your function here.
        # call_result = schedule_meeting(**json.loads(output.arguments))
        call_result = "Meeting scheduled successfully."

        # 3. Send the result back to the model
        history.append({"role": "user", "content": [{"type": "function_result", "name": output.name, "call_id": output.id, "result": call_result}]})

        interaction2 = client.interactions.create(
            model="gemini-2.5-flash",
            input=history,
        )
        print(f"Final response: {interaction2.outputs[-1].text}")
    else:
        print(f"Output: {output}")

JavaScript

// 1. Define the tool
const functions = [
    {
        type: 'function',
        name: 'schedule_meeting',
        description: 'Schedules a meeting with specified attendees at a given time and date.',
        parameters: {
            type: 'object',
            properties: {
                attendees: { type: 'array', items: { type: 'string' } },
                date: { type: 'string', description: 'Date of the meeting (e.g., 2024-07-29)' },
                time: { type: 'string', description: 'Time of the meeting (e.g., 15:00)' },
                topic: { type: 'string', description: 'The subject of the meeting.' },
            },
            required: ['attendees', 'date', 'time', 'topic'],
        },
    },
];

const history = [
    { role: 'user', content: [{ type: 'text', text: 'Schedule a meeting for 2025-11-01 at 10 am with Peter and Amir about the Next Gen API.' }] }
];

// 2. Model decides to call the function
let interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: history,
    tools: functions
});

// add model interaction back to history
history.push({ role: 'model', content: interaction.outputs });

for (const output of interaction.outputs) {
    if (output.type === 'function_call') {
        console.log(`Function call: ${output.name} with arguments ${JSON.stringify(output.arguments)}`);

        // 3. Send the result back to the model
        history.push({ role: 'user', content: [{ type: 'function_result', name: output.name, call_id: output.id, result: 'Meeting scheduled successfully.' }] });

        const interaction2 = await client.interactions.create({
            model: 'gemini-2.5-flash',
            input: history,
        });
        console.log(`Final response: ${interaction2.outputs[interaction2.outputs.length - 1].text}`);
    }
}

內建工具

Gemini 內建多種工具,例如以 Google 搜尋為基礎程式碼執行網址情境

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="Who won the last Super Bowl?",
    tools=[{"type": "google_search"}]
)
# Find the text output (not the GoogleSearchResultContent)
text_output = next((o for o in interaction.outputs if o.type == "text"), None)
if text_output:
    print(text_output.text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'Who won the last Super Bowl?',
    tools: [{ type: 'google_search' }]
});
// Find the text output (not the GoogleSearchResultContent)
const textOutput = interaction.outputs.find(o => o.type === 'text');
if (textOutput) console.log(textOutput.text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "Who won the last Super Bowl?",
    "tools": [{"type": "google_search"}]
}'
程式碼執行

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="Calculate the 50th Fibonacci number.",
    tools=[{"type": "code_execution"}]
)
print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'Calculate the 50th Fibonacci number.',
    tools: [{ type: 'code_execution' }]
});
console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "Calculate the 50th Fibonacci number.",
    "tools": [{"type": "code_execution"}]
}'
網址背景資訊

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="Summarize the content of https://www.wikipedia.org/",
    tools=[{"type": "url_context"}]
)
# Find the text output (not the URLContextResultContent)
text_output = next((o for o in interaction.outputs if o.type == "text"), None)
if text_output:
    print(text_output.text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'Summarize the content of https://www.wikipedia.org/',
    tools: [{ type: 'url_context' }]
});
// Find the text output (not the URLContextResultContent)
const textOutput = interaction.outputs.find(o => o.type === 'text');
if (textOutput) console.log(textOutput.text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "Summarize the content of https://www.wikipedia.org/",
    "tools": [{"type": "url_context"}]
}'

遠端模型內容通訊協定 (MCP)

遠端 MCP 整合功能可讓 Gemini API 直接呼叫遠端伺服器上託管的外部工具,簡化代理程式開發作業。

Python

from google import genai

client = genai.Client()

mcp_server = {
    "type": "mcp_server",
    "name": "weather_service",
    "url": "https://gemini-api-demos.uc.r.appspot.com/mcp"
}

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="What is the weather like in New York today?",
    tools=[mcp_server]
)

print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const mcpServer = {
    type: 'mcp_server',
    name: 'weather_service',
    url: 'https://gemini-api-demos.uc.r.appspot.com/mcp'
};

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'What is the weather like in New York today?',
    tools: [mcpServer]
});

console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "What is the weather like in New York today?",
    "tools": [{
        "type": "mcp_server",
        "name": "weather_service",
        "url": "https://gemini-api-demos.uc.r.appspot.com/mcp"
    }]
}'

結構化輸出內容 (JSON 結構定義)

response_format 參數中提供 JSON 結構定義,強制模型以特定 JSON 格式輸出內容。這項功能適用於內容審查、分類或資料擷取等工作。

Python

from google import genai
from pydantic import BaseModel, Field
from typing import Literal, Union
client = genai.Client()

class SpamDetails(BaseModel):
    reason: str = Field(description="The reason why the content is considered spam.")
    spam_type: Literal["phishing", "scam", "unsolicited promotion", "other"]

class NotSpamDetails(BaseModel):
    summary: str = Field(description="A brief summary of the content.")
    is_safe: bool = Field(description="Whether the content is safe for all audiences.")

class ModerationResult(BaseModel):
    decision: Union[SpamDetails, NotSpamDetails]

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="Moderate the following content: 'Congratulations! You've won a free cruise. Click here to claim your prize: www.definitely-not-a-scam.com'",
    response_format=ModerationResult.model_json_schema(),
)

parsed_output = ModerationResult.model_validate_json(interaction.outputs[-1].text)
print(parsed_output)

JavaScript

import { GoogleGenAI } from '@google/genai';
import { z } from 'zod';
const client = new GoogleGenAI({});

const moderationSchema = z.object({
    decision: z.union([
        z.object({
            reason: z.string().describe('The reason why the content is considered spam.'),
            spam_type: z.enum(['phishing', 'scam', 'unsolicited promotion', 'other']).describe('The type of spam.'),
        }).describe('Details for content classified as spam.'),
        z.object({
            summary: z.string().describe('A brief summary of the content.'),
            is_safe: z.boolean().describe('Whether the content is safe for all audiences.'),
        }).describe('Details for content classified as not spam.'),
    ]),
});

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: "Moderate the following content: 'Congratulations! You've won a free cruise. Click here to claim your prize: www.definitely-not-a-scam.com'",
    response_format: z.toJSONSchema(moderationSchema),
});
console.log(interaction.outputs[0].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "Moderate the following content: 'Congratulations! You've won a free cruise. Click here to claim your prize: www.definitely-not-a-scam.com'",
    "response_format": {
        "type": "object",
        "properties": {
            "decision": {
                "type": "object",
                "properties": {
                    "reason": {"type": "string", "description": "The reason why the content is considered spam."},
                    "spam_type": {"type": "string", "description": "The type of spam."}
                },
                "required": ["reason", "spam_type"]
            }
        },
        "required": ["decision"]
    }
}'

結合工具和結構化輸出內容

結合內建工具和結構化輸出內容,根據工具擷取的資訊取得可靠的 JSON 物件。

Python

from google import genai
from pydantic import BaseModel, Field
from typing import Literal, Union

client = genai.Client()

class SpamDetails(BaseModel):
    reason: str = Field(description="The reason why the content is considered spam.")
    spam_type: Literal["phishing", "scam", "unsolicited promotion", "other"]

class NotSpamDetails(BaseModel):
    summary: str = Field(description="A brief summary of the content.")
    is_safe: bool = Field(description="Whether the content is safe for all audiences.")

class ModerationResult(BaseModel):
    decision: Union[SpamDetails, NotSpamDetails]

interaction = client.interactions.create(
    model="gemini-3-pro-preview",
    input="Moderate the following content: 'Congratulations! You've won a free cruise. Click here to claim your prize: www.definitely-not-a-scam.com'",
    response_format=ModerationResult.model_json_schema(),
    tools=[{"type": "url_context"}]
)

parsed_output = ModerationResult.model_validate_json(interaction.outputs[-1].text)
print(parsed_output)

JavaScript

import { GoogleGenAI } from '@google/genai';
import { z } from 'zod'; // Assuming zod is used for schema generation, or define manually
const client = new GoogleGenAI({});

const obj = z.object({
    winning_team: z.string(),
    score: z.string(),
});
const schema = z.toJSONSchema(obj);

const interaction = await client.interactions.create({
    model: 'gemini-3-pro-preview',
    input: 'Who won the last euro?',
    tools: [{ type: 'google_search' }],
    response_format: schema,
});
console.log(interaction.outputs[0].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-pro-preview",
    "input": "Who won the last euro?",
    "tools": [{"type": "google_search"}],
    "response_format": {
        "type": "object",
        "properties": {
            "winning_team": {"type": "string"},
            "score": {"type": "string"}
        }
    }
}'

進階功能

此外,還有其他進階功能,可讓您更彈性地使用 Interactions API。

串流

在生成回覆的同時接收回覆。

Python

from google import genai

client = genai.Client()

stream = client.interactions.create(
    model="gemini-2.5-flash",
    input="Explain quantum entanglement in simple terms.",
    stream=True
)

for chunk in stream:
    if chunk.event_type == "content.delta":
        if chunk.delta.type == "text":
            print(chunk.delta.text, end="", flush=True)
        elif chunk.delta.type == "thought":
            print(chunk.delta.thought, end="", flush=True)
    elif chunk.event_type == "interaction.complete":
        print(f"\n\n--- Stream Finished ---")
        print(f"Total Tokens: {chunk.interaction.usage.total_tokens}")

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const stream = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'Explain quantum entanglement in simple terms.',
    stream: true,
});

for await (const chunk of stream) {
    if (chunk.event_type === 'content.delta') {
        if (chunk.delta.type === 'text' && 'text' in chunk.delta) {
            process.stdout.write(chunk.delta.text);
        } else if (chunk.delta.type === 'thought' && 'thought' in chunk.delta) {
            process.stdout.write(chunk.delta.thought);
        }
    } else if (chunk.event_type === 'interaction.complete') {
        console.log('\n\n--- Stream Finished ---');
        console.log(`Total Tokens: ${chunk.interaction.usage.total_tokens}`);
    }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions?alt=sse" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "Explain quantum entanglement in simple terms.",
    "stream": true
}'

設定

使用 generation_config 自訂模型行為。

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="Tell me a story about a brave knight.",
    generation_config={
        "temperature": 0.7,
        "max_output_tokens": 500,
        "thinking_level": "low",
    }
)

print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'Tell me a story about a brave knight.',
    generation_config: {
        temperature: 0.7,
        max_output_tokens: 500,
        thinking_level: 'low',
    }
});

console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "Tell me a story about a brave knight.",
    "generation_config": {
        "temperature": 0.7,
        "max_output_tokens": 500,
        "thinking_level": "low"
    }
}'

處理檔案

使用遠端檔案

直接在 API 呼叫中使用遠端網址存取檔案。

Python

from google import genai
client = genai.Client()

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input=[
        {
            "type": "image",
            "uri": "https://github.com/<github-path>/cats-and-dogs.jpg",
        },
        {"type": "text", "text": "Describe what you see."}
    ],
)
for output in interaction.outputs:
    if output.type == "text":
        print(output.text)

JavaScript

import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: [
        {
            type: 'image',
            uri: 'https://github.com/<github-path>/cats-and-dogs.jpg',
        },
        { type: 'text', text: 'Describe what you see.' }
    ],
});
for (const output of interaction.outputs) {
    if (output.type === 'text') {
        console.log(output.text);
    }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": [
        {
            "type": "image",
            "uri": "https://github.com/<github-path>/cats-and-dogs.jpg"
        },
        {"type": "text", "text": "Describe what you see."}
    ]
}'

使用 Gemini Files API

請先將檔案上傳至 Gemini Files API,再使用這些檔案。

Python

from google import genai
import time
import requests
client = genai.Client()

# 1. Download the file
url = "https://github.com/philschmid/gemini-samples/raw/refs/heads/main/assets/cats-and-dogs.jpg"
response = requests.get(url)
with open("cats-and-dogs.jpg", "wb") as f:
    f.write(response.content)

# 2. Upload to Gemini Files API
file = client.files.upload(file="cats-and-dogs.jpg")

# 3. Wait for processing
while client.files.get(name=file.name).state != "ACTIVE":
    time.sleep(2)

# 4. Use in Interaction
interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input=[
        {
            "type": "image",
            "uri": file.uri,
        },
        {"type": "text", "text": "Describe what you see."}
    ],
)
for output in interaction.outputs:
    if output.type == "text":
        print(output.text)

JavaScript

import { GoogleGenAI } from '@google/genai';
import * as fs from 'fs';
import fetch from 'node-fetch';
const client = new GoogleGenAI({});

// 1. Download the file
const url = 'https://github.com/philschmid/gemini-samples/raw/refs/heads/main/assets/cats-and-dogs.jpg';
const filename = 'cats-and-dogs.jpg';
const response = await fetch(url);
const buffer = await response.buffer();
fs.writeFileSync(filename, buffer);

// 2. Upload to Gemini Files API
const myfile = await client.files.upload({ file: filename, config: { mimeType: 'image/jpeg' } });

// 3. Wait for processing
while ((await client.files.get({ name: myfile.name })).state !== 'ACTIVE') {
    await new Promise(resolve => setTimeout(resolve, 2000));
}

// 4. Use in Interaction
const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: [
        { type: 'image', uri: myfile.uri, },
        { type: 'text', text: 'Describe what you see.' }
    ],
});
for (const output of interaction.outputs) {
    if (output.type === 'text') {
        console.log(output.text);
    }
}

REST

# 1. Upload the file (Requires File API setup)
# See https://ai.google.dev/gemini-api/docs/files for details.
# Assume FILE_URI is obtained from the upload step.

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": [
        {"type": "image", "uri": "FILE_URI"},
        {"type": "text", "text": "Describe what you see."}
    ]
}'

資料模型

如要進一步瞭解資料模型,請參閱 API 參考資料。以下是主要元件的概要總覽。

互動

屬性 類型 說明
id string 互動的專屬 ID。
model/agent string 使用的模型或代理程式。只能提供一個。
input Content[] 提供的輸入內容。
outputs Content[] 模型的回覆。
tools Tool[] 使用的工具。
previous_interaction_id string 先前互動的 ID,用於提供脈絡資訊。
stream boolean 互動是否為串流。
status string 狀態:completedin_progressrequires_actionfailed 等。
background boolean 互動是否處於背景模式。
store boolean 是否要儲存互動內容。預設值:true。如要停用,請設為 false
usage 用法 互動要求的權杖用量。

支援的模型和代理程式

模型名稱 類型 模型 ID
Gemini 2.5 Pro 型號 gemini-2.5-pro
Gemini 2.5 Flash 型號 gemini-2.5-flash
Gemini 2.5 Flash-Lite 型號 gemini-2.5-flash-lite
Gemini 3 Pro 預先發布版 型號 gemini-3-pro-preview
Deep Research 搶先版 代理 deep-research-pro-preview-12-2025

Interactions API 的運作方式

Interactions API 是以中央資源 Interaction 為中心設計。Interaction 代表對話或工作中的完整回合。這項物件會做為工作階段記錄,包含互動的完整記錄,包括所有使用者輸入內容、模型想法、工具呼叫、工具結果和最終模型輸出內容。

呼叫 interactions.create 時,您會建立新的 Interaction 資源。

或者,您也可以在後續呼叫中使用這個資源的 id,並透過 previous_interaction_id 參數繼續對話。伺服器會使用這個 ID 擷取完整脈絡,因此您不必重新傳送整個對話記錄。這項伺服器端狀態管理功能為選用功能,您也可以在每個要求中傳送完整的對話記錄,以無狀態模式運作。

資料儲存與保留

根據預設,所有 Interaction 物件都會儲存 (store=true),以便簡化伺服器端狀態管理功能 (使用 previous_interaction_id)、背景執行 (使用 background=true) 和可觀測性的使用。

  • 付費層級:互動記錄會保留 55 天
  • 免費方案:互動記錄會保留 1 天

如果不希望發生這種情況,可以在要求中設定 store=false。這項控制項與狀態管理功能無關,您可以選擇不儲存任何互動。不過請注意,store=falsebackground=true 不相容,且會阻止後續回合使用 previous_interaction_id

您隨時可以使用 API 參考資料中的刪除方法,刪除儲存的互動記錄。只有在知道互動 ID 的情況下,才能刪除互動。

保留期限過後,系統會自動刪除資料。

系統會根據條款處理互動物件。

最佳做法

  • 快取命中率:使用 previous_interaction_id 繼續對話時,系統可以更輕鬆地利用對話記錄的隱含快取,進而提升效能並降低成本。
  • 混合互動:您可以在對話中彈性混合搭配代理程式和模型互動。舉例來說,您可以先使用 Deep Research 代理等專用代理收集初始資料,然後使用標準 Gemini 模型執行後續工作,例如摘要或重新格式化,並使用 previous_interaction_id 連結這些步驟。

SDK

您可以使用最新版的 Google GenAI SDK,存取 Interactions API。

  • 在 Python 中,這是 1.55.0 版本之後的 google-genai 套件。
  • 在 JavaScript 中,這是 @google/genai 套件的 1.33.0 版本。

如要進一步瞭解如何安裝 SDK,請參閱「程式庫」頁面。

限制

  • Beta 版狀態:Interactions API 目前為 Beta 版/搶先體驗版。功能和結構定義可能會有所異動。
  • 不支援的功能: 下列功能目前不支援,但很快就會推出:

  • 輸出內容排序:內建工具 (google_searchurl_context) 的內容排序有時可能不正確,文字會顯示在工具執行和結果之前。我們已知這個問題,正在設法解決。

  • 工具組合:目前尚不支援組合使用 MCP、函式呼叫和內建工具,但這項功能即將推出。

  • 遠端 MCP:Gemini 3 不支援遠端 MCP,但這項功能即將推出。

破壞性變更

目前 Interactions API 處於早期 Beta 版階段。我們會根據實際使用情況和開發人員意見回饋,積極開發及改善 API 功能、資源結構定義和 SDK 介面。

因此可能會發生破壞性變更。 更新內容可能包括:

  • 輸入和輸出內容的結構定義。
  • SDK 方法簽章和物件結構。
  • 特定功能行為。

如為正式版工作負載,請繼續使用標準的 generateContent API。這個版本仍是穩定部署的建議路徑,且會持續積極開發及維護。

意見回饋

您的意見回饋對 Interactions API 的開發至關重要。歡迎前往 Google AI 開發人員產品討論社群論壇分享想法、回報錯誤或要求功能。

後續步驟