Gemini API

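The Gemini Interactions API is an experimental API that lets developers build generative AI applications with Gemini models. Gemini is Google's most capable model, built from the ground up to be multimodal. It can understand and process language, images, audio, video, and code, and it can combine information across these modalities. You can use the Gemini API for use cases such as reasoning across text and images, content generation, dialogue agents, and summarization and classification systems.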

Create an interaction

POST https://generativelanguage.googleapis.com/v1beta/interactions

Creates a new interaction.

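Request body

The request body contains data with the following structure:

model ModelOption (optional)

The name of the `Model` used to generate the interaction.
Required if `agent` is not provided.

Possible values:

gemini-2.5-pro
Google's advanced multipurpose model, which excels at coding and complex reasoning tasks.
gemini-2.5-flash
Our first hybrid reasoning model, which supports a 1M-token context window and has thinking budgets.
gemini-2.5-flash-preview-09-2025
The latest model based on the 2.5 Flash model. 2.5 Flash Preview is best for large-scale processing, low-latency, high-volume tasks that require thinking, and agentic use cases.
gemini-2.5-flash-lite
Google's smallest and most cost-effective model, built for at-scale usage.
gemini-2.5-flash-lite-preview-09-2025
The latest model based on Gemini 2.5 Flash Lite, optimized for cost efficiency, high throughput, and high quality.
gemini-2.5-flash-preview-native-audio-dialog
Our native audio model, optimized for higher-quality audio output with better control over pacing, voice naturalness, verbosity, and mood.
gemini-2.5-flash-image-preview
Our native image generation model, optimized for speed, flexibility, and contextual understanding. Text input and output are priced the same as 2.5 Flash.
gemini-2.5-pro-preview-tts
Our 2.5 Pro text-to-speech audio model, optimized for powerful, low-latency speech generation with more natural output and prompts that are easier to steer.
gemini-3-pro-preview
Our most intelligent model, with excellent reasoning and multimodal understanding, and strong agentic and vibe coding capabilities.

agent AgentOption (optional)

The name of the `Agent` used to generate the interaction.
Required if `model` is not provided.

Possible values:

deep-research-pro-preview-12-2025
Gemini Deep Research Agent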

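input Content or array (Content) or array (Turn) or string (required)

The inputs for the interaction (common to both model and agent).

system_instruction string (optional)

The system instruction for the interaction.

tools array (Tool) (optional)

A list of tool declarations that the model may call during the interaction.

response_format object (optional)

Enforces that the generated response is a JSON object that complies with the JSON schema specified in this field.

response_mime_type string (optional)

The MIME type of the response. Required if response_format is set.

stream boolean (optional)

Input only. Whether the interaction will be streamed.

store boolean (optional)

Input only. Whether to store the response and request for later retrieval.

background boolean (optional)

Whether to run the model interaction in the background.

generation_config GenerationConfig (optional)

Model configuration
Configuration parameters for the model interaction.
Alternative to `agent_config`. Only applicable when `model` is set.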

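Fields

temperature number (optional)

Controls the randomness of the output.

top_p number (optional)

The maximum cumulative probability of tokens to consider when sampling.

seed integer (optional)

Seed used in decoding for reproducibility.

stop_sequences array (string) (optional)

A list of character sequences that will stop output generation.

tool_choice ToolChoice (optional)

The tool choice for the interaction.

Possible types

ToolChoiceType

This type has no specific fields.

ToolChoiceConfig

allowed_tools AllowedTools (optional)

No description provided.

Fields

mode ToolChoiceType (optional)

The mode of the tool choice.

Possible values:

auto
any
none
validated

tools array (string) (optional)

The names of the allowed tools.

thinking_level ThinkingLevel (optional)

The level of thought tokens that the model should generate.

Possible values:

low
high

thinking_summaries ThinkingSummaries (optional)

Whether to include thought summaries in the response.

Possible values:

auto
none

max_output_tokens integer (optional)

The maximum number of tokens to include in the response.

speech_config SpeechConfig (optional)

Configuration for speech interaction.

Fields

voice string (optional)

The voice of the speaker.

language string (optional)

The language of the speech.

speaker string (optional)

The speaker's name. It should match the speaker name given in the prompt.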
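
As a rough illustration of the `generation_config` fields above, the following sketch assembles a configuration dictionary and rejects enum values that the reference does not list. The helper name `build_generation_config` is hypothetical and not part of any SDK; only the field names and enum values come from this page.

```python
from typing import Any, Dict, Optional, Union

# Enum values copied from the field reference above.
THINKING_LEVELS = {"low", "high"}
THINKING_SUMMARIES = {"auto", "none"}
TOOL_CHOICE_MODES = {"auto", "any", "none", "validated"}


def build_generation_config(
    temperature: Optional[float] = None,
    thinking_level: Optional[str] = None,
    thinking_summaries: Optional[str] = None,
    tool_choice: Optional[Union[str, Dict[str, Any]]] = None,
    max_output_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: return a generation_config dict without unset fields."""
    if thinking_level is not None and thinking_level not in THINKING_LEVELS:
        raise ValueError(f"unknown thinking_level: {thinking_level!r}")
    if thinking_summaries is not None and thinking_summaries not in THINKING_SUMMARIES:
        raise ValueError(f"unknown thinking_summaries: {thinking_summaries!r}")
    # tool_choice is either a bare mode string (ToolChoiceType) or a
    # ToolChoiceConfig object such as {"allowed_tools": {"mode": ..., "tools": [...]}}.
    if isinstance(tool_choice, str) and tool_choice not in TOOL_CHOICE_MODES:
        raise ValueError(f"unknown tool_choice mode: {tool_choice!r}")
    candidates = {
        "temperature": temperature,
        "thinking_level": thinking_level,
        "thinking_summaries": thinking_summaries,
        "tool_choice": tool_choice,
        "max_output_tokens": max_output_tokens,
    }
    return {key: value for key, value in candidates.items() if value is not None}
```

Leaving unset fields out of the dictionary, rather than sending nulls, is a design choice of this sketch and not a documented requirement.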

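agent_config object (optional)

Agent configuration
Configuration for the agent.
Alternative to `generation_config`. Only applicable when `agent` is set.

Possible types

Polymorphic discriminator: type

DynamicAgentConfig

Configuration for dynamic agents.

type string (optional)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "dynamic"

DeepResearchAgentConfig

Configuration for the Deep Research agent.

type string (optional)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "deep-research"

thinking_summaries ThinkingSummaries (optional)

Whether to include thought summaries in the response.

Possible values:

auto
none

previous_interaction_id string (optional)

The ID of the previous interaction, if any.

response_modalities ResponseModality (optional)

The requested modalities of the response (text, image, audio).

Possible values:

text
image
audio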
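
The request body described above can be assembled with the Python standard library alone. The sketch below builds, but does not send, a POST request to the endpoint. Two things are assumptions rather than statements from this page: the `x-goog-api-key` authentication header, and the decision to raise an error when `generation_config` is passed without `model` (or `agent_config` without `agent`). The function name `build_create_request` is hypothetical.

```python
import json
import urllib.request
from typing import Any, Optional

INTERACTIONS_URL = "https://generativelanguage.googleapis.com/v1beta/interactions"


def build_create_request(
    api_key: str,
    input_data: Any,
    model: Optional[str] = None,
    agent: Optional[str] = None,
    **optional_fields: Any,
) -> urllib.request.Request:
    """Hypothetical helper: build a 'create interaction' request without sending it."""
    # The reference marks model as required when agent is absent, and vice versa.
    if model is None and agent is None:
        raise ValueError("either model or agent must be provided")
    # generation_config applies only with model; agent_config only with agent.
    if model is None and "generation_config" in optional_fields:
        raise ValueError("generation_config is only applicable when model is set")
    if agent is None and "agent_config" in optional_fields:
        raise ValueError("agent_config is only applicable when agent is set")

    body = {"input": input_data, **optional_fields}
    if model is not None:
        body["model"] = model
    if agent is not None:
        body["agent"] = agent

    return urllib.request.Request(
        INTERACTIONS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,  # assumed auth header, not stated on this page
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the body easy to inspect first.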

Response

Returns an Interaction resource.

Simple request

Example response

{
  "created": "2025-11-26T12:25:15Z",
  "id": "v1_ChdPU0F4YWFtNkFwS2kxZThQZ05lbXdROBIXT1NBeGFhbTZBcEtpMWU4UGdOZW13UTg",
  "model": "gemini-2.5-flash",
  "object": "interaction",
  "outputs": [
    {
      "text": "Hello! I'm functioning perfectly and ready to assist you.\n\nHow are you doing today?",
      "type": "text"
    }
  ],
  "role": "model",
  "status": "completed",
  "updated": "2025-11-26T12:25:15Z",
  "usage": {
    "input_tokens_by_modality": [
      {
        "modality": "text",
        "tokens": 7
      }
    ],
    "total_cached_tokens": 0,
    "total_input_tokens": 7,
    "total_output_tokens": 20,
    "total_reasoning_tokens": 22,
    "total_tokens": 49,
    "total_tool_use_tokens": 0
  }
}
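
A response like the one above carries its text in the `outputs` array, where each entry has a `type` discriminator. The following sketch joins the text blocks of a parsed response. The helper name `collect_text` is hypothetical.

```python
from typing import Any, Dict


def collect_text(interaction: Dict[str, Any]) -> str:
    """Concatenate the text of every output block whose type is "text"."""
    return "".join(
        block.get("text", "")
        for block in interaction.get("outputs", [])
        if block.get("type") == "text"
    )
```

Blocks of other types, such as `thought` or `function_call`, are skipped rather than treated as errors, because a single interaction can mix several content types.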

Multi-turn

Example response

{
  "id": "v1_ChdPU0F4YWFtNkFwS2kxZThQZ05lbXdROBIXT1NBeGFhbTZBcEtpMWU4UGdOZW13UTg",
  "model": "gemini-2.5-flash",
  "status": "completed",
  "object": "interaction",
  "created": "2025-11-26T12:22:47Z",
  "updated": "2025-11-26T12:22:47Z",
  "role": "model",
  "outputs": [
    {
      "type": "text",
      "text": "The capital of France is Paris."
    }
  ],
  "usage": {
    "input_tokens_by_modality": [
      {
        "modality": "text",
        "tokens": 50
      }
    ],
    "total_cached_tokens": 0,
    "total_input_tokens": 50,
    "total_output_tokens": 10,
    "total_reasoning_tokens": 0,
    "total_tokens": 60,
    "total_tool_use_tokens": 0
  }
}

Image input

Example response

{
  "id": "v1_ChdPU0F4YWFtNkFwS2kxZThQZ05lbXdROBIXT1NBeGFhbTZBcEtpMWU4UGdOZW13UTg",
  "model": "gemini-2.5-flash",
  "status": "completed",
  "object": "interaction",
  "created": "2025-11-26T12:22:47Z",
  "updated": "2025-11-26T12:22:47Z",
  "role": "model",
  "outputs": [
    {
      "type": "text",
      "text": "A white humanoid robot with glowing blue eyes stands holding a red skateboard."
    }
  ],
  "usage": {
    "input_tokens_by_modality": [
      {
        "modality": "text",
        "tokens": 10
      },
      {
        "modality": "image",
        "tokens": 258
      }
    ],
    "total_cached_tokens": 0,
    "total_input_tokens": 268,
    "total_output_tokens": 20,
    "total_reasoning_tokens": 0,
    "total_tokens": 288,
    "total_tool_use_tokens": 0
  }
}

Function calling

Example response

{
  "id": "v1_ChdPU0F4YWFtNkFwS2kxZThQZ05lbXdROBIXT1NBeGFhbTZBcEtpMWU4UGdOZW13UTg",
  "model": "gemini-2.5-flash",
  "status": "requires_action",
  "object": "interaction",
  "created": "2025-11-26T12:22:47Z",
  "updated": "2025-11-26T12:22:47Z",
  "role": "model",
  "outputs": [
    {
      "type": "function_call",
      "function_call": {
        "name": "get_weather",
        "arguments": {
          "location": "Boston, MA"
        }
      }
    }
  ],
  "usage": {
    "input_tokens_by_modality": [
      {
        "modality": "text",
        "tokens": 100
      }
    ],
    "total_cached_tokens": 0,
    "total_input_tokens": 100,
    "total_output_tokens": 25,
    "total_reasoning_tokens": 0,
    "total_tokens": 125,
    "total_tool_use_tokens": 50
  }
}

Deep Research

Example response

{
  "id": "v1_ChdPU0F4YWFtNkFwS2kxZThQZ05lbXdROBIXT1NBeGFhbTZBcEtpMWU4UGdOZW13UTg",
  "agent": "deep-research-pro-preview-12-2025",
  "status": "completed",
  "object": "interaction",
  "created": "2025-11-26T12:22:47Z",
  "updated": "2025-11-26T12:22:47Z",
  "role": "model",
  "outputs": [
    {
      "type": "text",
      "text": "Here is a comprehensive research report on the current state of cancer research..."
    }
  ],
  "usage": {
    "input_tokens_by_modality": [
      {
        "modality": "text",
        "tokens": 20
      }
    ],
    "total_cached_tokens": 0,
    "total_input_tokens": 20,
    "total_output_tokens": 1000,
    "total_reasoning_tokens": 500,
    "total_tokens": 1520,
    "total_tool_use_tokens": 0
  }
}

Retrieve an interaction

GET https://generativelanguage.googleapis.com/v1beta/interactions/{id}

Retrieves the full details of a single interaction based on its `Interaction.id`.

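Path / query parameters

id string (required)

The unique identifier of the interaction to retrieve.

stream boolean (optional)

If set to true, the generated content is streamed incrementally.

Defaults to: False

last_event_id string (optional)

If set, resumes the interaction stream from the next chunk after the event marked by this event ID. Can only be used when `stream` is true.

api_version string (optional)

The API version to use.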
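
To make the parameters above concrete, this sketch builds the retrieval URL, including the resume case where `last_event_id` is only valid together with `stream`. The helper name `build_retrieve_url` is hypothetical, and serializing the boolean as the string `true` is an assumption about the query format.

```python
from typing import Dict, Optional
from urllib.parse import quote, urlencode

BASE_URL = "https://generativelanguage.googleapis.com/v1beta/interactions"


def build_retrieve_url(
    interaction_id: str,
    stream: bool = False,
    last_event_id: Optional[str] = None,
) -> str:
    """Hypothetical helper: build the GET URL for retrieving an interaction."""
    # The reference states that last_event_id can only be used when stream is true.
    if last_event_id is not None and not stream:
        raise ValueError("last_event_id can only be used when stream is true")

    params: Dict[str, str] = {}
    if stream:
        params["stream"] = "true"  # assumed boolean serialization
    if last_event_id is not None:
        params["last_event_id"] = last_event_id

    url = f"{BASE_URL}/{quote(interaction_id, safe='')}"
    return f"{url}?{urlencode(params)}" if params else url
```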

Response

Returns an Interaction resource.

Get interaction

Example response

{
  "id": "v1_ChdPU0F4YWFtNkFwS2kxZThQZ05lbXdROBIXT1NBeGFhbTZBcEtpMWU4UGdOZW13UTg",
  "model": "gemini-2.5-flash",
  "status": "completed",
  "object": "interaction",
  "created": "2025-11-26T12:25:15Z",
  "updated": "2025-11-26T12:25:15Z",
  "role": "model",
  "outputs": [
    {
      "type": "text",
      "text": "I'm doing great, thank you for asking! How can I help you today?"
    }
  ]
}

Delete an interaction

DELETE https://generativelanguage.googleapis.com/v1beta/interactions/{id}

Deletes the interaction by ID.

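Path / query parameters

id string (required)

The unique identifier of the interaction to delete.

api_version string (optional)

The API version to use.

Response

If successful, the response is empty.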

Cancel an interaction

POST https://generativelanguage.googleapis.com/v1beta/interactions/{id}/cancel

Cancels an interaction by ID. This only applies to background interactions that are still running.

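Path / query parameters

id string (required)

The unique identifier of the interaction to cancel.

api_version string (optional)

The API version to use.

Response

Returns an Interaction resource.

Cancel interaction

Example response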

{
  "id": "v1_ChdPU0F4YWFtNkFwS2kxZThQZ05lbXdROBIXT1NBeGFhbTZBcEtpMWU4UGdOZW13UTg",
  "agent": "deep-research-pro-preview-12-2025",
  "status": "cancelled",
  "object": "interaction",
  "created": "2025-11-26T12:25:15Z",
  "updated": "2025-11-26T12:25:15Z",
  "role": "model"
}

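Resources

Interaction

The Interaction resource.

Fields

model ModelOption (optional)

The name of the `Model` used to generate the interaction.

Possible values:

gemini-2.5-pro
Google's advanced multipurpose model, which excels at coding and complex reasoning tasks.
gemini-2.5-flash
Our first hybrid reasoning model, which supports a 1M-token context window and has thinking budgets.
gemini-2.5-flash-preview-09-2025
The latest model based on the 2.5 Flash model. 2.5 Flash Preview is best for large-scale processing, low-latency, high-volume tasks that require thinking, and agentic use cases.
gemini-2.5-flash-lite
Google's smallest and most cost-effective model, built for at-scale usage.
gemini-2.5-flash-lite-preview-09-2025
The latest model based on Gemini 2.5 Flash Lite, optimized for cost efficiency, high throughput, and high quality.
gemini-2.5-flash-preview-native-audio-dialog
Our native audio model, optimized for higher-quality audio output with better control over pacing, voice naturalness, verbosity, and mood.
gemini-2.5-flash-image-preview
Our native image generation model, optimized for speed, flexibility, and contextual understanding. Text input and output are priced the same as 2.5 Flash.
gemini-2.5-pro-preview-tts
Our 2.5 Pro text-to-speech audio model, optimized for powerful, low-latency speech generation with more natural output and prompts that are easier to steer.
gemini-3-pro-preview
Our most intelligent model, with excellent reasoning and multimodal understanding, and strong agentic and vibe coding capabilities.

agent AgentOption (optional)

The name of the `Agent` used to generate the interaction.

Possible values:

deep-research-pro-preview-12-2025
Gemini Deep Research Agent

id string (optional)

Output only. A unique identifier for the interaction.

status enum (string) (optional)

Output only. The status of the interaction.

Possible values:

in_progress
requires_action
completed
failed
cancelled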
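
The status values above suggest a simple polling loop for background interactions: keep retrieving the interaction until its status is no longer `in_progress`. The sketch below takes the retrieval step as an injected callable so that it stays independent of any HTTP client. The function name, the polling interval, and the attempt limit are all hypothetical choices, not documented behavior.

```python
import time
from typing import Any, Callable, Dict


def wait_for_interaction(
    fetch: Callable[[str], Dict[str, Any]],
    interaction_id: str,
    poll_seconds: float = 2.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the interaction leaves the in_progress state.

    fetch is any callable that returns the parsed Interaction resource for an
    ID, for example a wrapper around the retrieve endpoint.
    """
    for attempt in range(max_polls):
        interaction = fetch(interaction_id)
        # completed, failed, and cancelled are final; requires_action needs the
        # caller to respond, so polling stops for that status as well.
        if interaction.get("status") != "in_progress":
            return interaction
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError(f"interaction {interaction_id} still in progress after {max_polls} polls")
```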

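created string (optional)

Output only. The time at which the response was created, in ISO 8601 format (YYYY-MM-DDThh:mm:ssZ).

updated string (optional)

Output only. The time at which the response was last updated, in ISO 8601 format (YYYY-MM-DDThh:mm:ssZ).

role string (optional)

Output only. The role of the interaction.

outputs array (Content) (optional)

Output only. Responses from the model.

object string (optional)

Output only. The object type of the interaction.

Always set to "interaction"

usage Usage (optional)

Output only. Statistics on the interaction request's token usage.

Fields

total_input_tokens integer (optional)

Number of tokens in the prompt (context).

input_tokens_by_modality ModalityTokens (optional)

A breakdown of input token usage by modality.

Fields

modality ResponseModality (optional)

The modality associated with the token count.

Possible values:

text
image
audio

tokens integer (optional)

Number of tokens for the modality.

total_cached_tokens integer (optional)

Number of tokens in the cached part of the prompt (the cached content).

cached_tokens_by_modality ModalityTokens (optional)

A breakdown of cached token usage by modality.

Fields

modality ResponseModality (optional)

The modality associated with the token count.

Possible values:

text
image
audio

tokens integer (optional)

Number of tokens for the modality.

total_output_tokens integer (optional)

Total number of tokens across all generated responses.

output_tokens_by_modality ModalityTokens (optional)

A breakdown of output token usage by modality.

Fields

modality ResponseModality (optional)

The modality associated with the token count.

Possible values:

text
image
audio

tokens integer (optional)

Number of tokens for the modality.

total_tool_use_tokens integer (optional)

Number of tokens present in tool-use prompts.

tool_use_tokens_by_modality ModalityTokens (optional)

A breakdown of tool-use token usage by modality.

Fields

modality ResponseModality (optional)

The modality associated with the token count.

Possible values:

text
image
audio

tokens integer (optional)

Number of tokens for the modality.

total_reasoning_tokens integer (optional)

Number of thought tokens for thinking models.

total_tokens integer (optional)

Total token count for the interaction request (prompt + responses + other internal tokens).

previous_interaction_id string (optional)

The ID of the previous interaction, if any.

Example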

{
  "created": "2025-12-04T15:01:45Z",
  "id": "v1_ChdXS0l4YWZXTk9xbk0xZThQczhEcmlROBIXV0tJeGFmV05PcW5NMWU4UHM4RHJpUTg",
  "model": "gemini-2.5-flash",
  "object": "interaction",
  "outputs": [
    {
      "text": "Hello! I'm doing well, functioning as expected. Thank you for asking! How are you doing today?",
      "type": "text"
    }
  ],
  "role": "model",
  "status": "completed",
  "updated": "2025-12-04T15:01:45Z",
  "usage": {
    "input_tokens_by_modality": [
      {
        "modality": "text",
        "tokens": 7
      }
    ],
    "total_cached_tokens": 0,
    "total_input_tokens": 7,
    "total_output_tokens": 23,
    "total_reasoning_tokens": 49,
    "total_tokens": 79,
    "total_tool_use_tokens": 0
  }
}
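
The `usage` object in the example above can be summarized with a few lines of Python. This sketch totals the per-modality input counts and checks them against `total_input_tokens`. The helper name `summarize_usage` and the derived `reasoning_share` figure are illustrative only. The sketch deliberately avoids asserting any formula for `total_tokens`, because the reference says it may include other internal tokens.

```python
from typing import Any, Dict


def summarize_usage(usage: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: summarize the usage block of an Interaction."""
    input_by_modality: Dict[str, int] = {}
    for entry in usage.get("input_tokens_by_modality", []):
        modality = entry.get("modality", "unknown")
        input_by_modality[modality] = input_by_modality.get(modality, 0) + entry.get("tokens", 0)

    total = usage.get("total_tokens", 0)
    reasoning = usage.get("total_reasoning_tokens", 0)
    return {
        "input_by_modality": input_by_modality,
        # True when the per-modality breakdown adds up to the reported input total.
        "input_matches_total": sum(input_by_modality.values()) == usage.get("total_input_tokens"),
        # Fraction of all tokens spent on thinking; 0.0 when no total is reported.
        "reasoning_share": reasoning / total if total else 0.0,
    }
```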

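Data models

Content

The content of the response.

Possible types

Polymorphic discriminator: type

TextContent

A text content block.

text string (optional)

The text content.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "text"

annotations Annotation (optional)

Citation information for model-generated content.

Fields

start_index integer (optional)

Start of the segment of the response that is attributed to this source. The index marks the start of the segment, measured in bytes.

end_index integer (optional)

End of the attributed segment, exclusive.

source string (optional)

The source attributed for a portion of the text. Can be a URL, a title, or another identifier.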
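
Because annotation indices are measured in bytes and `end_index` is exclusive, slicing the Python string directly can select the wrong span when the text contains multi-byte characters. The sketch below slices the encoded bytes instead. It assumes UTF-8 encoding, which this page does not state explicitly, and the helper name `annotated_span` is hypothetical.

```python
from typing import Any, Dict


def annotated_span(text: str, annotation: Dict[str, Any]) -> str:
    """Return the part of text covered by an annotation's byte range."""
    raw = text.encode("utf-8")  # assumption: indices refer to UTF-8 bytes
    start = annotation.get("start_index", 0)
    end = annotation.get("end_index", len(raw))  # exclusive, per the reference
    # errors="replace" keeps the helper from raising if a range splits a character.
    return raw[start:end].decode("utf-8", errors="replace")
```

For example, in the text `Café is open`, the byte range 0 to 5 covers `Café`, because `é` occupies two bytes, whereas the character slice `text[0:5]` would also include the following space.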

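ImageContent

An image content block.

data string (optional)

No description provided.

uri string (optional)

No description provided.

mime_type ImageMimeTypeOption (optional)

No description provided.

Possible values:

image/png
image/jpeg
image/webp
image/heic
image/heif

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "image"

resolution MediaResolution (optional)

The resolution of the media.

Possible values:

low
medium
high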
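AudioContent

An audio content block.

data string (optional)

No description provided.

uri string (optional)

No description provided.

mime_type AudioMimeTypeOption (optional)

No description provided.

Possible values:

audio/wav
audio/mp3
audio/aiff
audio/aac
audio/ogg
audio/flac

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "audio"

DocumentContent

A document content block.

data string (optional)

No description provided.

uri string (optional)

No description provided.

mime_type string (optional)

No description provided.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "document"

VideoContent

A video content block.

data string (optional)

No description provided.

uri string (optional)

No description provided.

mime_type VideoMimeTypeOption (optional)

No description provided.

Possible values:

video/mp4
video/mpeg
video/mov
video/avi
video/x-flv
video/mpg
video/webm
video/wmv
video/3gpp

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "video"

resolution MediaResolution (optional)

The resolution of the media.

Possible values:

low
medium
high

ThoughtContent

A thought content block.

signature string (optional)

Signature to match the backend source to be part of the generation.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "thought"

summary ThoughtSummary (optional)

A summary of the thought.

FunctionCallContent

A function tool call content block.

name string (required)

The name of the tool to call.

arguments object (required)

The arguments to pass to the function.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "function_call"

id string (required)

A unique ID for this specific tool call.

FunctionResultContent

A function tool result content block.

name string (optional)

The name of the tool that was called.

is_error boolean (optional)

Whether the tool call resulted in an error.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "function_result"

result object or string (required)

The result of the tool call.

call_id string (required)

ID to match the ID from the function call block.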
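
The link between FunctionCallContent and FunctionResultContent is the ID: the result's `call_id` must match the call's `id`. The sketch below runs a locally registered handler for a call and wraps the outcome in a result block. The helper names and the handler registry are hypothetical, and reporting exceptions through `is_error` is one possible convention, not documented behavior.

```python
from typing import Any, Callable, Dict


def function_result_for(call: Dict[str, Any], result: Any, is_error: bool = False) -> Dict[str, Any]:
    """Build a function_result block whose call_id matches the given call."""
    if call.get("type") != "function_call":
        raise ValueError("expected a function_call content block")
    content: Dict[str, Any] = {
        "type": "function_result",
        "name": call["name"],
        "call_id": call["id"],
        "result": result,
    }
    if is_error:
        content["is_error"] = True
    return content


def run_function_call(call: Dict[str, Any], handlers: Dict[str, Callable[..., Any]]) -> Dict[str, Any]:
    """Invoke the handler registered under the call's name and wrap its output."""
    handler = handlers.get(call["name"])
    if handler is None:
        return function_result_for(call, f"unknown tool: {call['name']}", is_error=True)
    try:
        return function_result_for(call, handler(**call["arguments"]))
    except Exception as exc:  # surface tool failures to the model as error results
        return function_result_for(call, str(exc), is_error=True)
```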

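CodeExecutionCallContent

Code execution content.

arguments CodeExecutionCallArguments (optional)

The arguments to pass to the code execution.

Fields

language enum (string) (optional)

The programming language of the `code`.

Possible values:

python

code string (optional)

The code to be executed.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "code_execution_call"

id string (optional)

A unique ID for this specific tool call.

CodeExecutionResultContent

Code execution result content.

result string (optional)

The output of the code execution.

is_error boolean (optional)

Whether the code execution resulted in an error.

signature string (optional)

A signature hash for backend validation.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "code_execution_result"

call_id string (optional)

ID to match the ID from the code execution call block.

UrlContextCallContent

URL context content.

arguments UrlContextCallArguments (optional)

The arguments to pass to the URL context.

Fields

urls array (string) (optional)

The URLs to fetch.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "url_context_call"

id string (optional)

A unique ID for this specific tool call.

UrlContextResultContent

URL context result content.

signature string (optional)

The signature of the URL context result.

result UrlContextResult (optional)

The results of the URL context.

Fields

url string (optional)

The URL that was fetched.

status enum (string) (optional)

The status of the URL retrieval.

Possible values:

success
error
paywall
unsafe

is_error boolean (optional)

Whether the URL context resulted in an error.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "url_context_result"

call_id string (optional)

ID to match the ID from the URL context call block.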

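GoogleSearchCallContent

Google Search content.

arguments GoogleSearchCallArguments (optional)

The arguments to pass to Google Search.

Fields

queries array (string) (optional)

Web search queries for the follow-up web search.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "google_search_call"

id string (optional)

A unique ID for this specific tool call.

GoogleSearchResultContent

Google Search result content.

signature string (optional)

The signature of the Google Search result.

result GoogleSearchResult (optional)

The results of the Google Search.

Fields

url string (optional)

URI reference of the search result.

title string (optional)

Title of the search result.

rendered_content string (optional)

Web content snippet that can be embedded in a web page or an app WebView.

is_error boolean (optional)

Whether the Google Search resulted in an error.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "google_search_result"

call_id string (optional)

ID to match the ID from the Google Search call block.

McpServerToolCallContent

MCPServer tool call content.

name string (required)

The name of the tool that was called.

server_name string (required)

The name of the MCP server used.

arguments object (required)

The JSON object of arguments for the function.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "mcp_server_tool_call"

id string (required)

A unique ID for this specific tool call.

McpServerToolResultContent

MCPServer tool result content.

name string (optional)

The name of the tool that was called for this result.

server_name string (optional)

The name of the MCP server used.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "mcp_server_tool_result"

result object or string (required)

The result of the tool call.

call_id string (required)

ID to match the ID from the MCP server tool call block.

FileSearchResultContent

File search result content.

result FileSearchResult (optional)

The results of the file search.

Fields

title string (optional)

The title of the search result.

text string (optional)

The text of the search result.

file_search_store string (optional)

The name of the file search store.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "file_search_result"

Examples

Text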

{
  "type": "text",
  "text": "Hello, how are you?"
}

Image

{
  "type": "image",
  "data": "BASE64_ENCODED_IMAGE",
  "mime_type": "image/png"
}

Audio

{
  "type": "audio",
  "data": "BASE64_ENCODED_AUDIO",
  "mime_type": "audio/wav"
}

Document

{
  "type": "document",
  "data": "BASE64_ENCODED_DOCUMENT",
  "mime_type": "application/pdf"
}

Video

{
  "type": "video",
  "uri": "https://www.youtube.com/watch?v=9hE5-98ZeCg"
}

Thought

{
  "type": "thought",
  "summary": [
    {
      "type": "text",
      "text": "The user is asking about the weather. I should use the get_weather tool."
    }
  ],
  "signature": "CoMDAXLI2nynRYojJIy6B1Jh9os2crpWLfB0+19xcLsGG46bd8wjkF/6RNlRUdvHrXyjsHkG0BZFcuO/bPOyA6Xh5jANNgx82wPHjGExN8A4ZQn56FlMwyZoqFVQz0QyY1lfibFJ2zU3J87uw26OewzcuVX0KEcs+GIsZa3EA6WwqhbsOd3wtZB3Ua2Qf98VAWZTS5y/tWpql7jnU3/CU7pouxQr/Bwft3hwnJNesQ9/dDJTuaQ8Zprh9VRWf1aFFjpIueOjBRrlT3oW6/y/eRl/Gt9BQXCYTqg/38vHFUU4Wo/d9dUpvfCe/a3o97t2Jgxp34oFKcsVb4S5WJrykIkw+14DzVnTpCpbQNFckqvFLuqnJCkL0EQFtunBXI03FJpPu3T1XU6id8S7ojoJQZSauGUCgmaLqUGdMrd08oo81ecoJSLs51Re9N/lISGmjWFPGpqJLoGq6uo4FHz58hmeyXCgHG742BHz2P3MiH1CXHUT2J8mF6zLhf3SR9Qb3lkrobAh"
}

Function call

{
  "type": "function_call",
  "name": "get_weather",
  "id": "gth23981",
  "arguments": {
    "location": "Boston, MA"
  }
}

Function result

{
  "type": "function_result",
  "name": "get_weather",
  "call_id": "gth23981",
  "result": {
    "weather": "sunny"
  }
}

Code execution call

{
  "type": "code_execution_call",
  "id": "call_123456",
  "arguments": {
    "language": "python",
    "code": "print('hello world')"
  }
}

Code execution result

{
  "type": "code_execution_result",
  "call_id": "call_123456",
  "result": "hello world\n"
}

URL context call

{
  "type": "url_context_call",
  "id": "call_123456",
  "arguments": {
    "urls": [
      "https://www.example.com"
    ]
  }
}

URL context result

{
  "type": "url_context_result",
  "call_id": "call_123456",
  "result": [
    {
      "url": "https://www.example.com",
      "status": "SUCCESS"
    }
  ]
}

Google Search call

{
  "type": "google_search_call",
  "id": "call_123456",
  "arguments": {
    "queries": [
      "weather in Boston"
    ]
  }
}

Google Search result

{
  "type": "google_search_result",
  "call_id": "call_123456",
  "result": [
    {
      "url": "https://www.google.com/search?q=weather+in+Boston",
      "title": "Weather in Boston"
    }
  ]
}

MCP server tool call

{
  "type": "mcp_server_tool_call",
  "id": "call_123456",
  "name": "get_forecast",
  "server_name": "weather_server",
  "arguments": {
    "city": "London"
  }
}

MCP server tool result

{
  "type": "mcp_server_tool_result",
  "name": "get_forecast",
  "server_name": "weather_server",
  "call_id": "call_123456",
  "result": "sunny"
}

File search result

{
  "type": "file_search_result",
  "result": [
    {
      "text": "search result chunk",
      "file_search_store": "file_search_store"
    }
  ]
}

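Tool

Possible types

Polymorphic discriminator: type

Function

A tool that can be used by the model.

name string (optional)

The name of the function.

description string (optional)

A description of the function.

parameters object (optional)

The JSON schema for the function's parameters.

type string (required)

No description provided.

Always set to "function"

GoogleSearch

A tool that the model can use to search Google.

type string (required)

No description provided.

Always set to "google_search"

CodeExecution

A tool that the model can use to execute code.

type string (required)

No description provided.

Always set to "code_execution"

UrlContext

A tool that the model can use to fetch URL context.

type string (required)

No description provided.

Always set to "url_context"

ComputerUse

A tool that the model can use to interact with a computer.

type string (required)

No description provided.

Always set to "computer_use"

environment enum (string) (optional)

The environment being operated.

Possible values:

browser

excludedPredefinedFunctions array (string) (optional)

The list of predefined functions that are excluded from the model call.

McpServer

An MCPServer is a server that the model can call to perform actions.

type string (required)

No description provided.

Always set to "mcp_server"

name string (optional)

The name of the MCPServer.

url string (optional)

The full URL of the MCPServer endpoint. Example: "https://api.example.com/mcp"

headers object (optional)

Optional: fields for authentication headers, timeouts, and so on, if needed.

allowed_tools AllowedTools (optional)

The allowed tools.

Fields

mode ToolChoiceType (optional)

The mode of the tool choice.

Possible values:

auto
any
none
validated

tools array (string) (optional)

The names of the allowed tools.

FileSearch

A tool that the model can use to search files.

file_search_store_names array (string) (optional)

The names of the file search stores to search.

top_k integer (optional)

The number of semantic retrieval chunks to retrieve.

metadata_filter string (optional)

The metadata filter to apply to the semantic retrieval documents and chunks.

type string (required)

No description provided.

Always set to "file_search"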
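
To show how the tool types above might be combined in a request's `tools` array, here is a minimal sketch. The `get_weather` function, its parameter schema, and the store name `my-store` are hypothetical placeholders. Only the `type` discriminators and field names are taken from this reference.

```python
from typing import Any, Dict, List

# Discriminator values listed in the tool reference above.
KNOWN_TOOL_TYPES = {
    "function",
    "google_search",
    "code_execution",
    "url_context",
    "computer_use",
    "mcp_server",
    "file_search",
}

tools: List[Dict[str, Any]] = [
    {
        "type": "function",
        "name": "get_weather",  # hypothetical function name
        "description": "Returns the current weather for a location.",
        "parameters": {  # JSON schema describing the function's parameters
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
    {"type": "google_search"},
    {
        "type": "file_search",
        "file_search_store_names": ["my-store"],  # hypothetical store name
        "top_k": 5,
    },
]


def unknown_tool_types(declarations: List[Dict[str, Any]]) -> List[str]:
    """Return the type values that are not listed in the reference."""
    return [d.get("type", "") for d in declarations if d.get("type") not in KNOWN_TOOL_TYPES]
```

Checking the discriminators locally before sending a request is optional, but it catches typos such as `googlesearch` early.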

Turn

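Fields

role string (optional)

The originator of this turn. Must be `user` for input or `model` for model output.

content array (Content) or string (optional)

The content of the turn.

Examples

User turn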

{
  "role": "user",
  "content": [
    {
      "type": "text",
      "text": "user turn"
    }
  ]
}

Model turn

{
  "role": "model",
  "content": [
    {
      "type": "text",
      "text": "model turn"
    }
  ]
}
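
Turns like the two above can be collected into a list and passed as the `input` of a request, since `input` accepts an array of turns. The sketch below builds such a list. The helper name `text_turn` is hypothetical.

```python
from typing import Any, Dict, List


def text_turn(role: str, text: str) -> Dict[str, Any]:
    """Build a Turn that holds a single text content block."""
    # The reference allows only "user" (input) and "model" (model output).
    if role not in ("user", "model"):
        raise ValueError(f"role must be 'user' or 'model', got {role!r}")
    return {"role": role, "content": [{"type": "text", "text": text}]}


conversation: List[Dict[str, Any]] = [
    text_turn("user", "What is the capital of France?"),
    text_turn("model", "The capital of France is Paris."),
    text_turn("user", "And what is its population?"),
]
```

Replaying earlier turns in this way is one option for multi-turn conversations. The `previous_interaction_id` field appears to offer another, by referring to a stored interaction instead.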

InteractionSseEvent

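Possible types

Polymorphic discriminator: event_type

InteractionEvent

event_type enum (string) (optional)

No description provided.

Possible values:

interaction.start
interaction.complete

interaction Interaction (optional)

No description provided.

event_id string (optional)

The event_id token to use to resume the interaction stream from this event.

InteractionStatusUpdate

interaction_id string (optional)

No description provided.

status enum (string) (optional)

No description provided.

Possible values:

in_progress
requires_action
completed
failed
cancelled

event_type string (optional)

No description provided.

Always set to "interaction.status_update"

event_id string (optional)

The event_id token to use to resume the interaction stream from this event.

ContentStart

index integer (optional)

No description provided.

content Content (optional)

No description provided.

event_type string (optional)

No description provided.

Always set to "content.start"

event_id string (optional)

The event_id token to use to resume the interaction stream from this event.

ContentDelta

index integer (optional)

No description provided.

event_type string (optional)

No description provided.

Always set to "content.delta"

event_id string (optional)

The event_id token to use to resume the interaction stream from this event.

delta object (optional)

No description provided.

Possible types

Polymorphic discriminator: type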

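TextDelta

text string (optional)

No description provided.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "text"

annotations Annotation (optional)

Citation information for model-generated content.

Fields

start_index integer (optional)

Start of the segment of the response that is attributed to this source. The index marks the start of the segment, measured in bytes.

end_index integer (optional)

End of the attributed segment, exclusive.

source string (optional)

The source attributed for a portion of the text. Can be a URL, a title, or another identifier.

ImageDelta

data string (optional)

No description provided.

uri string (optional)

No description provided.

mime_type ImageMimeTypeOption (optional)

No description provided.

Possible values:

image/png
image/jpeg
image/webp
image/heic
image/heif

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "image"

resolution MediaResolution (optional)

The resolution of the media.

Possible values:

low
medium
high

AudioDelta

data string (optional)

No description provided.

uri string (optional)

No description provided.

mime_type AudioMimeTypeOption (optional)

No description provided.

Possible values:

audio/wav
audio/mp3
audio/aiff
audio/aac
audio/ogg
audio/flac

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "audio"

DocumentDelta

data string (optional)

No description provided.

uri string (optional)

No description provided.

mime_type string (optional)

No description provided.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "document"

VideoDelta

data string (optional)

No description provided.

uri string (optional)

No description provided.

mime_type VideoMimeTypeOption (optional)

No description provided.

Possible values:

video/mp4
video/mpeg
video/mov
video/avi
video/x-flv
video/mpg
video/webm
video/wmv
video/3gpp

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "video"

resolution MediaResolution (optional)

The resolution of the media.

Possible values:

low
medium
high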

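ThoughtSummaryDelta

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "thought_summary"

content ImageContent or TextContent (optional)

No description provided.

ThoughtSignatureDelta

signature string (optional)

Signature to match the backend source to be part of the generation.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "thought_signature"

FunctionCallDelta

name string (optional)

No description provided.

arguments object (optional)

No description provided.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "function_call"

id string (optional)

A unique ID for this specific tool call.

FunctionResultDelta

name string (optional)

No description provided.

is_error boolean (optional)

No description provided.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "function_result"

result object or string (optional)

Tool call result delta.

call_id string (optional)

ID to match the ID from the function call block.

CodeExecutionCallDelta

arguments CodeExecutionCallArguments (optional)

No description provided.

Fields

language enum (string) (optional)

The programming language of the `code`.

Possible values:

python

code string (optional)

The code to be executed.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "code_execution_call"

id string (optional)

A unique ID for this specific tool call.

CodeExecutionResultDelta

result string (optional)

No description provided.

is_error boolean (optional)

No description provided.

signature string (optional)

No description provided.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "code_execution_result"

call_id string (optional)

ID to match the ID from the function call block.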

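UrlContextCallDelta

arguments UrlContextCallArguments (optional)

No description provided.

Fields

urls array (string) (optional)

The URLs to fetch.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "url_context_call"

id string (optional)

A unique ID for this specific tool call.

UrlContextResultDelta

signature string (optional)

No description provided.

result UrlContextResult (optional)

No description provided.

Fields

url string (optional)

The URL that was fetched.

status enum (string) (optional)

The status of the URL retrieval.

Possible values:

success
error
paywall
unsafe

is_error boolean (optional)

No description provided.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "url_context_result"

call_id string (optional)

ID to match the ID from the function call block.

GoogleSearchCallDelta

arguments GoogleSearchCallArguments (optional)

No description provided.

Fields

queries array (string) (optional)

Web search queries for the follow-up web search.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "google_search_call"

id string (optional)

A unique ID for this specific tool call.

GoogleSearchResultDelta

signature string (optional)

No description provided.

result GoogleSearchResult (optional)

No description provided.

Fields

url string (optional)

URI reference of the search result.

title string (optional)

Title of the search result.

rendered_content string (optional)

Web content snippet that can be embedded in a web page or an app WebView.

is_error boolean (optional)

No description provided.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "google_search_result"

call_id string (optional)

ID to match the ID from the function call block.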

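McpServerToolCallDelta

name string (optional)

No description provided.

server_name string (optional)

No description provided.

arguments object (optional)

No description provided.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "mcp_server_tool_call"

id string (optional)

A unique ID for this specific tool call.

McpServerToolResultDelta

name string (optional)

No description provided.

server_name string (optional)

No description provided.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "mcp_server_tool_result"

result object or string (optional)

Tool call result delta.

call_id string (optional)

ID to match the ID from the function call block.

FileSearchResultDelta

result FileSearchResult (optional)

No description provided.

Fields

title string (optional)

The title of the search result.

text string (optional)

The text of the search result.

file_search_store string (optional)

The name of the file search store.

type string (required)

Used as the OpenAPI type discriminator for the content oneof.

Always set to "file_search_result"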

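ContentStop

index integer (optional)

No description provided.

event_type string (optional)

No description provided.

Always set to "content.stop"

event_id string (optional)

The event_id token to use to resume the interaction stream from this event.

ErrorEvent

event_type string (optional)

No description provided.

Always set to "error"

error Error (optional)

No description provided.

Fields

code string (optional)

A URI that identifies the error type.

message string (optional)

A human-readable error message.

event_id string (optional)

The event_id token to use to resume the interaction stream from this event.

Examples

Interaction start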

{
  "event_type": "interaction.start",
  "interaction": {
    "id": "v1_ChdTMjQ0YWJ5TUF1TzcxZThQdjRpcnFRcxIXUzI0NGFieU1BdU83MWU4UHY0aXJxUXM",
    "model": "gemini-2.5-flash",
    "object": "interaction",
    "status": "in_progress"
  }
}

Interaction complete

{
  "event_type": "interaction.complete",
  "interaction": {
    "created": "2025-12-09T18:45:40Z",
    "id": "v1_ChdTMjQ0YWJ5TUF1TzcxZThQdjRpcnFRcxIXUzI0NGFieU1BdU83MWU4UHY0aXJxUXM",
    "model": "gemini-2.5-flash",
    "object": "interaction",
    "outputs": [
      {
        "signature": "CoMDAXLI2nynRYojJIy6B1Jh9os2crpWLfB0+19xcLsGG46bd8wjkF/6RNlRUdvHrXyjsHkG0BZFcuO/bPOyA6Xh5jANNgx82wPHjGExN8A4ZQn56FlMwyZoqFVQz0QyY1lfibFJ2zU3J87uw26OewzcuVX0KEcs+GIsZa3EA6WwqhbsOd3wtZB3Ua2Qf98VAWZTS5y/tWpql7jnU3/CU7pouxQr/Bwft3hwnJNesQ9/dDJTuaQ8Zprh9VRWf1aFFjpIueOjBRrlT3oW6/y/eRl/Gt9BQXCYTqg/38vHFUU4Wo/d9dUpvfCe/a3o97t2Jgxp34oFKcsVb4S5WJrykIkw+14DzVnTpCpbQNFckqvFLuqnJCkL0EQFtunBXI03FJpPu3T1XU6id8S7ojoJQZSauGUCgmaLqUGdMrd08oo81ecoJSLs51Re9N/lISGmjWFPGpqJLoGq6uo4FHz58hmeyXCgHG742BHz2P3MiH1CXHUT2J8mF6zLhf3SR9Qb3lkrobAh",
        "type": "thought"
      },
      {
        "text": "Elara\u2019s life was a symphony of quiet moments. A librarian, she found solace in the hushed aisles, the scent of aged paper, and the predictable rhythm of her days. Her small apartment, meticulously ordered, reflected this internal calm, save",
        "type": "text"
      },
      {
        "text": " for one beloved anomaly: a chipped porcelain teacup, inherited from her grandmother, which held her morning Earl Grey.\n\nOne Tuesday, stirring her tea, Elara paused. At the bottom, nestled against the porcelain, was a star.",
        "type": "text"
      },
      {
        "text": " Not a star-shaped tea leaf, but a miniature, perfectly formed celestial body, radiating a faint, cool luminescence. Before she could gasp, it dissolved, leaving only the amber swirl of her brew. She dismissed it as a trick of",
        "type": "text"
      },
      {
        "text": " tired eyes.\n\nBut the next morning, a gossamer-thin feather, smaller than an eyelash and shimmering with iridescent hues, floated on the surface. It vanished the moment she tried to touch it. A week later, a single,",
        "type": "text"
      },
      {
        "text": " impossibly delicate bloom, like spun moonbeam, unfolded in her cup before fading into nothingness.\n\nThese weren't illusions. Each day, Elara\u2019s chipped teacup offered a fleeting, exquisite secret. A tiny, perfect",
        "type": "text"
      },
      {
        "text": " crystal, a miniature spiral nebula, a fragment of rainbow caught in liquid form. They never lingered, never accumulated, simply *were* and then *weren't*, leaving behind a residue of quiet wonder.\n\nElara never spoke",
        "type": "text"
      },
      {
        "text": " of it. It was her private wellspring, a daily reminder that magic could exist in the smallest, most overlooked corners of the world. Her routine remained unchanged, her external life a picture of calm, but inside, a secret garden blo",
        "type": "text"
      },
      {
        "text": "omed. Each dawn brought not just tea, but the silent promise of extraordinary beauty, waiting patiently in a chipped teacup.",
        "type": "text"
      }
    ],
    "role": "model",
    "status": "completed",
    "updated": "2025-12-09T18:45:40Z",
    "usage": {
      "input_tokens_by_modality": [
        {
          "modality": "text",
          "tokens": 11
        }
      ],
      "total_cached_tokens": 0,
      "total_input_tokens": 11,
      "total_output_tokens": 364,
      "total_reasoning_tokens": 1120,
      "total_tokens": 1495,
      "total_tool_use_tokens": 0
    }
  }
}

Interaction status update

{
  "event_type": "interaction.status_update",
  "interaction_id": "v1_ChdTMjQ0YWJ5TUF1TzcxZThQdjRpcnFRcxIXUzI0NGFieU1BdU83MWU4UHY0aXJxUXM",
  "status": "in_progress"
}

Content start

{
  "event_type": "content.start",
  "content": {
    "type": "text"
  },
  "index": 1
}

Content delta

{
  "event_type": "content.delta",
  "delta": {
    "type": "text",
    "text": "Elara\u2019s life was a symphony of quiet moments. A librarian, she found solace in the hushed aisles, the scent of aged paper, and the predictable rhythm of her days. Her small apartment, meticulously ordered, reflected this internal calm, save"
  },
  "index": 1
}

Content stop

{
  "event_type": "content.stop",
  "index": 1
}

Error event

{
  "event_type": "error",
  "error": {
    "message": "Failed to get completed interaction: Result not found.",
    "code": "not_found"
  }
}
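
The event types above can be folded into a final result on the client side. The following sketch consumes events that have already been parsed into dictionaries, concatenates text deltas by `index`, tracks the latest status, and remembers the last `event_id` so that a dropped stream could be resumed through `last_event_id`. The function name `assemble_stream` and the decision to raise on an `error` event are hypothetical choices. Parsing the wire format of the stream is out of scope here.

```python
from typing import Any, Dict, Iterable, Optional


def assemble_stream(events: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Fold a sequence of parsed stream events into text, status, and resume token."""
    texts: Dict[int, str] = {}
    status: Optional[str] = None
    last_event_id: Optional[str] = None

    for event in events:
        # Remember the most recent event_id for a later resume request.
        if "event_id" in event:
            last_event_id = event["event_id"]

        kind = event.get("event_type")
        if kind == "content.delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text":
                index = event.get("index", 0)
                texts[index] = texts.get(index, "") + delta.get("text", "")
        elif kind == "interaction.status_update":
            status = event.get("status", status)
        elif kind in ("interaction.start", "interaction.complete"):
            status = event.get("interaction", {}).get("status", status)
        elif kind == "error":
            raise RuntimeError(event.get("error", {}).get("message", "stream error"))

    return {"texts": texts, "status": status, "last_event_id": last_event_id}
```

Keying the accumulated text by `index` keeps separate content blocks apart, which matters when a stream interleaves thought and text content.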