Understand and count tokens

Note: This version of the page covers the new Interactions API, which is currently in Beta.
For stable production deployments, we recommend that you continue to use the generateContent API.

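Gemini and other generative AI models process input and output at a granularity called a token.

For Gemini models, one token is roughly equivalent to 4 characters, and 100 tokens correspond to roughly 60-80 English words.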
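The rule of thumb above can be turned into a quick local estimate for cases where calling the API is not yet practical. The sketch below assumes the ratio of roughly 4 characters per token holds for the text at hand; `estimate_tokens` is a hypothetical helper, not part of any SDK, and the real count returned by the API may differ, especially for non-English text.

```python
import math

# Rule of thumb from the text: one token is roughly 4 characters.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for `text` (hypothetical helper)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


prompt = "The quick brown fox jumps over the lazy dog."
print("Estimated tokens:", estimate_tokens(prompt))
```

Treat the result as a ballpark figure only; use the token counting methods described on this page when an exact number matters.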

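About tokens

A token can be a single character (for example, z) or a whole word (for example, cat). Long words are split into several tokens. The set of all tokens used by the model is called the vocabulary, and the process of splitting text into tokens is called tokenization.

When billing is enabled, the cost of a call to the Gemini API depends in part on the number of input and output tokens, so knowing how to count tokens can be helpful.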
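To make the link between token counts and cost concrete, here is a minimal sketch of the arithmetic. The per-million-token prices are placeholders chosen for illustration, not actual Gemini pricing, and `estimate_cost` is a hypothetical helper. It also ignores anything beyond plain input and output tokens, which is why the text says cost depends only "in part" on these counts.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Estimate the cost of one request from its token counts (hypothetical helper)."""
    return (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    ) / 1_000_000


# Placeholder prices in USD per million tokens; check the pricing page for real values.
cost = estimate_cost(
    input_tokens=12_000,
    output_tokens=800,
    input_price_per_million=1.00,
    output_price_per_million=4.00,
)
print(f"Estimated cost: ${cost:.4f}")
```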

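Count tokens

All input to and output from the Gemini API is tokenized, including text, image files, and other non-text modalities.

You can count tokens in the following ways:

Call count_tokens with the input of the request. This returns the total number of tokens in the input only. Make this call before sending the input to check the size of your request.
Use the usage attribute on the interaction response. This returns token counts for the input (total_input_tokens), output (total_output_tokens), thinking (total_thought_tokens), cached content (total_cached_tokens), tool use (total_tool_use_tokens), and the overall total (total_tokens).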
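When logging usage, it can be convenient to flatten the fields listed above into a plain dictionary. The sketch below is an assumption-laden illustration: `usage_to_dict` is a hypothetical helper, and a `SimpleNamespace` with made-up numbers stands in for a real `interaction.usage` object so the example runs without an API key. It treats any missing or `None` field as 0, which may or may not match how the real object reports unused categories.

```python
from types import SimpleNamespace

# Field names as listed in the text above.
USAGE_FIELDS = (
    "total_input_tokens",
    "total_output_tokens",
    "total_thought_tokens",
    "total_cached_tokens",
    "total_tool_use_tokens",
    "total_tokens",
)


def usage_to_dict(usage) -> dict:
    """Collect the documented usage fields, treating missing or None values as 0."""
    return {name: getattr(usage, name, None) or 0 for name in USAGE_FIELDS}


# Stand-in for interaction.usage; the numbers are invented for illustration.
fake_usage = SimpleNamespace(
    total_input_tokens=11,
    total_output_tokens=42,
    total_tokens=53,
)

for name, value in usage_to_dict(fake_usage).items():
    print(f"{name}: {value}")
```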

Count text tokens

Python

from google import genai

client = genai.Client()
prompt = "The quick brown fox jumps over the lazy dog."

# Count tokens before sending
count_response = client.models.count_tokens(
    model="gemini-3-flash-preview",
    contents=prompt
)
# count_tokens returns a response object; the count is in its total_tokens field
print("total_tokens:", count_response.total_tokens)

# Get usage from interaction
interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input=prompt
)
print(interaction.usage)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});
const prompt = "The quick brown fox jumps over the lazy dog.";

// Count tokens before sending
const countResponse = await client.models.countTokens({
    model: "gemini-3-flash-preview",
    contents: prompt,
});
console.log(countResponse.totalTokens);

// Get usage from interaction
const interaction = await client.interactions.create({
    model: "gemini-3-flash-preview",
    input: prompt,
});
console.log(interaction.usage);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-flash-preview:countTokens" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"contents": [{"parts": [{"text": "The quick brown fox."}]}]}'

Count multi-turn conversation tokens

Use previous_interaction_id to count tokens across the entire conversation history:

Python

# First interaction
interaction1 = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Hi, my name is Bob"
)

# Second interaction continues the conversation
interaction2 = client.interactions.create(
    model="gemini-3-flash-preview",
    input="What's my name?",
    previous_interaction_id=interaction1.id
)

# Usage includes tokens from both turns
print(f"Input tokens: {interaction2.usage.total_input_tokens}")
print(f"Output tokens: {interaction2.usage.total_output_tokens}")
print(f"Total tokens: {interaction2.usage.total_tokens}")

JavaScript

// First interaction
const interaction1 = await client.interactions.create({
    model: "gemini-3-flash-preview",
    input: "Hi, my name is Bob"
});

// Second interaction continues the conversation
const interaction2 = await client.interactions.create({
    model: "gemini-3-flash-preview",
    input: "What's my name?",
    previousInteractionId: interaction1.id
});

console.log(`Input tokens: ${interaction2.usage.totalInputTokens}`);
console.log(`Output tokens: ${interaction2.usage.totalOutputTokens}`);

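Count multimodal tokens

All input to the Gemini API is tokenized, including images, video, and audio. Key points about tokenization:

Images: an image counts as 258 tokens if both of its dimensions are less than or equal to 384 pixels. Larger images are tiled into 768x768 pixel tiles, and each tile counts as 258 tokens.
Video: 263 tokens per second
Audio: 32 tokens per second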
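The figures above lend themselves to a quick back-of-the-envelope estimate. The sketch below uses only the numbers stated in the list; the helper names are hypothetical, and the image estimate assumes a plain grid of 768x768 tiles with no cropping or rescaling, which is a simplification of whatever the service actually does. Use the API's token count when you need the exact value.

```python
import math

TOKENS_PER_TILE = 258          # tokens per small image or per tile
SMALL_IMAGE_MAX = 384          # both dimensions <= this: one fixed-cost image
TILE_SIZE = 768                # larger images are tiled into 768x768 tiles
VIDEO_TOKENS_PER_SECOND = 263
AUDIO_TOKENS_PER_SECOND = 32


def estimate_image_tokens(width: int, height: int) -> int:
    """Estimate image tokens, assuming a simple grid of tiles (a simplification)."""
    if width <= SMALL_IMAGE_MAX and height <= SMALL_IMAGE_MAX:
        return TOKENS_PER_TILE
    tiles = math.ceil(width / TILE_SIZE) * math.ceil(height / TILE_SIZE)
    return tiles * TOKENS_PER_TILE


def estimate_video_tokens(seconds: float) -> int:
    """Estimate video tokens from duration in seconds."""
    return math.ceil(seconds * VIDEO_TOKENS_PER_SECOND)


def estimate_audio_tokens(seconds: float) -> int:
    """Estimate audio tokens from duration in seconds."""
    return math.ceil(seconds * AUDIO_TOKENS_PER_SECOND)


print("300x300 image:", estimate_image_tokens(300, 300))
print("1536x768 image:", estimate_image_tokens(1536, 768))
print("60 s video:", estimate_video_tokens(60))
print("60 s audio:", estimate_audio_tokens(60))
```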

Image tokens

Python

uploaded_file = client.files.upload(file="path/to/image.jpg")

# Count tokens for image + text
count_response = client.models.count_tokens(
    model="gemini-3-flash-preview",
    contents=["Tell me about this image", uploaded_file]
)
print(f"Total tokens: {count_response.total_tokens}")

# Generate with image
interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input=[
        {"type": "text", "text": "Tell me about this image"},
        {"type": "image", "uri": uploaded_file.uri, "mime_type": uploaded_file.mime_type}
    ]
)
print(interaction.usage)

JavaScript

const uploadedFile = await client.files.upload({
    file: "path/to/image.jpg",
    config: { mimeType: "image/jpeg" }
});

// Count tokens
const countResponse = await client.models.countTokens({
    model: "gemini-3-flash-preview",
    contents: [
        { text: "Tell me about this image" },
        { fileData: { fileUri: uploadedFile.uri, mimeType: uploadedFile.mimeType } }
    ]
});
console.log(countResponse.totalTokens);

Inline data example:

Python

import base64

with open('image.jpg', 'rb') as f:
    image_bytes = f.read()

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input=[
        {"type": "text", "text": "Describe this image"},
        {
            "type": "image",
            "data": base64.b64encode(image_bytes).decode('utf-8'),
            "mime_type": "image/jpeg"
        }
    ]
)
print(interaction.usage)

Video tokens

Python

import time

video_file = client.files.upload(file="path/to/video.mp4")

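# Poll until the uploaded video has finished processing
while not video_file.state or video_file.state.name != "ACTIVE":
    # Stop instead of polling forever if processing fails
    if video_file.state and video_file.state.name == "FAILED":
        raise RuntimeError("Video processing failed")
    print("Processing video...")
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)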

# A 60-second video is approximately 263 * 60 = 15,780 tokens
count_response = client.models.count_tokens(
    model="gemini-3-flash-preview",
    contents=["Summarize this video", video_file]
)
print(f"Total tokens: {count_response.total_tokens}")

# Generate with video
interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input=[
        {"type": "text", "text": "Summarize this video"},
        {"type": "video", "uri": video_file.uri, "mime_type": video_file.mime_type}
    ]
)
print(interaction.usage)

Audio tokens

Python

audio_file = client.files.upload(file="path/to/audio.mp3")

# A 60-second audio clip is approximately 32 * 60 = 1,920 tokens
count_response = client.models.count_tokens(
    model="gemini-3-flash-preview",
    contents=["Transcribe this audio", audio_file]
)
print(f"Total tokens: {count_response.total_tokens}")

# Generate with audio
interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input=[
        {"type": "text", "text": "Transcribe this audio"},
        {"type": "audio", "uri": audio_file.uri, "mime_type": audio_file.mime_type}
    ]
)
print(interaction.usage)

Count system instruction tokens

System instructions are counted as part of the input tokens:

Python

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Hello!",
    system_instruction="You are a helpful assistant who speaks like a pirate."
)

# system_instruction tokens included in total_input_tokens
print(f"Input tokens: {interaction.usage.total_input_tokens}")

Count tool tokens

Tools (functions, code execution, Google Search) are also counted:

Python

tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"}
            }
        }
    }
]

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input="What's the weather in Tokyo?",
    tools=tools
)

print(f"Input tokens: {interaction.usage.total_input_tokens}")
print(f"Tool use tokens: {interaction.usage.total_tool_use_tokens}")

Context windows

Each Gemini model has a maximum number of tokens it can process. The context window defines the combined limit on input and output tokens.
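One practical use of this limit is a pre-flight check: before sending a request, confirm that the input tokens plus the output you plan to allow fit inside the window. The sketch below follows the "combined limit" description above; the function names and the window size of 1,000,000 tokens are hypothetical values for illustration, not properties of any particular model.

```python
def fits_context_window(
    input_tokens: int, max_output_tokens: int, context_window: int
) -> bool:
    """Return True if input plus planned output stays within the context window."""
    return input_tokens + max_output_tokens <= context_window


def remaining_tokens(
    input_tokens: int, max_output_tokens: int, context_window: int
) -> int:
    """Return how many tokens are left over (negative means over the limit)."""
    return context_window - (input_tokens + max_output_tokens)


# Hypothetical numbers; look up the real limits for the model you use.
CONTEXT_WINDOW = 1_000_000

print(fits_context_window(900_000, 50_000, CONTEXT_WINDOW))
print(remaining_tokens(990_000, 50_000, CONTEXT_WINDOW))
```

In practice, the input count would come from a token counting call made before the request, and the output allowance from whatever maximum you configure.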

Get the context window size programmatically

Python

model_info = client.models.get(model="gemini-3-flash-preview")
print(f"Input token limit: {model_info.input_token_limit}")
print(f"Output token limit: {model_info.output_token_limit}")

JavaScript

const modelInfo = await client.models.get({ model: "gemini-3-flash-preview" });
console.log(`Input token limit: ${modelInfo.inputTokenLimit}`);
console.log(`Output token limit: ${modelInfo.outputTokenLimit}`);

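You can also find context window sizes on the models page.

Next steps

Text generation: generation basics
Caching: reduce costs with caching
Pricing: understand costs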