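Understand and count tokens

Gemini and other generative AI models process input and output at a granularity called a token.

About tokens

Tokens can be single characters like z or whole words like cat. Long words are broken up into several tokens. The set of all tokens used by the model is called the vocabulary, and the process of splitting text into tokens is called tokenization.

For Gemini models, a token is equivalent to about 4 characters. 100 tokens are equal to about 60-80 English words.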
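The 4-characters-per-token rule of thumb can be turned into a quick estimate. The sketch below is only an approximation for English text; the names CHARS_PER_TOKEN and estimateTokens are illustrative and not part of any SDK, and the real count for a request comes from countTokens.

```javascript
// Rough heuristic from the text above: one token is about 4 characters.
// CHARS_PER_TOKEN and estimateTokens are illustrative names, not SDK APIs.
const CHARS_PER_TOKEN = 4;

// Returns an approximate token count for English text.
// Note: String.length counts UTF-16 code units, so this is rougher still
// for text with many non-ASCII characters.
function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

const sample = "The quick brown fox jumps over the lazy dog.";
console.log(estimateTokens(sample)); // 44 characters, so the estimate is 11
```

An estimate like this is useful for quick budgeting only; the tokenizer may produce a noticeably different count.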

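When billing is enabled, the cost of a call to the Gemini API is determined in part by the number of input and output tokens, so knowing how to count tokens can be helpful.

Count tokens

All input to and output from the Gemini API is tokenized, including text, image files, and other non-text modalities.

You can count tokens in the following ways:

Call countTokens with the input of the request.
This returns the total number of tokens in the input only. You can make this call before sending the input to the model to check the size of your request.
Use the usageMetadata attribute on the response object after calling generateContent.
This returns the total number of tokens in both the input and the output: totalTokenCount.
It also returns the token counts of the input and output separately: promptTokenCount (input tokens) and candidatesTokenCount (output tokens).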
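The two approaches above can be wrapped in small helpers. This is a minimal sketch, not an official pattern: checkInputSize, summarizeUsage, and the maxInputTokens budget are hypothetical names, and the ai argument is assumed to be a GoogleGenAI client that exposes models.countTokens.

```javascript
// Hypothetical helper: counts the input tokens before a request is sent and
// compares the result with a budget chosen by the caller.
// `ai` is assumed to be a GoogleGenAI client from '@google/genai'.
async function checkInputSize(ai, model, contents, maxInputTokens) {
  const { totalTokens } = await ai.models.countTokens({ model, contents });
  return { totalTokens, withinBudget: totalTokens <= maxInputTokens };
}

// Hypothetical helper: pulls the three counts described above out of a
// response's usageMetadata, defaulting any missing field to 0.
function summarizeUsage(usageMetadata) {
  const {
    promptTokenCount = 0,
    candidatesTokenCount = 0,
    totalTokenCount = 0,
  } = usageMetadata ?? {};
  return {
    input: promptTokenCount,
    output: candidatesTokenCount,
    total: totalTokenCount,
  };
}
```

With a real client, a call might look like `await checkInputSize(ai, "gemini-2.0-flash", prompt, 1000)`, followed by `summarizeUsage(response.usageMetadata)` once the response arrives.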

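Count text tokens

If you call countTokens with a text-only input, it returns the token count of the text in the input only (totalTokens). You can make this call before calling generateContent to check the size of your request.

Another option is to call generateContent and then use the usageMetadata attribute on the response object to get the following:

The separate token counts of the input (promptTokenCount) and the output (candidatesTokenCount)
The total number of tokens in both the input and the output (totalTokenCount)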

import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const prompt = "The quick brown fox jumps over the lazy dog.";

// Count the input tokens before sending the request.
const countTokensResponse = await ai.models.countTokens({
  model: "gemini-2.0-flash",
  contents: prompt,
});
console.log(countTokensResponse.totalTokens);

// Send the request, then read the input, output, and total token counts.
const generateResponse = await ai.models.generateContent({
  model: "gemini-2.0-flash",
  contents: prompt,
});
console.log(generateResponse.usageMetadata);

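Count multi-turn (chat) tokens

If you call countTokens with the chat history, it returns the total token count of the text from each role in the chat (totalTokens).

Another option is to call sendMessage and then use the usageMetadata attribute on the response object to get the following:

The separate token counts of the input (promptTokenCount) and the output (candidatesTokenCount)
The total number of tokens in both the input and the output (totalTokenCount)

To understand how big your next conversational turn will be, you need to append it to the history when you call countTokens.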

import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
// Initial chat history.
const history = [
  { role: "user", parts: [{ text: "Hi my name is Bob" }] },
  { role: "model", parts: [{ text: "Hi Bob!" }] },
];
const chat = ai.chats.create({
  model: "gemini-2.0-flash",
  history: history,
});

// Count tokens for the current chat history.
const countTokensResponse = await ai.models.countTokens({
  model: "gemini-2.0-flash",
  contents: chat.getHistory(),
});
console.log(countTokensResponse.totalTokens);

const chatResponse = await chat.sendMessage({
  message: "In one sentence, explain how a computer works to a young child.",
});
console.log(chatResponse.usageMetadata);

// Add an extra user message to the history.
const extraMessage = {
  role: "user",
  parts: [{ text: "What is the meaning of life?" }],
};
const combinedHistory = chat.getHistory();
combinedHistory.push(extraMessage);
const combinedCountTokensResponse = await ai.models.countTokens({
  model: "gemini-2.0-flash",
  contents: combinedHistory,
});
console.log(
  "Combined history token count:",
  combinedCountTokensResponse.totalTokens,
);

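Count multimodal tokens

All input to the Gemini API is tokenized, including text, image files, and other non-text modalities. Note the following key points about how multimodal input is tokenized during processing by the Gemini API:

With Gemini 2.0, image inputs with both dimensions less than or equal to 384 pixels are counted as 258 tokens. Images that are larger in one or both dimensions are cropped and scaled as needed into tiles of 768x768 pixels, each counted as 258 tokens. Prior to Gemini 2.0, images used a fixed 258 tokens.
Video and audio files are converted to tokens at the following fixed rates: video at 263 tokens per second and audio at 32 tokens per second.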
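The fixed rates above lend themselves to a back-of-the-envelope estimate. The sketch below uses only the numbers stated in the text; all constant and function names are illustrative. The text does not specify how many 768x768 tiles a large image yields, so the tile count is left as an input rather than computed, and countTokens remains the authoritative source.

```javascript
// Figures taken from the text above; names are illustrative, not SDK APIs.
const TOKENS_PER_IMAGE_TILE = 258;
const SMALL_IMAGE_MAX_DIMENSION = 384;
const VIDEO_TOKENS_PER_SECOND = 263;
const AUDIO_TOKENS_PER_SECOND = 32;

// True when both dimensions are at most 384 pixels (a single 258-token unit).
function isSmallImage(width, height) {
  return (
    width <= SMALL_IMAGE_MAX_DIMENSION && height <= SMALL_IMAGE_MAX_DIMENSION
  );
}

// Tokens for an image given how many tiles it is split into.
// Assumption: the caller supplies tileCount; small images use 1.
function estimateImageTokens(tileCount = 1) {
  return tileCount * TOKENS_PER_IMAGE_TILE;
}

// Tokens for a video or audio clip of the given duration in seconds.
function estimateVideoTokens(seconds) {
  return Math.ceil(seconds * VIDEO_TOKENS_PER_SECOND);
}

function estimateAudioTokens(seconds) {
  return Math.ceil(seconds * AUDIO_TOKENS_PER_SECOND);
}

console.log(estimateVideoTokens(60)); // 60 * 263 = 15780
console.log(estimateAudioTokens(60)); // 60 * 32 = 1920
```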

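Image files

If you call countTokens with a text-and-image input, it returns the combined token count of the text and the image in the input only (totalTokens). You can make this call before calling generateContent to check the size of your request. Optionally, you can also call countTokens on the text and the file separately.

Another option is to call generateContent and then use the usageMetadata attribute on the response object to get the following:

The separate token counts of the input (promptTokenCount) and the output (candidatesTokenCount)
The total number of tokens in both the input and the output (totalTokenCount)

Example that uses an image uploaded with the File API: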

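import { GoogleGenAI, createUserContent, createPartFromUri } from "@google/genai";
import path from "node:path";

// Assumption: `media` is a local directory that contains organ.jpg; adjust as needed.
const media = "media";
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });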
const prompt = "Tell me about this image";
const organ = await ai.files.upload({
  file: path.join(media, "organ.jpg"),
  config: { mimeType: "image/jpeg" },
});

const countTokensResponse = await ai.models.countTokens({
  model: "gemini-2.0-flash",
  contents: createUserContent([
    prompt,
    createPartFromUri(organ.uri, organ.mimeType),
  ]),
});
console.log(countTokensResponse.totalTokens);

const generateResponse = await ai.models.generateContent({
  model: "gemini-2.0-flash",
  contents: createUserContent([
    prompt,
    createPartFromUri(organ.uri, organ.mimeType),
  ]),
});
console.log(generateResponse.usageMetadata);

Example that provides the image as inline data:

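import { GoogleGenAI, createUserContent, createPartFromBase64 } from "@google/genai";
import fs from "node:fs";
import path from "node:path";

// Assumption: `media` is a local directory that contains organ.jpg; adjust as needed.
const media = "media";
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });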
const prompt = "Tell me about this image";
const imageBuffer = fs.readFileSync(path.join(media, "organ.jpg"));

// Convert buffer to base64 string.
const imageBase64 = imageBuffer.toString("base64");

// Build contents using createUserContent and createPartFromBase64.
const contents = createUserContent([
  prompt,
  createPartFromBase64(imageBase64, "image/jpeg"),
]);

const countTokensResponse = await ai.models.countTokens({
  model: "gemini-2.0-flash",
  contents: contents,
});
console.log(countTokensResponse.totalTokens);

const generateResponse = await ai.models.generateContent({
  model: "gemini-2.0-flash",
  contents: contents,
});
console.log(generateResponse.usageMetadata);

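Video or audio files

Audio and video are each converted to tokens at the following fixed rates:

Video: 263 tokens per second
Audio: 32 tokens per second

If you call countTokens with a text-and-video/audio input, it returns the combined token count of the text and the video/audio file in the input only (totalTokens). You can make this call before calling generateContent to check the size of your request. Optionally, you can also call countTokens on the text and the file separately.

Another option is to call generateContent and then use the usageMetadata attribute on the response object to get the following:

The separate token counts of the input (promptTokenCount) and the output (candidatesTokenCount)
The total number of tokens in both the input and the output (totalTokenCount)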

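import { GoogleGenAI, createUserContent, createPartFromUri } from "@google/genai";
import path from "node:path";
// Promise-based timer used as sleep(ms) in the polling loop below.
import { setTimeout as sleep } from "node:timers/promises";

// Assumption: `media` is a local directory that contains Big_Buck_Bunny.mp4; adjust as needed.
const media = "media";
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });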
const prompt = "Tell me about this video";
let videoFile = await ai.files.upload({
  file: path.join(media, "Big_Buck_Bunny.mp4"),
  config: { mimeType: "video/mp4" },
});

// Poll until the video file is completely processed (state becomes ACTIVE).
while (!videoFile.state || videoFile.state.toString() !== "ACTIVE") {
  console.log("Processing video...");
  console.log("File state: ", videoFile.state);
  await sleep(5000);
  videoFile = await ai.files.get({ name: videoFile.name });
}

const countTokensResponse = await ai.models.countTokens({
  model: "gemini-2.0-flash",
  contents: createUserContent([
    prompt,
    createPartFromUri(videoFile.uri, videoFile.mimeType),
  ]),
});
console.log(countTokensResponse.totalTokens);

const generateResponse = await ai.models.generateContent({
  model: "gemini-2.0-flash",
  contents: createUserContent([
    prompt,
    createPartFromUri(videoFile.uri, videoFile.mimeType),
  ]),
});
console.log(generateResponse.usageMetadata);

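System instructions and tools

System instructions and tools also count towards the total token count for the input.

If you use system instructions, the totalTokens count increases to reflect the addition of systemInstruction.

If you use function calling, the totalTokens count increases to reflect the addition of tools.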
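One way to observe this effect is to send the same request with and without a system instruction and compare the reported input token counts. The sketch below is an illustration under assumptions, not a documented recipe: systemInstructionTokenCost is a hypothetical helper, ai is assumed to be a GoogleGenAI client, and the instruction is assumed to be passed through config.systemInstruction on generateContent.

```javascript
// Hypothetical helper: estimates how many input tokens a system instruction
// adds by comparing promptTokenCount for two otherwise identical requests.
// Note: this makes two generateContent calls, and both count towards usage.
// Assumption: `ai` is a GoogleGenAI client and the system instruction is
// supplied through `config.systemInstruction`.
async function systemInstructionTokenCost(ai, model, contents, systemInstruction) {
  const baseline = await ai.models.generateContent({ model, contents });
  const withInstruction = await ai.models.generateContent({
    model,
    contents,
    config: { systemInstruction },
  });
  return (
    withInstruction.usageMetadata.promptTokenCount -
    baseline.usageMetadata.promptTokenCount
  );
}
```

With a real client, a call might look like `await systemInstructionTokenCost(ai, "gemini-2.0-flash", prompt, "You are a cat. Your name is Neko.")`. The same comparison could be made for tools by varying the tools entry in the request config instead.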