Counting tokens

Method: models.countTokens

Runs a model's tokenizer on input content and returns the token count.

Endpoint

post https://generativelanguage.googleapis.com/v1beta/{model=models/*}:countTokens

Path parameters

model string

Required. The model's resource name. This serves as an ID for the Model to use.

This name should match a model name returned by the models.list method.

Format: models/{model} It takes the form models/{model}.

Request body

The request body contains data with the following structure:

Fields
contents[] object (Content)

Optional. The input given to the model as a prompt. This field is ignored when generateContentRequest is set.

generateContentRequest object (GenerateContentRequest)

Optional. The overall input given to the model. models.countTokens will count prompt, function calling, etc.

Example request

Text

Python

model = genai.GenerativeModel("models/gemini-1.5-flash")

prompt = "The quick brown fox jumps over the lazy dog."

# Call `count_tokens` to get the input token count (`total_tokens`).
print("total_tokens: ", model.count_tokens(prompt))
# ( total_tokens: 10 )

response = model.generate_content(prompt)

# On the response for `generate_content`, use `usage_metadata`
# to get separate input and output token counts
# (`prompt_token_count` and `candidates_token_count`, respectively),
# as well as the combined token count (`total_token_count`).
print(response.usage_metadata)
# ( prompt_token_count: 11, candidates_token_count: 73, total_token_count: 84 )

Node.js

// Make sure to include these imports:
// import { GoogleGenerativeAI } from "@google/generative-ai";
const genAI = new GoogleGenerativeAI(process.env.API_KEY);
const model = genAI.getGenerativeModel({
  model: "gemini-1.5-flash",
});

// Count tokens in a prompt without calling text generation.
const countResult = await model.countTokens(
  "The quick brown fox jumps over the lazy dog.",
);

console.log(countResult.totalTokens); // 11

const generateResult = await model.generateContent(
  "The quick brown fox jumps over the lazy dog.",
);

// On the response for `generateContent`, use `usageMetadata`
// to get separate input and output token counts
// (`promptTokenCount` and `candidatesTokenCount`, respectively),
// as well as the combined token count (`totalTokenCount`).
console.log(generateResult.response.usageMetadata);
// { promptTokenCount: 11, candidatesTokenCount: 131, totalTokenCount: 142 }

Shell

curl https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:countTokens?key=$GOOGLE_API_KEY \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [{
        "parts":[{
          "text": "The quick brown fox jumps over the lazy dog."
          }],
        }],
      }'

Kotlin

val generativeModel =
    GenerativeModel(
        // Specify a Gemini model appropriate for your use case
        modelName = "gemini-1.5-flash",
        // Access your API key as a Build Configuration variable (see "Set up your API key" above)
        apiKey = BuildConfig.apiKey)

// For text-only input
val (totalTokens) = generativeModel.countTokens("Write a story about a magic backpack.")
print(totalTokens)

Swift

let generativeModel =
  GenerativeModel(
    // Specify a Gemini model appropriate for your use case
    name: "gemini-1.5-flash",
    // Access your API key from your on-demand resource .plist file (see "Set up your API key"
    // above)
    apiKey: APIKey.default
  )

let prompt = "Write a story about a magic backpack."

let response = try await generativeModel.countTokens(prompt)

print("Total Tokens: \(response.totalTokens)")

Dart

final model = GenerativeModel(
  model: 'gemini-1.5-flash',
  apiKey: apiKey,
);
final prompt = 'The quick brown fox jumps over the lazy dog.';
final tokenCount = await model.countTokens([Content.text(prompt)]);
print('Total tokens: ${tokenCount.totalTokens}');

Java

// Specify a Gemini model appropriate for your use case
GenerativeModel gm =
    new GenerativeModel(
        /* modelName */ "gemini-1.5-flash",
        // Access your API key as a Build Configuration variable (see "Set up your API key"
        // above)
        /* apiKey */ BuildConfig.apiKey);
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Content inputContent =
    new Content.Builder().addText("Write a story about a magic backpack.").build();

// For illustrative purposes only. You should use an executor that fits your needs.
Executor executor = Executors.newSingleThreadExecutor();

// For text-only input
ListenableFuture<CountTokensResponse> countTokensResponse = model.countTokens(inputContent);

Futures.addCallback(
    countTokensResponse,
    new FutureCallback<CountTokensResponse>() {
      @Override
      public void onSuccess(CountTokensResponse result) {
        int totalTokens = result.getTotalTokens();
        System.out.println("TotalTokens = " + totalTokens);
      }

      @Override
      public void onFailure(Throwable t) {
        t.printStackTrace();
      }
    },
    executor);

Chat

Python

model = genai.GenerativeModel("models/gemini-1.5-flash")

chat = model.start_chat(
    history=[
        {"role": "user", "parts": "Hi my name is Bob"},
        {"role": "model", "parts": "Hi Bob!"},
    ]
)
# Call `count_tokens` to get the input token count (`total_tokens`).
print(model.count_tokens(chat.history))
# ( total_tokens: 10 )

response = chat.send_message(
    "In one sentence, explain how a computer works to a young child."
)

# On the response for `send_message`, use `usage_metadata`
# to get separate input and output token counts
# (`prompt_token_count` and `candidates_token_count`, respectively),
# as well as the combined token count (`total_token_count`).
print(response.usage_metadata)
# ( prompt_token_count: 25, candidates_token_count: 21, total_token_count: 46 )

from google.generativeai.types.content_types import to_contents

# You can call `count_tokens` on the combined history and content of the next turn.
print(model.count_tokens(chat.history + to_contents("What is the meaning of life?")))
# ( total_tokens: 56 )

Node.js

// Make sure to include these imports:
// import { GoogleGenerativeAI } from "@google/generative-ai";
const genAI = new GoogleGenerativeAI(process.env.API_KEY);
const model = genAI.getGenerativeModel({
  model: "gemini-1.5-flash",
});

const chat = model.startChat({
  history: [
    {
      role: "user",
      parts: [{ text: "Hi my name is Bob" }],
    },
    {
      role: "model",
      parts: [{ text: "Hi Bob!" }],
    },
  ],
});

const countResult = await model.countTokens({
  generateContentRequest: { contents: await chat.getHistory() },
});
console.log(countResult.totalTokens); // 10

const chatResult = await chat.sendMessage(
  "In one sentence, explain how a computer works to a young child.",
);

// On the response for `sendMessage`, use `usageMetadata`
// to get separate input and output token counts
// (`promptTokenCount` and `candidatesTokenCount`, respectively),
// as well as the combined token count (`totalTokenCount`).
console.log(chatResult.response.usageMetadata);
// { promptTokenCount: 25, candidatesTokenCount: 22, totalTokenCount: 47 }

Shell

curl https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:countTokens?key=$GOOGLE_API_KEY \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [
        {"role": "user",
        "parts": [{"text": "Hi, my name is Bob."}],
        },
        {"role": "model",
         "parts":[{"text": "Hi Bob"}],
        },
      ],
      }'

Kotlin

val generativeModel =
    GenerativeModel(
        // Specify a Gemini model appropriate for your use case
        modelName = "gemini-1.5-flash",
        // Access your API key as a Build Configuration variable (see "Set up your API key" above)
        apiKey = BuildConfig.apiKey)

val chat =
    generativeModel.startChat(
        history =
            listOf(
                content(role = "user") { text("Hello, I have 2 dogs in my house.") },
                content(role = "model") {
                  text("Great to meet you. What would you like to know?")
                }))

val history = chat.history
val messageContent = content { text("This is the message I intend to send") }
val (totalTokens) = generativeModel.countTokens(*history.toTypedArray(), messageContent)
print(totalTokens)

Swift

let generativeModel =
  GenerativeModel(
    // Specify a Gemini model appropriate for your use case
    name: "gemini-1.5-flash",
    // Access your API key from your on-demand resource .plist file (see "Set up your API key"
    // above)
    apiKey: APIKey.default
  )

// Optionally specify existing chat history
let history = [
  ModelContent(role: "user", parts: "Hello, I have 2 dogs in my house."),
  ModelContent(role: "model", parts: "Great to meet you. What would you like to know?"),
]

// Initialize the chat with optional chat history
let chat = generativeModel.startChat(history: history)

let response = try await generativeModel.countTokens(chat.history + [
  ModelContent(role: "user", parts: "This is the message I intend to send"),
])
print("Total Tokens: \(response.totalTokens)")

Dart

final model = GenerativeModel(
  model: 'gemini-1.5-flash',
  apiKey: apiKey,
);
final chat = model.startChat(history: [
  Content.text('Hi my name is Bob'),
  Content.model([TextPart('Hi Bob!')])
]);
var tokenCount = await model.countTokens(chat.history);
print('Total tokens: ${tokenCount.totalTokens}');

final response = await chat.sendMessage(Content.text(
    'In one sentence, explain how a computer works to a young child.'));
if (response.usageMetadata case final usage?) {
  print('Prompt: ${usage.promptTokenCount}, '
      'Candidates: ${usage.candidatesTokenCount}, '
      'Total: ${usage.totalTokenCount}');
}

tokenCount = await model.countTokens(
    [...chat.history, Content.text('What is the meaning of life?')]);
print('Total tokens: ${tokenCount.totalTokens}');

Java

// Specify a Gemini model appropriate for your use case
GenerativeModel gm =
    new GenerativeModel(
        /* modelName */ "gemini-1.5-flash",
        // Access your API key as a Build Configuration variable (see "Set up your API key"
        // above)
        /* apiKey */ BuildConfig.apiKey);
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

// (optional) Create previous chat history for context
Content.Builder userContentBuilder = new Content.Builder();
userContentBuilder.setRole("user");
userContentBuilder.addText("Hello, I have 2 dogs in my house.");
Content userContent = userContentBuilder.build();

Content.Builder modelContentBuilder = new Content.Builder();
modelContentBuilder.setRole("model");
modelContentBuilder.addText("Great to meet you. What would you like to know?");
Content modelContent = userContentBuilder.build();

List<Content> history = Arrays.asList(userContent, modelContent);

// Initialize the chat
ChatFutures chat = model.startChat(history);

Content messageContent =
    new Content.Builder().addText("This is the message I intend to send").build();

Collections.addAll(history, messageContent);

// For illustrative purposes only. You should use an executor that fits your needs.
Executor executor = Executors.newSingleThreadExecutor();

ListenableFuture<CountTokensResponse> countTokensResponse =
    model.countTokens(history.toArray(new Content[0]));
Futures.addCallback(
    countTokensResponse,
    new FutureCallback<CountTokensResponse>() {
      @Override
      public void onSuccess(CountTokensResponse result) {
        System.out.println(result);
      }

      @Override
      public void onFailure(Throwable t) {
        t.printStackTrace();
      }
    },
    executor);

Inline media

Python

import PIL.Image

model = genai.GenerativeModel("models/gemini-1.5-flash")

prompt = "Tell me about this image"
your_image_file = PIL.Image.open("image.jpg")

# Call `count_tokens` to get the input token count
# of the combined text and file (`total_tokens`).
# An image's display or file size does not affect its token count.
# Optionally, you can call `count_tokens` for the text and file separately.
print(model.count_tokens([prompt, your_image_file]))
# ( total_tokens: 263 )

response = model.generate_content([prompt, your_image_file])

# On the response for `generate_content`, use `usage_metadata`
# to get separate input and output token counts
# (`prompt_token_count` and `candidates_token_count`, respectively),
# as well as the combined token count (`total_token_count`).
print(response.usage_metadata)
# ( prompt_token_count: 264, candidates_token_count: 80, total_token_count: 345 )

Node.js

// Make sure to include these imports:
// import { GoogleGenerativeAI } from "@google/generative-ai";
const genAI = new GoogleGenerativeAI(process.env.API_KEY);
const model = genAI.getGenerativeModel({
  model: "gemini-1.5-flash",
});

function fileToGenerativePart(path, mimeType) {
  return {
    inlineData: {
      data: Buffer.from(fs.readFileSync(path)).toString("base64"),
      mimeType,
    },
  };
}

const imagePart = fileToGenerativePart(
  `${mediaPath}/jetpack.jpg`,
  "image/jpeg",
);

const prompt = "Tell me about this image.";

// Call `countTokens` to get the input token count
// of the combined text and file (`totalTokens`).
// An image's display or file size does not affect its token count.
// Optionally, you can call `countTokens` for the text and file separately.
const countResult = await model.countTokens([prompt, imagePart]);
console.log(countResult.totalTokens); // 265

const generateResult = await model.generateContent([prompt, imagePart]);

// On the response for `generateContent`, use `usageMetadata`
// to get separate input and output token counts
// (`promptTokenCount` and `candidatesTokenCount`, respectively),
// as well as the combined token count (`totalTokenCount`).
console.log(generateResult.response.usageMetadata);
// { promptTokenCount: 265, candidatesTokenCount: 157, totalTokenCount: 422 }

Kotlin

val generativeModel =
    GenerativeModel(
        // Specify a Gemini model appropriate for your use case
        modelName = "gemini-1.5-flash",
        // Access your API key as a Build Configuration variable (see "Set up your API key" above)
        apiKey = BuildConfig.apiKey)

val image1: Bitmap = BitmapFactory.decodeResource(context.resources, R.drawable.image1)
val image2: Bitmap = BitmapFactory.decodeResource(context.resources, R.drawable.image2)

val multiModalContent = content {
  image(image1)
  image(image2)
  text("What's the difference between these pictures?")
}

val (totalTokens) = generativeModel.countTokens(multiModalContent)
print(totalTokens)

Swift

let generativeModel =
  GenerativeModel(
    // Specify a Gemini model appropriate for your use case
    name: "gemini-1.5-flash",
    // Access your API key from your on-demand resource .plist file (see "Set up your API key"
    // above)
    apiKey: APIKey.default
  )

guard let image1 = UIImage(systemName: "cloud.sun") else { fatalError() }
guard let image2 = UIImage(systemName: "cloud.heavyrain") else { fatalError() }

let prompt = "What's the difference between these pictures?"

let response = try await generativeModel.countTokens(image1, image2, prompt)
print("Total Tokens: \(response.totalTokens)")

Dart

final model = GenerativeModel(
  model: 'gemini-1.5-flash',
  apiKey: apiKey,
);

Future<DataPart> fileToPart(String mimeType, String path) async {
  return DataPart(mimeType, await File(path).readAsBytes());
}

final prompt = 'Tell me about this image';
final image = await fileToPart('image/jpeg', 'resources/organ.jpg');
final content = Content.multi([TextPart(prompt), image]);

// An image's display size does not affet its token count.
// Optionally, you can call `countTokens` for the prompt and file separately.
final tokenCount = await model.countTokens([content]);
print('Total tokens: ${tokenCount.totalTokens}');

final response = await model.generateContent([content]);
if (response.usageMetadata case final usage?) {
  print('Prompt: ${usage.promptTokenCount}, '
      'Candidates: ${usage.candidatesTokenCount}, '
      'Total: ${usage.totalTokenCount}');
}

Java

// Specify a Gemini model appropriate for your use case
GenerativeModel gm =
    new GenerativeModel(
        /* modelName */ "gemini-1.5-flash",
        // Access your API key as a Build Configuration variable (see "Set up your API key"
        // above)
        /* apiKey */ BuildConfig.apiKey);
GenerativeModelFutures model = GenerativeModelFutures.from(gm);
Content text = new Content.Builder().addText("Write a story about a magic backpack.").build();

// For illustrative purposes only. You should use an executor that fits your needs.
Executor executor = Executors.newSingleThreadExecutor();

// For text-and-image input
Bitmap image1 = BitmapFactory.decodeResource(context.getResources(), R.drawable.image1);
Bitmap image2 = BitmapFactory.decodeResource(context.getResources(), R.drawable.image2);

Content multiModalContent =
    new Content.Builder()
        .addImage(image1)
        .addImage(image2)
        .addText("What's different between these pictures?")
        .build();

ListenableFuture<CountTokensResponse> countTokensResponse =
    model.countTokens(multiModalContent);

Futures.addCallback(
    countTokensResponse,
    new FutureCallback<CountTokensResponse>() {
      @Override
      public void onSuccess(CountTokensResponse result) {
        int totalTokens = result.getTotalTokens();
        System.out.println("TotalTokens = " + totalTokens);
      }

      @Override
      public void onFailure(Throwable t) {
        t.printStackTrace();
      }
    },
    executor);

Files

Python

import time

model = genai.GenerativeModel("models/gemini-1.5-flash")

prompt = "Tell me about this video"
your_file = genai.upload_file(path=media / "Big_Buck_Bunny.mp4")

# Videos need to be processed before you can use them.
while your_file.state.name == "PROCESSING":
    print("processing video...")
    time.sleep(5)
    your_file = genai.get_file(your_file.name)

# Call `count_tokens` to get the input token count
# of the combined text and video/audio file (`total_tokens`).
# A video or audio file is converted to tokens at a fixed rate of tokens per second.
# Optionally, you can call `count_tokens` for the text and file separately.
print(model.count_tokens([prompt, your_file]))
# ( total_tokens: 300 )

response = model.generate_content([prompt, your_file])

# On the response for `generate_content`, use `usage_metadata`
# to get separate input and output token counts
# (`prompt_token_count` and `candidates_token_count`, respectively),
# as well as the combined token count (`total_token_count`).
print(response.usage_metadata)
# ( prompt_token_count: 301, candidates_token_count: 60, total_token_count: 361 )

Node.js

// Make sure to include these imports:
// import { GoogleAIFileManager, FileState } from "@google/generative-ai/server";
// import { GoogleGenerativeAI } from "@google/generative-ai";
const fileManager = new GoogleAIFileManager(process.env.API_KEY);

const uploadVideoResult = await fileManager.uploadFile(
  `${mediaPath}/Big_Buck_Bunny.mp4`,
  { mimeType: "video/mp4" },
);

let file = await fileManager.getFile(uploadVideoResult.file.name);
process.stdout.write("processing video");
while (file.state === FileState.PROCESSING) {
  process.stdout.write(".");
  // Sleep for 10 seconds
  await new Promise((resolve) => setTimeout(resolve, 10_000));
  // Fetch the file from the API again
  file = await fileManager.getFile(uploadVideoResult.file.name);
}

if (file.state === FileState.FAILED) {
  throw new Error("Video processing failed.");
} else {
  process.stdout.write("\n");
}

const videoPart = {
  fileData: {
    fileUri: uploadVideoResult.file.uri,
    mimeType: uploadVideoResult.file.mimeType,
  },
};

const genAI = new GoogleGenerativeAI(process.env.API_KEY);
const model = genAI.getGenerativeModel({
  model: "gemini-1.5-flash",
});

const prompt = "Tell me about this video.";

// Call `countTokens` to get the input token count
// of the combined text and file (`totalTokens`).
// An video or audio file's display or file size does not affect its token count.
// Optionally, you can call `countTokens` for the text and file separately.
const countResult = await model.countTokens([prompt, videoPart]);

console.log(countResult.totalTokens); // 302

const generateResult = await model.generateContent([prompt, videoPart]);

// On the response for `generateContent`, use `usageMetadata`
// to get separate input and output token counts
// (`promptTokenCount` and `candidatesTokenCount`, respectively),
// as well as the combined token count (`totalTokenCount`).
console.log(generateResult.response.usageMetadata);
// { promptTokenCount: 302, candidatesTokenCount: 46, totalTokenCount: 348 }

Cache

Python

import time

model = genai.GenerativeModel("models/gemini-1.5-flash")

your_file = genai.upload_file(path=media / "a11.txt")

cache = genai.caching.CachedContent.create(
    model="models/gemini-1.5-flash-001",
    # You can set the system_instruction and tools
    system_instruction=None,
    tools=None,
    contents=["Here the Apollo 11 transcript:", your_file],
)

model = genai.GenerativeModel.from_cached_content(cache)

prompt = "Please give a short summary of this file."

# Call `count_tokens` to get input token count
# of the combined text and file (`total_tokens`).
# A video or audio file is converted to tokens at a fixed rate of tokens per second.
# Optionally, you can call `count_tokens` for the text and file separately.
print(model.count_tokens(prompt))
# ( total_tokens: 9 )

response = model.generate_content(prompt)

# On the response for `generate_content`, use `usage_metadata`
# to get separate input and output token counts
# (`prompt_token_count` and `candidates_token_count`, respectively),
# as well as the cached content token count and the combined total token count.
print(response.usage_metadata)
# ( prompt_token_count: 323393, cached_content_token_count: 323383, candidates_token_count: 64)
# ( total_token_count: 323457 )

cache.delete()

Node.js

// Make sure to include these imports:
// import { GoogleAIFileManager, GoogleAICacheManager } from "@google/generative-ai/server";
// import { GoogleGenerativeAI } from "@google/generative-ai";

// Upload large text file.
const fileManager = new GoogleAIFileManager(process.env.API_KEY);
const uploadResult = await fileManager.uploadFile(`${mediaPath}/a11.txt`, {
  mimeType: "text/plain",
});

// Create a cache that uses the uploaded file.
const cacheManager = new GoogleAICacheManager(process.env.API_KEY);
const cacheResult = await cacheManager.create({
  ttlSeconds: 600,
  model: "models/gemini-1.5-flash-001",
  contents: [
    {
      role: "user",
      parts: [{ text: "Here's the Apollo 11 transcript:" }],
    },
    {
      role: "user",
      parts: [
        {
          fileData: {
            fileUri: uploadResult.file.uri,
            mimeType: uploadResult.file.mimeType,
          },
        },
      ],
    },
  ],
});

const genAI = new GoogleGenerativeAI(process.env.API_KEY);
const model = genAI.getGenerativeModelFromCachedContent(cacheResult);

const prompt = "Please give a short summary of this file.";

// Call `countTokens` to get the input token count
// of the combined text and file (`totalTokens`).
const result = await model.countTokens(prompt);

console.log(result.totalTokens); // 10

const generateResult = await model.generateContent(prompt);

// On the response for `generateContent`, use `usageMetadata`
// to get separate input and output token counts
// (`promptTokenCount` and `candidatesTokenCount`, respectively),
// as well as the cached content token count and the combined total
// token count.
console.log(generateResult.response.usageMetadata);
// {
//   promptTokenCount: 323396,
//   candidatesTokenCount: 113,
//   totalTokenCount: 323509,
//   cachedContentTokenCount: 323386
// }

await cacheManager.delete(cacheResult.name);

System Instruction

Python

model = genai.GenerativeModel(model_name="gemini-1.5-flash")

prompt = "The quick brown fox jumps over the lazy dog."

print(model.count_tokens(prompt))
# total_tokens: 10

model = genai.GenerativeModel(
    model_name="gemini-1.5-flash", system_instruction="You are a cat. Your name is Neko."
)

# The total token count includes everything sent to the `generate_content` request.
# When you use system instructions, the total token count increases.
print(model.count_tokens(prompt))
# ( total_tokens: 21 )

Node.js

// Make sure to include these imports:
// import { GoogleGenerativeAI } from "@google/generative-ai";
const genAI = new GoogleGenerativeAI(process.env.API_KEY);
const model = genAI.getGenerativeModel({
  model: "models/gemini-1.5-flash",
  systemInstruction: "You are a cat. Your name is Neko.",
});

const result = await model.countTokens(
  "The quick brown fox jumps over the lazy dog.",
);

console.log(result);
// {
//   totalTokens: 23,
//   systemInstructionsTokens: { partTokens: [ 11 ], roleTokens: 1 },
//   contentTokens: [ { partTokens: [Array], roleTokens: 1 } ]
// }

Kotlin

val generativeModel =
    GenerativeModel(
        // Specify a Gemini model appropriate for your use case
        modelName = "gemini-1.5-flash",
        // Access your API key as a Build Configuration variable (see "Set up your API key" above)
        apiKey = BuildConfig.apiKey,
        systemInstruction = content(role = "system") { text("You are a cat. Your name is Neko.")}
    )

// For text-only input
val (totalTokens) = generativeModel.countTokens("What is your name?")
print(totalTokens)

Swift

let generativeModel =
  GenerativeModel(
    // Specify a model that supports system instructions, like a Gemini 1.5 model
    name: "gemini-1.5-flash",
    // Access your API key from your on-demand resource .plist file (see "Set up your API key"
    // above)
    apiKey: APIKey.default,
    systemInstruction: ModelContent(role: "system", parts: "You are a cat. Your name is Neko.")
  )

let prompt = "What is your name?"

let response = try await generativeModel.countTokens(prompt)
print("Total Tokens: \(response.totalTokens)")

Dart

var model = GenerativeModel(
  model: 'gemini-1.5-flash',
  apiKey: apiKey,
);
final prompt = 'The quick brown fox jumps over the lazy dog.';

// The total token count includes everything sent in the `generateContent`
// request.
var tokenCount = await model.countTokens([Content.text(prompt)]);
print('Total tokens: ${tokenCount.totalTokens}');
model = GenerativeModel(
  model: 'gemini-1.5-flash',
  apiKey: apiKey,
  systemInstruction: Content.system('You are a cat. Your name is Neko.'),
);
tokenCount = await model.countTokens([Content.text(prompt)]);
print('Total tokens: ${tokenCount.totalTokens}');

Java

// Create your system instructions
Content systemInstruction =
    new Content.Builder().addText("You are a cat. Your name is Neko.").build();

// Specify a Gemini model appropriate for your use case
GenerativeModel gm =
    new GenerativeModel(
        /* modelName */ "gemini-1.5-flash",
        // Access your API key as a Build Configuration variable (see "Set up your API key"
        // above)
        /* apiKey */ BuildConfig.apiKey,
        /* generationConfig (optional) */ null,
        /* safetySettings (optional) */ null,
        /* requestOptions (optional) */ new RequestOptions(),
        /* tools (optional) */ null,
        /* toolsConfig (optional) */ null,
        /* systemInstruction (optional) */ systemInstruction);
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Content inputContent = new Content.Builder().addText("What's your name?.").build();

// For illustrative purposes only. You should use an executor that fits your needs.
Executor executor = Executors.newSingleThreadExecutor();

// For text-only input
ListenableFuture<CountTokensResponse> countTokensResponse = model.countTokens(inputContent);

Futures.addCallback(
    countTokensResponse,
    new FutureCallback<CountTokensResponse>() {
      @Override
      public void onSuccess(CountTokensResponse result) {
        int totalTokens = result.getTotalTokens();
        System.out.println("TotalTokens = " + totalTokens);
      }

      @Override
      public void onFailure(Throwable t) {
        t.printStackTrace();
      }
    },
    executor);

Tools

Python

model = genai.GenerativeModel(model_name="gemini-1.5-flash")

prompt = "I have 57 cats, each owns 44 mittens, how many mittens is that in total?"

print(model.count_tokens(prompt))
# ( total_tokens: 22 )

def add(a: float, b: float):
    """returns a + b."""
    return a + b

def subtract(a: float, b: float):
    """returns a - b."""
    return a - b

def multiply(a: float, b: float):
    """returns a * b."""
    return a * b

def divide(a: float, b: float):
    """returns a / b."""
    return a / b

model = genai.GenerativeModel(
    "models/gemini-1.5-flash-001", tools=[add, subtract, multiply, divide]
)

# The total token count includes everything sent to the `generate_content` request.
# When you use tools (like function calling), the total token count increases.
print(model.count_tokens(prompt))
# ( total_tokens: 206 )

Node.js

// Make sure to include these imports:
// import { GoogleGenerativeAI } from "@google/generative-ai";
const genAI = new GoogleGenerativeAI(process.env.API_KEY);

const functionDeclarations = [
  { name: "add" },
  { name: "subtract" },
  { name: "multiply" },
  { name: "divide" },
];

const model = genAI.getGenerativeModel({
  model: "models/gemini-1.5-flash",
  tools: [{ functionDeclarations }],
});

const result = await model.countTokens(
  "I have 57 cats, each owns 44 mittens, how many mittens is that in total?",
);

console.log(result);
// {
//   totalTokens: 99,
//   systemInstructionsTokens: {},
//   contentTokens: [ { partTokens: [Array], roleTokens: 1 } ],
//   toolTokens: [ { functionDeclarationTokens: [Array] } ]
// }

Kotlin

val multiplyDefinition = defineFunction(
    name = "multiply",
    description = "returns the product of the provided numbers.",
    parameters = listOf(
        Schema.double("a", "First number"),
        Schema.double("b", "Second number")
    )
)
val usableFunctions = listOf(multiplyDefinition)

val generativeModel =
    GenerativeModel(
        // Specify a Gemini model appropriate for your use case
        modelName = "gemini-1.5-flash",
        // Access your API key as a Build Configuration variable (see "Set up your API key" above)
        apiKey = BuildConfig.apiKey,
        tools = listOf(Tool(usableFunctions))
    )

// For text-only input
val (totalTokens) = generativeModel.countTokens("What's the product of 9 and 358?")
print(totalTokens)

Swift

let generativeModel =
  GenerativeModel(
    // Specify a model that supports system instructions, like a Gemini 1.5 model
    name: "gemini-1.5-flash",
    // Access your API key from your on-demand resource .plist file (see "Set up your API key"
    // above)
    apiKey: APIKey.default,
    tools: [Tool(functionDeclarations: [
      FunctionDeclaration(
        name: "controlLight",
        description: "Set the brightness and color temperature of a room light.",
        parameters: [
          "brightness": Schema(
            type: .number,
            format: "double",
            description: "Light level from 0 to 100. Zero is off and 100 is full brightness."
          ),
          "colorTemperature": Schema(
            type: .string,
            format: "enum",
            description: "Color temperature of the light fixture.",
            enumValues: ["daylight", "cool", "warm"]
          ),
        ],
        requiredParameters: ["brightness", "colorTemperature"]
      ),
    ])]
  )

let prompt = "Dim the lights so the room feels cozy and warm."

let response = try await generativeModel.countTokens(prompt)
print("Total Tokens: \(response.totalTokens)")

Dart

var model = GenerativeModel(
  model: 'gemini-1.5-flash',
  apiKey: apiKey,
);
final prompt = 'I have 57 cats, each owns 44 mittens, '
    'how many mittens is that in total?';

// The total token count includes everything sent in the `generateContent`
// request.
var tokenCount = await model.countTokens([Content.text(prompt)]);
print('Total tokens: ${tokenCount.totalTokens}');
final binaryFunction = Schema.object(
  properties: {
    'a': Schema.number(nullable: false),
    'b': Schema.number(nullable: false)
  },
  requiredProperties: ['a', 'b'],
);

model = GenerativeModel(
  model: 'gemini-1.5-flash',
  apiKey: apiKey,
  tools: [
    Tool(functionDeclarations: [
      FunctionDeclaration('add', 'returns a + b', binaryFunction),
      FunctionDeclaration('subtract', 'returns a - b', binaryFunction),
      FunctionDeclaration('multipley', 'returns a * b', binaryFunction),
      FunctionDeclaration('divide', 'returns a / b', binaryFunction)
    ])
  ],
);
tokenCount = await model.countTokens([Content.text(prompt)]);
print('Total tokens: ${tokenCount.totalTokens}');

Java

FunctionDeclaration multiplyDefinition =
    defineFunction(
        /* name  */ "multiply",
        /* description */ "returns a * b.",
        /* parameters */ Arrays.asList(
            Schema.numDouble("a", "First parameter"),
            Schema.numDouble("b", "Second parameter")),
        /* required */ Arrays.asList("a", "b"));

Tool tool = new Tool(Arrays.asList(multiplyDefinition), null);
;

// Specify a Gemini model appropriate for your use case
GenerativeModel gm =
    new GenerativeModel(
        /* modelName */ "gemini-1.5-flash",
        // Access your API key as a Build Configuration variable (see "Set up your API key"
        // above)
        /* apiKey */ BuildConfig.apiKey,
        /* generationConfig (optional) */ null,
        /* safetySettings (optional) */ null,
        /* requestOptions (optional) */ new RequestOptions(),
        /* tools (optional) */ Arrays.asList(tool));
GenerativeModelFutures model = GenerativeModelFutures.from(gm);

Content inputContent = new Content.Builder().addText("What's your name?.").build();

// For illustrative purposes only. You should use an executor that fits your needs.
Executor executor = Executors.newSingleThreadExecutor();

// For text-only input
ListenableFuture<CountTokensResponse> countTokensResponse = model.countTokens(inputContent);

Futures.addCallback(
    countTokensResponse,
    new FutureCallback<CountTokensResponse>() {
      @Override
      public void onSuccess(CountTokensResponse result) {
        int totalTokens = result.getTotalTokens();
        System.out.println("TotalTokens = " + totalTokens);
      }

      @Override
      public void onFailure(Throwable t) {
        t.printStackTrace();
      }
    },
    executor);

Response body

A response from models.countTokens.

It returns the model's tokenCount for the prompt.

If successful, the response body contains data with the following structure:

Fields
totalTokens integer

The number of tokens that the model tokenizes the prompt into.

Always non-negative. When cachedContent is set, this is still the total effective prompt size. I.e. this includes the number of tokens in the cached content.

JSON representation
{
  "totalTokens": integer
}

GenerateContentRequest

Request to generate a completion from the model.

JSON representation
{
  "model": string,
  "contents": [
    {
      object (Content)
    }
  ],
  "tools": [
    {
      object (Tool)
    }
  ],
  "toolConfig": {
    object (ToolConfig)
  },
  "safetySettings": [
    {
      object (SafetySetting)
    }
  ],
  "systemInstruction": {
    object (Content)
  },
  "generationConfig": {
    object (GenerationConfig)
  },
  "cachedContent": string
}
Fields
model string

Required. The name of the Model to use for generating the completion.

Format: name=models/{model}.

contents[] object (Content)

Required. The content of the current conversation with the model.

For single-turn queries, this is a single instance. For multi-turn queries, this is a repeated field that contains conversation history + latest request.

tools[] object (Tool)

Optional. A list of Tools the model may use to generate the next response.

A Tool is a piece of code that enables the system to interact with external systems to perform an action, or set of actions, outside of knowledge and scope of the model. The only supported tool is currently Function.

toolConfig object (ToolConfig)

Optional. Tool configuration for any Tool specified in the request.

safetySettings[] object (SafetySetting)

Optional. A list of unique SafetySetting instances for blocking unsafe content.

This will be enforced on the GenerateContentRequest.contents and GenerateContentResponse.candidates. There should not be more than one setting for each SafetyCategory type. The API will block any contents and responses that fail to meet the thresholds set by these settings. This list overrides the default settings for each SafetyCategory specified in the safetySettings. If there is no SafetySetting for a given SafetyCategory provided in the list, the API will use the default safety setting for that category. Harm categories HARM_CATEGORY_HATE_SPEECH, HARM_CATEGORY_SEXUALLY_EXPLICIT, HARM_CATEGORY_DANGEROUS_CONTENT, HARM_CATEGORY_HARASSMENT are supported.

systemInstruction object (Content)

Optional. Developer set system instruction. Currently, text only.

generationConfig object (GenerationConfig)

Optional. Configuration options for model generation and outputs.

cachedContent string

Optional. The name of the cached content used as context to serve the prediction. Note: only used in explicit caching, where users can have control over caching (e.g. what content to cache) and enjoy guaranteed cost savings. Format: cachedContents/{cachedContent}