The Interactions API is now generally available. We recommend using this API to access the latest features and models.

Understand and count tokens

Gemini and other generative AI models process input and output at a granularity called a token.

For Gemini models, a token is equivalent to about 4 characters. 100 tokens are equal to about 60 to 80 English words.
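The rule of thumb above can be turned into a quick offline estimate. The sketch below is only an approximation that assumes about 4 characters per token; `estimate_tokens` is a hypothetical helper, not part of the SDK, and the authoritative count still comes from the API.

```python
import math

CHARS_PER_TOKEN = 4  # rule of thumb from the text above, not an exact ratio


def estimate_tokens(text: str) -> int:
    """Roughly estimate the token count of `text` from its character length."""
    if not text:
        return 0
    return math.ceil(len(text) / CHARS_PER_TOKEN)


prompt = "The quick brown fox jumps over the lazy dog."
print("estimated tokens:", estimate_tokens(prompt))
```

An estimate like this is useful for rough budgeting only. Real token counts depend on the model's vocabulary and on the language of the text.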

About tokens

Tokens can be single characters like z or whole words like cat. Long words are broken up into several tokens. The set of all tokens used by the model is called the vocabulary, and the process of splitting text into tokens is called tokenization.

When billing is enabled, the cost of a call to the Gemini API is determined in part by the number of input and output tokens, so knowing how to count tokens can be helpful.
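To make the link between token counts and cost concrete, here is a minimal sketch of the token-based part of a bill. `estimate_cost` is a hypothetical helper, and the per-million-token prices passed to it are placeholders for illustration, not real Gemini rates; real prices are listed on the pricing page.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Return the token-based cost of one call, in the currency of the prices."""
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost


# Placeholder prices for illustration only (assumption, not real rates).
cost = estimate_cost(
    input_tokens=1_000,
    output_tokens=500,
    input_price_per_million=1.00,
    output_price_per_million=2.00,
)
print(f"Estimated cost: {cost:.6f}")
```

Because tokens determine the cost only in part, treat the result as a lower-level building block rather than a full bill.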

Count tokens

All input to and output from the Gemini API is tokenized, including text, image files, and other non-text modalities.

You can count tokens in the following ways:

Call count_tokens with the request input. It returns only the total number of tokens in the input. Make this call before sending the input to check the size of your requests.
Use the usage attribute on the interaction response. It returns token counts for the input (total_input_tokens), the output (total_output_tokens), thinking (total_thought_tokens), cached content (total_cached_tokens), tool use (total_tool_use_tokens), and the total (total_tokens).
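The usage fields listed above can be gathered into one dictionary for logging. The sketch below runs offline: it uses a `SimpleNamespace` with made-up numbers as a stand-in for `interaction.usage`, and `usage_breakdown` is a hypothetical helper rather than an SDK function. It assumes a counter may be absent or `None` and reports it as 0 in that case.

```python
from types import SimpleNamespace

# Field names taken from the list above.
USAGE_FIELDS = (
    "total_input_tokens",
    "total_output_tokens",
    "total_thought_tokens",
    "total_cached_tokens",
    "total_tool_use_tokens",
    "total_tokens",
)


def usage_breakdown(usage) -> dict:
    """Collect the usage counters into a dict, treating missing or None values as 0."""
    return {name: getattr(usage, name, None) or 0 for name in USAGE_FIELDS}


# Stand-in for `interaction.usage`; the numbers are made up for illustration.
fake_usage = SimpleNamespace(
    total_input_tokens=12,
    total_output_tokens=30,
    total_tokens=42,
)
for name, value in usage_breakdown(fake_usage).items():
    print(f"{name}: {value}")
```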

Count text tokens

Python

# This will only work for SDK newer than 2.0.0
from google import genai

client = genai.Client()
prompt = "The quick brown fox jumps over the lazy dog."

# Count tokens before sending
total_tokens = client.models.count_tokens(
    model="gemini-3.5-flash",
    contents=prompt
)
print("total_tokens:", total_tokens.total_tokens)

# Get usage from interaction
interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input=prompt
)
print(interaction.usage)

JavaScript

// This will only work for SDK newer than 2.0.0
import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});
const prompt = "The quick brown fox jumps over the lazy dog.";

// Count tokens before sending
const countResponse = await client.models.countTokens({
    model: "gemini-3.5-flash",
    contents: prompt,
});
console.log(countResponse.totalTokens);

// Get usage from interaction
const interaction = await client.interactions.create({
    model: "gemini-3.5-flash",
    input: prompt,
});
console.log(interaction.usage);

REST

# Count the tokens in a text prompt with the countTokens REST method
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-3.5-flash:countTokens" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"contents": [{"parts": [{"text": "The quick brown fox."}]}]}'

Count multi-turn tokens

Count tokens across the conversation history using previous_interaction_id:

Python

# This will only work for SDK newer than 2.0.0
# First interaction
interaction1 = client.interactions.create(
    model="gemini-3.5-flash",
    input="Hi, my name is Bob"
)

# Second interaction continues the conversation
interaction2 = client.interactions.create(
    model="gemini-3.5-flash",
    input="What's my name?",
    previous_interaction_id=interaction1.id
)

# Usage includes tokens from both turns
print(f"Input tokens: {interaction2.usage.total_input_tokens}")
print(f"Output tokens: {interaction2.usage.total_output_tokens}")
print(f"Total tokens: {interaction2.usage.total_tokens}")

JavaScript

// This will only work for SDK newer than 2.0.0
// First interaction
const interaction1 = await client.interactions.create({
    model: "gemini-3.5-flash",
    input: "Hi, my name is Bob"
});

// Second interaction continues the conversation
const interaction2 = await client.interactions.create({
    model: "gemini-3.5-flash",
    input: "What's my name?",
    previous_interaction_id: interaction1.id
});

console.log(`Input tokens: ${interaction2.usage.total_input_tokens}`);
console.log(`Output tokens: ${interaction2.usage.total_output_tokens}`);

Count multimodal tokens

All input to the Gemini API is tokenized, including images, video, and audio. Key points about tokenization:

Images: Images with both dimensions ≤384 pixels count as 258 tokens. Larger images are tiled into 768x768 pixel tiles, each counting as 258 tokens.
Video: 263 tokens per second
Audio: 32 tokens per second
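The rates above allow a rough offline estimate before any file is uploaded. The helpers below are hypothetical and not part of the SDK. The image estimate assumes a plain grid of 768x768 tiles with partial tiles rounded up; the service may crop or scale images differently, so use count_tokens for the real number.

```python
import math

TOKENS_PER_TILE = 258          # tokens per small image or per tile
SMALL_IMAGE_MAX_PX = 384       # both dimensions at or below this: one unit
TILE_SIZE_PX = 768             # tile edge length for larger images
VIDEO_TOKENS_PER_SECOND = 263
AUDIO_TOKENS_PER_SECOND = 32


def estimate_image_tokens(width: int, height: int) -> int:
    """Estimate image tokens; assumes a simple grid of 768x768 tiles."""
    if width <= SMALL_IMAGE_MAX_PX and height <= SMALL_IMAGE_MAX_PX:
        return TOKENS_PER_TILE
    tiles = math.ceil(width / TILE_SIZE_PX) * math.ceil(height / TILE_SIZE_PX)
    return tiles * TOKENS_PER_TILE


def estimate_video_tokens(seconds: float) -> int:
    """Estimate video tokens from the duration in seconds."""
    return math.ceil(seconds * VIDEO_TOKENS_PER_SECOND)


def estimate_audio_tokens(seconds: float) -> int:
    """Estimate audio tokens from the duration in seconds."""
    return math.ceil(seconds * AUDIO_TOKENS_PER_SECOND)


print("small image:", estimate_image_tokens(300, 300))
print("1536x768 image:", estimate_image_tokens(1536, 768))
print("60 s video:", estimate_video_tokens(60))
print("60 s audio:", estimate_audio_tokens(60))
```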

Image tokens

Python

# This will only work for SDK newer than 2.0.0
uploaded_file = client.files.upload(file="path/to/image.jpg")

# Count tokens for image + text
total_tokens = client.models.count_tokens(
    model="gemini-3.5-flash",
    contents=["Tell me about this image", uploaded_file]
)
print(f"Total tokens: {total_tokens.total_tokens}")

# Generate with image
interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input=[
        {"type": "text", "text": "Tell me about this image"},
        {"type": "image", "uri": uploaded_file.uri, "mime_type": uploaded_file.mime_type}
    ]
)
print(interaction.usage)

JavaScript

// This will only work for SDK newer than 2.0.0
const uploadedFile = await client.files.upload({
    file: "path/to/image.jpg",
    config: { mimeType: "image/jpeg" }
});

// Count tokens
const countResponse = await client.models.countTokens({
    model: "gemini-3.5-flash",
    contents: [
        { text: "Tell me about this image" },
        { fileData: { fileUri: uploadedFile.uri, mimeType: uploadedFile.mimeType } }
    ]
});
console.log(countResponse.totalTokens);

Inline data example:

Python

# This will only work for SDK newer than 2.0.0
import base64

with open('image.jpg', 'rb') as f:
    image_bytes = f.read()

interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input=[
        {"type": "text", "text": "Describe this image"},
        {
            "type": "image",
            "data": base64.b64encode(image_bytes).decode('utf-8'),
            "mime_type": "image/jpeg"
        }
    ]
)
print(interaction.usage)

Video tokens

Python

# This will only work for SDK newer than 2.0.0
import time

video_file = client.files.upload(file="path/to/video.mp4")

while not video_file.state or video_file.state.name != "ACTIVE":
    print("Processing video...")
    time.sleep(5)
    video_file = client.files.get(name=video_file.name)

# A 60-second video is approximately 263 * 60 = 15,780 tokens
total_tokens = client.models.count_tokens(
    model="gemini-3.5-flash",
    contents=["Summarize this video", video_file]
)
print(f"Total tokens: {total_tokens.total_tokens}")

# Generate with video
interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input=[
        {"type": "text", "text": "Summarize this video"},
        {"type": "video", "uri": video_file.uri, "mime_type": video_file.mime_type}
    ]
)
print(interaction.usage)

Audio tokens

Python

# This will only work for SDK newer than 2.0.0
audio_file = client.files.upload(file="path/to/audio.mp3")

# A 60-second audio clip is approximately 32 * 60 = 1,920 tokens
total_tokens = client.models.count_tokens(
    model="gemini-3.5-flash",
    contents=["Transcribe this audio", audio_file]
)
print(f"Total tokens: {total_tokens.total_tokens}")

# Generate with audio
interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input=[
        {"type": "text", "text": "Transcribe this audio"},
        {"type": "audio", "uri": audio_file.uri, "mime_type": audio_file.mime_type}
    ]
)
print(interaction.usage)

Count system instruction tokens

System instructions are counted as part of the input tokens:

Python

# This will only work for SDK newer than 2.0.0
interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input="Hello!",
    system_instruction="You are a helpful assistant who speaks like a pirate."
)

# system_instruction tokens included in total_input_tokens
print(f"Input tokens: {interaction.usage.total_input_tokens}")

Count tool tokens

Tools (functions, code execution, Google Search) are also counted:

Python

# This will only work for SDK newer than 2.0.0
tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"}
            }
        }
    }
]

interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input="What's the weather in Tokyo?",
    tools=tools
)

print(f"Input tokens: {interaction.usage.total_input_tokens}")
print(f"Tool use tokens: {interaction.usage.total_tool_use_tokens}")

Context windows

Each Gemini model has a maximum number of tokens it can handle. The context window defines the combined limit of input and output tokens.
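Since the limit is combined, a request fits only if the input tokens plus the output budget stay within the window. The check below is a minimal sketch: `fits_context_window` is a hypothetical helper, and the window size used in the example is a placeholder, not the limit of any particular model.

```python
def fits_context_window(
    input_tokens: int,
    max_output_tokens: int,
    context_window: int,
) -> bool:
    """Return True if input plus reserved output stays within the context window."""
    return input_tokens + max_output_tokens <= context_window


# Placeholder window size for illustration only (assumption).
EXAMPLE_CONTEXT_WINDOW = 100_000

print(fits_context_window(90_000, 8_000, EXAMPLE_CONTEXT_WINDOW))
print(fits_context_window(95_000, 8_000, EXAMPLE_CONTEXT_WINDOW))
```

In practice the input count would come from count_tokens and the limits from the model metadata.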

Get context window size programmatically

Python

# This will only work for SDK newer than 2.0.0
model_info = client.models.get(model="gemini-3.5-flash")
print(f"Input token limit: {model_info.input_token_limit}")
print(f"Output token limit: {model_info.output_token_limit}")

JavaScript

// This will only work for SDK newer than 2.0.0
const modelInfo = await client.models.get({ model: "gemini-3.5-flash" });
console.log(`Input token limit: ${modelInfo.inputTokenLimit}`);
console.log(`Output token limit: ${modelInfo.outputTokenLimit}`);

Find the context window sizes on the models page.

What's next

Text generation: Generation basics
Caching: Reduce costs with caching
Pricing: Understand costs