

Gemini 3.1 Flash-Lite

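Gemini 3.1 Flash-Lite is a low-latency, cost-efficient multimodal model optimized for high-frequency, lightweight tasks. It accepts text, image, video, audio, and PDF input, and is designed for high-volume agentic workflows, simple data extraction, and applications where latency and API cost are the primary constraints.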


gemini-3.1-flash-lite

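Property	Description
Model code	`gemini-3.1-flash-lite`
Supported data types	Inputs: text, image, video, audio, PDF. Output: text.
Token limits	Input token limit: 1,048,576. Output token limit: 65,536.
Capabilities	Audio generation: not supported. Caching: supported. Code execution: supported. Computer use: not supported. File search: supported. Function calling: supported. Grounding with Google Maps: supported. Image generation: not supported. Live API: not supported. Search grounding: supported. Structured outputs: supported. Thinking: supported. URL context: supported.
Consumption options	Batch API: supported. Flex inference: supported. Priority inference: supported.
Versions	See the model version patterns for more details. Stable: `gemini-3.1-flash-lite`
Latest update	May 2026
Knowledge cutoff	January 2025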
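
The token limits in the table can be checked before a request is sent. The following is a minimal sketch, not part of the SDK: `check_token_budget` is a hypothetical helper, and it assumes you already know the input token count (for example from the SDK's `client.models.count_tokens` call) and the maximum output length you plan to request.

```python
# Limits copied from the table above.
INPUT_TOKEN_LIMIT = 1_048_576
OUTPUT_TOKEN_LIMIT = 65_536


def check_token_budget(input_tokens: int, max_output_tokens: int) -> list[str]:
    """Return a list of problems; an empty list means the request fits the limits."""
    problems = []
    if input_tokens > INPUT_TOKEN_LIMIT:
        problems.append(
            f"input has {input_tokens} tokens, limit is {INPUT_TOKEN_LIMIT}"
        )
    if max_output_tokens > OUTPUT_TOKEN_LIMIT:
        problems.append(
            f"requested {max_output_tokens} output tokens, limit is {OUTPUT_TOKEN_LIMIT}"
        )
    return problems


# Hypothetical token counts, chosen only for illustration.
print(check_token_budget(900_000, 8_192))
print(check_token_budget(1_200_000, 70_000))
```

A request that fails the check could be split into smaller chunks or summarized in stages before it is sent.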

Developer guide

Gemini 3.1 Flash-Lite is best suited to handling simple tasks at scale. The following use cases are a good fit for the model.

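Translation: Fast, low-cost, high-volume translation, such as processing chat messages, reviews, and support tickets at scale. You can use a system instruction to restrict the output to the translated text only, with no additional commentary.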

from google import genai

client = genai.Client()
text = "Hey, are you down to grab some pizza later? I'm starving!"

response = client.models.generate_content(
    model="gemini-3.1-flash-lite",
    config={
        "system_instruction": "Only output the translated text"
    },
    contents=f"Translate the following text to German: {text}"
)

print(response.text)
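
For high-volume translation, the same system instruction can be reused across many messages. The sketch below only builds the request arguments and makes no API call. `build_translation_request` and the sample messages are hypothetical names introduced for illustration, and the sketch assumes each resulting dictionary is later unpacked into `client.models.generate_content(**request)`.

```python
def build_translation_request(text: str, target_language: str,
                              model: str = "gemini-3.1-flash-lite") -> dict:
    """Build keyword arguments for one translation call."""
    return {
        "model": model,
        "config": {
            # Same instruction as the single-message example.
            "system_instruction": "Only output the translated text"
        },
        "contents": f"Translate the following text to {target_language}: {text}",
    }


# Hypothetical batch of short messages.
messages = [
    "Where is my order?",
    "The package arrived damaged.",
]

requests = [build_translation_request(m, "German") for m in messages]

for request in requests:
    # With a configured client, each request would be sent as:
    #   response = client.models.generate_content(**request)
    print(request["contents"])
```

Keeping request construction separate from the network call makes the prompts easy to inspect and lets the same dictionaries feed a loop, a thread pool, or a batch job.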

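Transcription: Process recordings, voice notes, or any other audio content that needs a text transcript, without spinning up a separate speech-to-text pipeline. Because the model supports multimodal input, you can pass audio files directly for transcription.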

from google import genai

client = genai.Client()

# Example audio source (save it locally as sample.mp3, or use your own file):
# https://storage.googleapis.com/generativeai-downloads/data/State_of_the_Union_Address_30_January_1961.mp3
# Upload the local audio file to the GenAI File API
uploaded_file = client.files.upload(file='sample.mp3')

prompt = 'Generate a transcript of the audio.'

response = client.models.generate_content(
    model="gemini-3.1-flash-lite",
    contents=[prompt, uploaded_file]
)

print(response.text)

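Lightweight agentic tasks and data extraction: Lightweight data processing pipelines built on entity extraction, classification, and structured JSON output. For example, you can extract structured data from e-commerce customer reviews.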

from google import genai
from pydantic import BaseModel, Field

client = genai.Client()

prompt = "Analyze the user review and determine the aspect, sentiment score, summary quote, and return risk"
input_text = "The boots look amazing and the leather is high quality, but they run way too small. I'm sending them back."

class ReviewAnalysis(BaseModel):
    aspect: str = Field(description="The feature mentioned (e.g., Price, Comfort, Style, Shipping)")
    summary_quote: str = Field(description="The specific phrase from the review about this aspect")
    sentiment_score: int = Field(description="1 to 5 (1=worst, 5=best)")
    is_return_risk: bool = Field(description="True if the user mentions returning the item")

response = client.models.generate_content(
    model="gemini-3.1-flash-lite",
    contents=[prompt, input_text],
    config={
        "response_mime_type": "application/json",
        "response_json_schema": ReviewAnalysis.model_json_schema(),
    },
)

print(response.text)
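
Once the model returns JSON text, a pipeline usually parses and validates it before using it downstream. The following is a minimal sketch using only the standard library. `ParsedReview` and `parse_review` are hypothetical names, and `sample_output` is a hand-written string standing in for `response.text`, not a real model response.

```python
import json
from dataclasses import dataclass


@dataclass
class ParsedReview:
    aspect: str
    summary_quote: str
    sentiment_score: int
    is_return_risk: bool


def parse_review(raw_json: str) -> ParsedReview:
    """Parse the JSON text and check the fields the schema asks for."""
    data = json.loads(raw_json)
    review = ParsedReview(
        aspect=data["aspect"],
        summary_quote=data["summary_quote"],
        sentiment_score=data["sentiment_score"],
        is_return_risk=data["is_return_risk"],
    )
    score = review.sentiment_score
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, int):
        raise ValueError("sentiment_score must be an integer")
    if not 1 <= score <= 5:
        raise ValueError("sentiment_score must be between 1 and 5")
    if not isinstance(review.is_return_risk, bool):
        raise ValueError("is_return_risk must be a boolean")
    return review


# Hand-written stand-in for response.text (illustrative only).
sample_output = (
    '{"aspect": "Sizing", "summary_quote": "they run way too small", '
    '"sentiment_score": 2, "is_return_risk": true}'
)

review = parse_review(sample_output)
print(review.aspect, review.sentiment_score, review.is_return_risk)
```

If you already define the schema with Pydantic, as in the request above, `ReviewAnalysis.model_validate_json(response.text)` does the same parsing and type checking in a single call.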

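Document processing and summarization: Parse PDFs and return concise summaries, for example when building a document processing pipeline or quickly triaging incoming files.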

from google import genai
from google.genai import types
import httpx

client = genai.Client()

# Download a sample PDF document
doc_url = "https://storage.googleapis.com/generativeai-downloads/data/med_gemini.pdf"
doc_data = httpx.get(doc_url).content

prompt = "Summarize this document"
response = client.models.generate_content(
    model="gemini-3.1-flash-lite",
    contents=[
        types.Part.from_bytes(
            data=doc_data,
            mime_type='application/pdf',
        ),
        prompt
    ]
)

print(response.text)

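Model routing: Use a low-latency, low-cost model as a classifier that routes each query to the appropriate model based on task complexity. This pattern is used in practice: the open-source Gemini CLI uses Flash-Lite to classify task complexity and then routes to Flash or Pro accordingly.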

from google import genai

client = genai.Client()

FLASH_MODEL = 'flash'
PRO_MODEL = 'pro'

CLASSIFIER_SYSTEM_PROMPT = f"""
You are a specialized Task Routing AI. Your sole function is to analyze the user's request and classify its complexity. Choose between `{FLASH_MODEL}` (SIMPLE) or `{PRO_MODEL}` (COMPLEX).
1.  `{FLASH_MODEL}`: A fast, efficient model for simple, well-defined tasks.
2.  `{PRO_MODEL}`: A powerful, advanced model for complex, open-ended, or multi-step tasks.

A task is COMPLEX if it meets ONE OR MORE of the following criteria:
1.  High Operational Complexity (Est. 4+ Steps/Tool Calls)
2.  Strategic Planning and Conceptual Design
3.  High Ambiguity or Large Scope
4.  Deep Debugging and Root Cause Analysis

A task is SIMPLE if it is highly specific, bounded, and has Low Operational Complexity (Est. 1-3 tool calls).
"""

user_input = "I'm getting an error 'Cannot read property 'map' of undefined' when I click the save button. Can you fix it?"

response_schema = {
  "type": "object",
  "properties": {
    "reasoning": {
      "type": "string",
      "description": "A brief, step-by-step explanation for the model choice, referencing the rubric."
    },
    "model_choice": {
      "type": "string",
      "enum": [FLASH_MODEL, PRO_MODEL]
    }
  },
  "required": ["reasoning", "model_choice"]
}

response = client.models.generate_content(
    model="gemini-3.1-flash-lite",
    contents=user_input,
    config={
        "system_instruction": CLASSIFIER_SYSTEM_PROMPT,
        "response_mime_type": "application/json",
        "response_json_schema": response_schema
    },
)

print(response.text)
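
The classifier only returns a label, so the calling code still has to map that label to a concrete model. The sketch below shows one way to do this, under stated assumptions: `MODEL_IDS` holds placeholder strings that you would replace with the Flash and Pro model codes you actually use, `pick_model` is a hypothetical helper, and falling back to the cheaper tier on malformed output is a design choice made here, not something the source prescribes.

```python
import json

# Placeholder model IDs: substitute the model codes you actually use.
MODEL_IDS = {
    "flash": "YOUR_FLASH_MODEL_ID",
    "pro": "YOUR_PRO_MODEL_ID",
}


def pick_model(classifier_json: str, default: str = "flash") -> str:
    """Map the classifier's JSON decision to a model ID, with a safe fallback."""
    try:
        choice = json.loads(classifier_json).get("model_choice")
    except (ValueError, AttributeError, TypeError):
        # Not valid JSON, or not a JSON object.
        choice = None
    if not isinstance(choice, str):
        choice = None
    return MODEL_IDS.get(choice, MODEL_IDS[default])


# Hand-written stand-in for response.text (illustrative only).
sample_decision = (
    '{"reasoning": "Root cause analysis is likely required.", '
    '"model_choice": "pro"}'
)

print(pick_model(sample_decision))
print(pick_model("not valid json"))
```

The selected ID can then be passed as the `model` argument of a second `generate_content` call that handles the original user request.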

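Thinking: To improve accuracy on tasks that benefit from step-by-step reasoning, configure thinking so that the model spends additional compute on internal reasoning before it generates the final output.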

from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3.1-flash-lite",
    contents="How does AI work?",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_level="high")
    ),
)

print(response.text)
