Gemini 3.1 Flash-Lite Preview

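This is our most cost-efficient multimodal model, offering the fastest performance for high-frequency, lightweight tasks. Gemini 3.1 Flash-Lite is best suited to high-volume agentic tasks, simple data extraction, and extremely low-latency applications where budget and speed are the primary constraints.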


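Property               Description
Model code             gemini-3.1-flash-lite-preview
Supported data types   Inputs: text, image, video, audio, and PDF
                       Output: text
Token limits[*]        Input token limit: 1,048,576
                       Output token limit: 65,536
Capabilities           Audio generation: Not supported
                       Batch API: Supported
                       Caching: Supported
                       Code execution: Supported
                       Computer use: Not supported
                       File search: Supported
                       Function calling: Supported
                       Grounding with Google Maps: Not supported
                       Image generation: Not supported
                       Live API: Not supported
                       Grounding with Google Search: Supported
                       Structured outputs: Supported
                       Thinking: Supported
                       URL context: Supported
Versions               See the model version patterns for more details.
                         • Preview: gemini-3.1-flash-lite-preview
Latest update          March 2026
Knowledge cutoff       January 2025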
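
The token limits listed in the model card can be checked before a request is sent. The following is a minimal sketch, not part of the original guide: `check_token_budget` is a hypothetical helper, and it assumes the input and output limits apply independently. The input token count would come from elsewhere, for example the SDK's `client.models.count_tokens` call.

```python
# Limits copied from the model card for gemini-3.1-flash-lite-preview.
INPUT_TOKEN_LIMIT = 1_048_576
OUTPUT_TOKEN_LIMIT = 65_536


def check_token_budget(input_tokens: int, max_output_tokens: int) -> list[str]:
    """Hypothetical helper: return a list of problems, empty if the request fits.

    Assumption: the input and output limits are enforced independently.
    """
    problems = []
    if input_tokens > INPUT_TOKEN_LIMIT:
        problems.append(
            f"input has {input_tokens} tokens, limit is {INPUT_TOKEN_LIMIT}"
        )
    if max_output_tokens > OUTPUT_TOKEN_LIMIT:
        problems.append(
            f"requested {max_output_tokens} output tokens, limit is {OUTPUT_TOKEN_LIMIT}"
        )
    return problems


# A request that fits produces no problems; an oversized one reports each violation.
print(check_token_budget(input_tokens=12_000, max_output_tokens=2_048))
print(check_token_budget(input_tokens=2_000_000, max_output_tokens=70_000))
```

Returning a list of messages rather than raising on the first violation lets a caller log every problem at once, which can be convenient in batch pipelines.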

Developer guide

Gemini 3.1 Flash-Lite is best at handling simple tasks at scale. The following use cases suit it particularly well:

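  • Translation: Translate large volumes of content quickly and affordably, such as chat messages, reviews, and support tickets processed at scale. You can use a system instruction to restrict the output to the translated text only, with no extra commentary: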

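    from google import genai

    # Assumes the google-genai SDK is installed and GEMINI_API_KEY is set in the environment.
    client = genai.Client()

    text = "Hey, are you down to grab some pizza later? I'm starving!"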
    
    response = client.models.generate_content(
        model="gemini-3.1-flash-lite-preview",
        config={
            "system_instruction": "Only output the translated text"
        },
        contents=f"Translate the following text to German: {text}"
    )
    
    print(response.text)
    
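  • Transcription: Process recordings, voice notes, or any other audio content and get a text transcript without standing up a separate speech-to-text pipeline. Multimodal input is supported, so you can pass an audio file directly for transcription: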

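    from google import genai

    # Assumes the google-genai SDK is installed and GEMINI_API_KEY is set in the environment.
    client = genai.Client()

    # Example audio you could save locally as sample.mp3:
    # https://storage.googleapis.com/generativeai-downloads/data/State_of_the_Union_Address_30_January_1961.mp3

    # Upload the audio file to the GenAI File API
    uploaded_file = client.files.upload(file='sample.mp3')

    prompt = 'Generate a transcript of the audio.'

    # Pass the prompt and the uploaded file together as multimodal input
    response = client.models.generate_content(
        model="gemini-3.1-flash-lite-preview",
        contents=[prompt, uploaded_file]
    )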
    
    print(response.text)
    
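  • Lightweight agentic tasks and data extraction: Power entity extraction, classification, and lightweight data processing pipelines with structured JSON output. For example, extract structured data from an e-commerce customer review: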

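    from google import genai
    from pydantic import BaseModel, Field

    # Assumes the google-genai SDK is installed and GEMINI_API_KEY is set in the environment.
    client = genai.Client()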
    
    prompt = "Analyze the user review and determine the aspect, sentiment score, summary quote, and return risk"
    input_text = "The boots look amazing and the leather is high quality, but they run way too small. I'm sending them back."
    
    class ReviewAnalysis(BaseModel):
        aspect: str = Field(description="The feature mentioned (e.g., Price, Comfort, Style, Shipping)")
        summary_quote: str = Field(description="The specific phrase from the review about this aspect")
        sentiment_score: int = Field(description="1 to 5 (1=worst, 5=best)")
        is_return_risk: bool = Field(description="True if the user mentions returning the item")
    
    response = client.models.generate_content(
        model="gemini-3.1-flash-lite-preview",
        contents=[prompt, input_text],
        config={
            "response_mime_type": "application/json",
            "response_json_schema": ReviewAnalysis.model_json_schema(),
        },
    )
    
    print(response.text)
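
The JSON schema constrains the shape of the reply, but the 1 to 5 range for `sentiment_score` appears only in the field description. The following is a minimal sketch of a post-check using only the standard library; `parse_review_analysis` is a hypothetical helper, and the sample string is hand-written for illustration rather than real model output.

```python
import json

# Field names and types mirror the ReviewAnalysis schema in the example above.
REQUIRED_FIELDS = {
    "aspect": str,
    "summary_quote": str,
    "sentiment_score": int,
    "is_return_risk": bool,
}


def parse_review_analysis(raw_json: str) -> dict:
    """Hypothetical helper: parse the reply and enforce types and the 1-5 range."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        if not isinstance(data[name], expected_type):
            raise ValueError(f"field {name} should be {expected_type.__name__}")
    score = data["sentiment_score"]
    # bool is a subclass of int in Python, so reject True/False as a score.
    if isinstance(score, bool) or not 1 <= score <= 5:
        raise ValueError("sentiment_score must be an integer from 1 to 5")
    return data


# Hand-written sample standing in for response.text (not real model output).
sample = (
    '{"aspect": "Fit", "summary_quote": "they run way too small", '
    '"sentiment_score": 2, "is_return_risk": true}'
)
result = parse_review_analysis(sample)
print(result["aspect"], result["sentiment_score"], result["is_return_risk"])
```

In a pipeline, a `ValueError` here could trigger a retry or send the review to manual handling; which of those is appropriate depends on the application.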
    
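  • Document processing and summarization: Parse PDFs and return concise summaries, for example when building a document processing pipeline or quickly triaging incoming files: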

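    import httpx
    from google import genai
    from google.genai import types

    # Assumes the google-genai SDK is installed and GEMINI_API_KEY is set in the environment.
    client = genai.Client()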
    
    # Download a sample PDF document
    doc_url = "https://storage.googleapis.com/generativeai-downloads/data/med_gemini.pdf"
    doc_data = httpx.get(doc_url).content
    
    prompt = "Summarize this document"
    response = client.models.generate_content(
        model="gemini-3.1-flash-lite-preview",
        contents=[
            types.Part.from_bytes(
                data=doc_data,
                mime_type='application/pdf',
            ),
            prompt
        ]
    )
    
    print(response.text)
    
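  • Model routing: Use a low-latency, low-cost model as a classifier that routes each query to the appropriate model based on task complexity. This is a real production pattern: the open-source Gemini CLI uses Flash-Lite to classify task complexity and routes the task to Flash or Pro accordingly.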

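    from google import genai

    # Assumes the google-genai SDK is installed and GEMINI_API_KEY is set in the environment.
    client = genai.Client()

    # Labels the classifier may return; they are routing labels, not model codes
    FLASH_MODEL = 'flash'
    PRO_MODEL = 'pro'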
    
    CLASSIFIER_SYSTEM_PROMPT = f"""
    You are a specialized Task Routing AI. Your sole function is to analyze the user's request and classify its complexity. Choose between `{FLASH_MODEL}` (SIMPLE) or `{PRO_MODEL}` (COMPLEX).
    1.  `{FLASH_MODEL}`: A fast, efficient model for simple, well-defined tasks.
    2.  `{PRO_MODEL}`: A powerful, advanced model for complex, open-ended, or multi-step tasks.
    
    A task is COMPLEX if it meets ONE OR MORE of the following criteria:
    1.  High Operational Complexity (Est. 4+ Steps/Tool Calls)
    2.  Strategic Planning and Conceptual Design
    3.  High Ambiguity or Large Scope
    4.  Deep Debugging and Root Cause Analysis
    
    A task is SIMPLE if it is highly specific, bounded, and has Low Operational Complexity (Est. 1-3 tool calls).
    """
    
    user_input = "I'm getting an error 'Cannot read property 'map' of undefined' when I click the save button. Can you fix it?"
    
    response_schema = {
      "type": "object",
      "properties": {
        "reasoning": {
          "type": "string",
          "description": "A brief, step-by-step explanation for the model choice, referencing the rubric."
        },
        "model_choice": {
          "type": "string",
          "enum": [FLASH_MODEL, PRO_MODEL]
        }
      },
      "required": ["reasoning", "model_choice"]
    }
    
    response = client.models.generate_content(
        model="gemini-3.1-flash-lite-preview",
        contents=user_input,
        config={
            "system_instruction": CLASSIFIER_SYSTEM_PROMPT,
            "response_mime_type": "application/json",
            "response_json_schema": response_schema
        },
    )
    
    print(response.text)
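
The classifier returns only a routing label, so the caller still has to turn that label into an actual model code. The following is a minimal sketch of that step, not the Gemini CLI's implementation: `resolve_route` and `MODEL_IDS` are hypothetical, and the model codes are placeholders to replace with the ones you deploy.

```python
import json

FLASH_MODEL = "flash"
PRO_MODEL = "pro"

# Hypothetical placeholders: replace with the model codes you actually use.
MODEL_IDS = {
    FLASH_MODEL: "your-flash-model-code",
    PRO_MODEL: "your-pro-model-code",
}


def resolve_route(classifier_json: str, default: str = PRO_MODEL) -> str:
    """Map the classifier's JSON reply to a model code.

    Falls back to `default` when the reply is not valid JSON, is not an
    object, or carries an unknown label.
    """
    try:
        choice = json.loads(classifier_json).get("model_choice")
    except (json.JSONDecodeError, AttributeError):
        choice = None
    if not isinstance(choice, str):
        choice = None
    return MODEL_IDS.get(choice, MODEL_IDS[default])


# Hand-written sample standing in for response.text (not real model output).
sample = '{"reasoning": "Single bounded fix.", "model_choice": "flash"}'
print(resolve_route(sample))
print(resolve_route("not json"))
```

Defaulting to the larger model on a malformed reply is one possible design choice: it costs more per request but avoids sending a potentially complex task to the smaller model. Defaulting to the cheaper model instead is equally valid when cost matters more.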
    
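  • Thinking: To improve accuracy on tasks that benefit from step-by-step reasoning, configure thinking so the model spends additional compute on internal reasoning before producing its final output: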

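    from google import genai
    from google.genai import types

    # Assumes the google-genai SDK is installed and GEMINI_API_KEY is set in the environment.
    client = genai.Client()

    response = client.models.generate_content(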
        model="gemini-3.1-flash-lite-preview",
        contents="How does AI work?",
        config=types.GenerateContentConfig(
            thinking_config=types.ThinkingConfig(thinking_level="high")
        ),
    )
    
    print(response.text)