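This is our most cost-efficient multimodal model, offering the fastest performance for high-frequency, lightweight tasks. Gemini 3.1 Flash-Lite is best suited to high-volume agentic tasks, simple data extraction, and extremely low-latency applications where budget and speed are the primary constraints.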
gemini-3.1-flash-lite-preview
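| Property | Description |
|---|---|
| Model code | `gemini-3.1-flash-lite-preview` |
| Supported data types | Inputs: text, image, video, audio, and PDF. Output: text |
| Token limits[*] | Input token limit: 1,048,576. Output token limit: 65,536 |
| Capabilities | Audio generation: not supported; Batch API: supported; Caching: supported; Code execution: supported; Computer use: not supported; File search: supported; Function calling: supported; Grounding with Google Maps: not supported; Image generation: not supported; Live API: not supported; Grounding with Google Search: supported; Structured outputs: supported; Thinking: supported; URL context: supported |
| Versions | |
| Latest update | March 2026 |
| Knowledge cutoff | January 2025 |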
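As a rough illustration of how the token limits in the table might be applied before sending a request, the sketch below checks a planned request against both limits. It assumes the two limits are enforced independently and that you already have token counts from elsewhere (for example, from the SDK's token-counting call); `check_budget` is a hypothetical helper, not part of any SDK.

```python
# Limits copied from the model table.
INPUT_TOKEN_LIMIT = 1_048_576
OUTPUT_TOKEN_LIMIT = 65_536


def check_budget(input_tokens: int, max_output_tokens: int) -> list[str]:
    """Return a list of problems; an empty list means the request fits.

    Hypothetical helper: assumes the input and output limits are
    checked independently of each other.
    """
    problems = []
    if input_tokens > INPUT_TOKEN_LIMIT:
        problems.append(
            f"input is {input_tokens - INPUT_TOKEN_LIMIT} tokens over the limit"
        )
    if max_output_tokens > OUTPUT_TOKEN_LIMIT:
        problems.append(
            f"requested output is {max_output_tokens - OUTPUT_TOKEN_LIMIT} tokens over the limit"
        )
    return problems


# A 900,000-token prompt with an 8,192-token output cap fits.
print(check_budget(900_000, 8_192))
# A prompt that is too long is reported rather than sent.
print(check_budget(1_100_000, 8_192))
```

Returning a list of problems rather than raising lets a batch pipeline log oversized items and keep going.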
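Developer guide
Gemini 3.1 Flash-Lite is best at handling simple tasks at scale. The following use cases suit it best:
Translation: Translate large volumes of content quickly and cheaply, such as chat messages, reviews, and support tickets at scale. You can use a system instruction to restrict the output to the translated text only, with no extra commentary: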
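```python
from google import genai

# Assumed setup: the client reads the API key from the environment.
client = genai.Client()

text = "Hey, are you down to grab some pizza later? I'm starving!"

response = client.models.generate_content(
    model="gemini-3.1-flash-lite-preview",
    config={
        # Keep the reply to the translation itself.
        "system_instruction": "Only output the translated text"
    },
    contents=f"Translate the following text to German: {text}"
)

print(response.text)
```

Transcription: Process recordings, voice memos, or any audio content and get a text transcript without spinning up a separate speech-to-text pipeline. The model accepts multimodal input, so you can pass audio files directly for transcription: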
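```python
from google import genai

# Assumed setup: the client reads the API key from the environment.
client = genai.Client()

# URL = "https://storage.googleapis.com/generativeai-downloads/data/State_of_the_Union_Address_30_January_1961.mp3"

# Upload the audio file to the GenAI File API
uploaded_file = client.files.upload(file='sample.mp3')

prompt = 'Generate a transcript of the audio.'

response = client.models.generate_content(
    model="gemini-3.1-flash-lite-preview",
    contents=[prompt, uploaded_file]
)

print(response.text)
```

Lightweight agentic tasks and data extraction: The model supports entity extraction, classification, and lightweight data processing pipelines with structured JSON output. For example, extracting structured data from e-commerce customer reviews: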
```python
from google import genai
from pydantic import BaseModel, Field

# Assumed setup: the client reads the API key from the environment.
client = genai.Client()

prompt = "Analyze the user review and determine the aspect, sentiment score, summary quote, and return risk"
input_text = "The boots look amazing and the leather is high quality, but they run way too small. I'm sending them back."


# Schema the model's JSON output must follow.
class ReviewAnalysis(BaseModel):
    aspect: str = Field(description="The feature mentioned (e.g., Price, Comfort, Style, Shipping)")
    summary_quote: str = Field(description="The specific phrase from the review about this aspect")
    sentiment_score: int = Field(description="1 to 5 (1=worst, 5=best)")
    is_return_risk: bool = Field(description="True if the user mentions returning the item")


response = client.models.generate_content(
    model="gemini-3.1-flash-lite-preview",
    contents=[prompt, input_text],
    config={
        "response_mime_type": "application/json",
        "response_json_schema": ReviewAnalysis.model_json_schema(),
    },
)

print(response.text)
```

Document processing and summarization: Parse PDFs and return concise summaries, for example when building a document processing pipeline or quickly triaging incoming files:
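```python
import httpx
from google import genai
from google.genai import types

# Assumed setup: the client reads the API key from the environment.
client = genai.Client()

# Download a sample PDF document
doc_url = "https://storage.googleapis.com/generativeai-downloads/data/med_gemini.pdf"
doc_data = httpx.get(doc_url).content

prompt = "Summarize this document"

response = client.models.generate_content(
    model="gemini-3.1-flash-lite-preview",
    contents=[
        # Pass the PDF inline as raw bytes.
        types.Part.from_bytes(
            data=doc_data,
            mime_type='application/pdf',
        ),
        prompt
    ]
)

print(response.text)
```

Model routing: Use a low-latency, low-cost model as a classifier that routes each query to the appropriate model based on task complexity. This is a real production pattern: the open-source Gemini CLI uses Flash-Lite to classify task complexity and routes the task to Flash or Pro accordingly.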
```python
from google import genai

# Assumed setup: the client reads the API key from the environment.
client = genai.Client()

# Labels the classifier is allowed to return.
FLASH_MODEL = 'flash'
PRO_MODEL = 'pro'

CLASSIFIER_SYSTEM_PROMPT = f"""
You are a specialized Task Routing AI. Your sole function is to analyze the user's request and classify its complexity. Choose between `{FLASH_MODEL}` (SIMPLE) or `{PRO_MODEL}` (COMPLEX).
1. `{FLASH_MODEL}`: A fast, efficient model for simple, well-defined tasks.
2. `{PRO_MODEL}`: A powerful, advanced model for complex, open-ended, or multi-step tasks.

A task is COMPLEX if it meets ONE OR MORE of the following criteria:
1. High Operational Complexity (Est. 4+ Steps/Tool Calls)
2. Strategic Planning and Conceptual Design
3. High Ambiguity or Large Scope
4. Deep Debugging and Root Cause Analysis

A task is SIMPLE if it is highly specific, bounded, and has Low Operational Complexity (Est. 1-3 tool calls).
"""

user_input = "I'm getting an error 'Cannot read property 'map' of undefined' when I click the save button. Can you fix it?"

# JSON schema that constrains the classifier's answer.
response_schema = {
    "type": "object",
    "properties": {
        "reasoning": {
            "type": "string",
            "description": "A brief, step-by-step explanation for the model choice, referencing the rubric."
        },
        "model_choice": {
            "type": "string",
            "enum": [FLASH_MODEL, PRO_MODEL]
        }
    },
    "required": ["reasoning", "model_choice"]
}

response = client.models.generate_content(
    model="gemini-3.1-flash-lite-preview",
    contents=user_input,
    config={
        "system_instruction": CLASSIFIER_SYSTEM_PROMPT,
        "response_mime_type": "application/json",
        "response_json_schema": response_schema
    },
)

print(response.text)
```

Thinking: To improve accuracy on tasks that need step-by-step reasoning, configure thinking so the model spends extra compute on internal reasoning before producing its final output:
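```python
from google import genai
from google.genai import types

# Assumed setup: the client reads the API key from the environment.
client = genai.Client()

response = client.models.generate_content(
    model="gemini-3.1-flash-lite-preview",
    contents="How does AI work?",
    config=types.GenerateContentConfig(
        # Ask for more internal reasoning before the final answer.
        thinking_config=types.ThinkingConfig(thinking_level="high")
    ),
)

print(response.text)
```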
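Returning to the model routing pattern described earlier: once the classifier has answered, the application still has to turn its `model_choice` label into a concrete model to call. The sketch below shows one way that step might look, using only the standard library. The `MODEL_IDS` values are placeholders and `pick_model` is a hypothetical helper; neither comes from the Gemini CLI or the SDK.

```python
import json

# Hypothetical mapping from classifier labels to concrete model IDs.
# The values are placeholders: substitute the Flash and Pro model codes you use.
MODEL_IDS = {
    "flash": "your-flash-model-id",
    "pro": "your-pro-model-id",
}


def pick_model(classifier_json: str, default: str = "flash") -> str:
    """Map the classifier's JSON reply to a model ID.

    Falls back to `default` when the reply is not valid JSON, is not an
    object, or names a label that is not in MODEL_IDS.
    """
    try:
        data = json.loads(classifier_json)
    except (json.JSONDecodeError, TypeError):
        data = None
    choice = data.get("model_choice") if isinstance(data, dict) else None
    if not isinstance(choice, str) or choice not in MODEL_IDS:
        choice = default
    return MODEL_IDS[choice]


# Hand-written reply in the shape of the routing schema, not real model output.
sample = '{"reasoning": "Needs root cause analysis.", "model_choice": "pro"}'
print(pick_model(sample))
```

Falling back to the cheaper label on a malformed reply keeps a classifier failure from blocking the request; whether that default is right depends on how costly a wrong routing decision is for your workload.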