Our most cost-efficient multimodal model, delivering the fastest performance for high-frequency, lightweight tasks. Gemini 3.1 Flash-Lite is best suited for high-volume agentic tasks, simple data-extraction jobs, and very low-latency applications where budget and speed are the primary constraints.
gemini-3.1-flash-lite-preview
| Property | Description |
|---|---|
| Model code | gemini-3.1-flash-lite-preview |
| Supported data types | Inputs: text, images, video, audio, and PDF<br>Output: text |
| Token limits[*] | Input token limit: 1,048,576<br>Output token limit: 65,536 |
| Capabilities | Audio generation: Not supported<br>Batch API: Supported<br>Caching: Supported<br>Code execution: Supported<br>Computer use: Not supported<br>File search: Supported<br>Function calling: Supported<br>Grounding with Google Maps: Not supported<br>Image generation: Not supported<br>Live API: Not supported<br>Search grounding: Supported<br>Structured outputs: Supported<br>Thinking: Supported<br>URL context: Supported |
| Versions | |
| Latest update | March 2026 |
| Knowledge cutoff | January 2025 |
Developer guide
Gemini 3.1 Flash-Lite excels at handling simple tasks at scale. Here are some use cases where Gemini 3.1 Flash-Lite is a particularly good fit:
Translation: fast, cost-effective translation at scale, for example processing chat messages, reviews, and support requests in bulk. You can use a system instruction to constrain the output to only the translated text, with no extra commentary:
```python
from google import genai

# Reads the API key from the environment (GEMINI_API_KEY).
client = genai.Client()

text = "Hey, are you down to grab some pizza later? I'm starving!"

response = client.models.generate_content(
    model="gemini-3.1-flash-lite-preview",
    config={"system_instruction": "Only output the translated text"},
    contents=f"Translate the following text to German: {text}",
)
print(response.text)
```

Transcription: process recordings, voice memos, or any audio content that needs a text transcript, without spinning up a separate speech-to-text pipeline. Multimodal input is supported, so you can pass an audio file directly for transcription:
```python
from google import genai

client = genai.Client()

# URL = "https://storage.googleapis.com/generativeai-downloads/data/State_of_the_Union_Address_30_January_1961.mp3"

# Upload the audio file to the GenAI File API
uploaded_file = client.files.upload(file="sample.mp3")

prompt = "Generate a transcript of the audio."

response = client.models.generate_content(
    model="gemini-3.1-flash-lite-preview",
    contents=[prompt, uploaded_file],
)
print(response.text)
```

Lightweight agentic tasks and data extraction: supports entity extraction, classification, and lightweight data-processing pipelines, with structured JSON output. For example, extracting structured data from e-commerce customer reviews:
```python
from google import genai
from pydantic import BaseModel, Field

client = genai.Client()

prompt = "Analyze the user review and determine the aspect, sentiment score, summary quote, and return risk"
input_text = "The boots look amazing and the leather is high quality, but they run way too small. I'm sending them back."

class ReviewAnalysis(BaseModel):
    aspect: str = Field(description="The feature mentioned (e.g., Price, Comfort, Style, Shipping)")
    summary_quote: str = Field(description="The specific phrase from the review about this aspect")
    sentiment_score: int = Field(description="1 to 5 (1=worst, 5=best)")
    is_return_risk: bool = Field(description="True if the user mentions returning the item")

response = client.models.generate_content(
    model="gemini-3.1-flash-lite-preview",
    contents=[prompt, input_text],
    config={
        "response_mime_type": "application/json",
        "response_json_schema": ReviewAnalysis.model_json_schema(),
    },
)
print(response.text)
```

Document processing and summarization: parse PDFs and return concise summaries, for example to build document-processing pipelines or quickly triage incoming files:
```python
import httpx

from google import genai
from google.genai import types

client = genai.Client()

# Download a sample PDF document
doc_url = "https://storage.googleapis.com/generativeai-downloads/data/med_gemini.pdf"
doc_data = httpx.get(doc_url).content

prompt = "Summarize this document"

response = client.models.generate_content(
    model="gemini-3.1-flash-lite-preview",
    contents=[
        types.Part.from_bytes(
            data=doc_data,
            mime_type="application/pdf",
        ),
        prompt,
    ],
)
print(response.text)
```

Model routing: use a low-latency, low-cost model as a classifier that routes queries to the appropriate model based on task complexity. This is a real pattern in production: the open-source Gemini CLI uses Flash-Lite to classify task complexity and route work to Flash or Pro accordingly.
```python
from google import genai

client = genai.Client()

FLASH_MODEL = 'flash'
PRO_MODEL = 'pro'

CLASSIFIER_SYSTEM_PROMPT = f"""
You are a specialized Task Routing AI. Your sole function is to analyze the user's request and classify its complexity.
Choose between `{FLASH_MODEL}` (SIMPLE) or `{PRO_MODEL}` (COMPLEX).

1. `{FLASH_MODEL}`: A fast, efficient model for simple, well-defined tasks.
2. `{PRO_MODEL}`: A powerful, advanced model for complex, open-ended, or multi-step tasks.

A task is COMPLEX if it meets ONE OR MORE of the following criteria:
1. High Operational Complexity (Est. 4+ Steps/Tool Calls)
2. Strategic Planning and Conceptual Design
3. High Ambiguity or Large Scope
4. Deep Debugging and Root Cause Analysis

A task is SIMPLE if it is highly specific, bounded, and has Low Operational Complexity (Est. 1-3 tool calls).
"""

user_input = "I'm getting an error 'Cannot read property 'map' of undefined' when I click the save button. Can you fix it?"

response_schema = {
    "type": "object",
    "properties": {
        "reasoning": {
            "type": "string",
            "description": "A brief, step-by-step explanation for the model choice, referencing the rubric."
        },
        "model_choice": {
            "type": "string",
            "enum": [FLASH_MODEL, PRO_MODEL]
        }
    },
    "required": ["reasoning", "model_choice"]
}

response = client.models.generate_content(
    model="gemini-3.1-flash-lite-preview",
    contents=user_input,
    config={
        "system_instruction": CLASSIFIER_SYSTEM_PROMPT,
        "response_mime_type": "application/json",
        "response_json_schema": response_schema,
    },
)
print(response.text)
```

Thinking: for tasks that require step-by-step reasoning, configure thinking to improve accuracy, letting the model spend extra compute on internal reasoning before producing its final output:
```python
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3.1-flash-lite-preview",
    contents="How does AI work?",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_level="high")
    ),
)
print(response.text)
```
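Returning to the model-routing pattern: once the classifier responds, the routing step itself is just a dispatch on the parsed JSON. A minimal sketch of that hand-off, using a hard-coded payload in place of a live `response.text` (the `route` helper and the sample payload are illustrative, not part of the SDK):

```python
import json

FLASH_MODEL = 'flash'
PRO_MODEL = 'pro'

def route(classifier_output: str) -> str:
    """Pick the downstream model from the classifier's structured JSON output."""
    decision = json.loads(classifier_output)
    choice = decision["model_choice"]
    if choice not in (FLASH_MODEL, PRO_MODEL):
        raise ValueError(f"Unexpected model_choice: {choice!r}")
    return choice

# Stand-in for response.text from the classifier call above.
sample = json.dumps({
    "reasoning": "Single well-defined bug fix; low operational complexity.",
    "model_choice": FLASH_MODEL,
})

print(route(sample))  # flash
```

In a real pipeline, the returned label would then be mapped to an actual model code for the follow-up `generate_content` call.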