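Gemini 3.1 Flash-Lite is a low-latency, cost-efficient multimodal model optimized for high-frequency, lightweight tasks. It accepts text, image, video, audio, and PDF input, and is designed for high-volume agentic workflows, simple data extraction, and applications where latency and API cost are the primary constraints.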
gemini-3.1-flash-lite
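| Property | Description |
|---|---|
| Model code | gemini-3.1-flash-lite |
| Supported data types | Inputs: text, image, video, audio, and PDF. Output: text |
| Token limits[*] | Input token limit: 1,048,576. Output token limit: 65,536 |
| Capabilities | Not supported, Supported, Supported, Not supported, Supported, Supported, Supported, Not supported, Not supported, Supported, Supported, Supported, Supported |
| Usage options | Supported, Supported, Supported |
| Versions | |
| Latest update | May 2026 |
| Knowledge cutoff | January 2025 |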
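Developer guide
Gemini 3.1 Flash-Lite is best at handling simple tasks at scale. The following use cases are a good fit for it:
Translation: Fast, inexpensive translation at scale, such as bulk processing of chat messages, reviews, and support requests. You can use a system instruction to restrict the output to the translated text only, with no additional commentary: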
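```python
from google import genai

client = genai.Client()

text = "Hey, are you down to grab some pizza later? I'm starving!"

response = client.models.generate_content(
    model="gemini-3.1-flash-lite",
    config={
        # Constrain the output to the translation only
        "system_instruction": "Only output the translated text"
    },
    contents=f"Translate the following text to German: {text}"
)

print(response.text)
```
Transcription: Process recordings, voice memos, or any audio content that needs a text transcript, without standing up a separate speech-to-text pipeline. The model accepts multimodal input, so you can pass an audio file directly for transcription: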
```python
from google import genai

client = genai.Client()

# URL = "https://storage.googleapis.com/generativeai-downloads/data/State_of_the_Union_Address_30_January_1961.mp3"

# Upload the audio file to the GenAI File API
uploaded_file = client.files.upload(file='sample.mp3')

prompt = 'Generate a transcript of the audio.'

response = client.models.generate_content(
    model="gemini-3.1-flash-lite",
    contents=[prompt, uploaded_file]
)

print(response.text)
```
Lightweight agentic tasks and data extraction: Entity extraction, classification, and lightweight data-processing pipelines, with structured JSON output. For example, extracting structured data from an e-commerce customer review:
```python
from google import genai
from pydantic import BaseModel, Field

client = genai.Client()

prompt = "Analyze the user review and determine the aspect, sentiment score, summary quote, and return risk"
input_text = "The boots look amazing and the leather is high quality, but they run way too small. I'm sending them back."

# Schema describing the fields the model should return
class ReviewAnalysis(BaseModel):
    aspect: str = Field(description="The feature mentioned (e.g., Price, Comfort, Style, Shipping)")
    summary_quote: str = Field(description="The specific phrase from the review about this aspect")
    sentiment_score: int = Field(description="1 to 5 (1=worst, 5=best)")
    is_return_risk: bool = Field(description="True if the user mentions returning the item")

response = client.models.generate_content(
    model="gemini-3.1-flash-lite",
    contents=[prompt, input_text],
    config={
        "response_mime_type": "application/json",
        "response_json_schema": ReviewAnalysis.model_json_schema(),
    },
)

print(response.text)
```
Document processing and summarization: Parse PDFs and return concise summaries, for example to build a document-processing pipeline or to quickly triage incoming files:
```python
from google import genai
from google.genai import types
import httpx

client = genai.Client()

# Download a sample PDF document
doc_url = "https://storage.googleapis.com/generativeai-downloads/data/med_gemini.pdf"
doc_data = httpx.get(doc_url).content

prompt = "Summarize this document"

response = client.models.generate_content(
    model="gemini-3.1-flash-lite",
    contents=[
        # Pass the PDF bytes inline alongside the prompt
        types.Part.from_bytes(
            data=doc_data,
            mime_type='application/pdf',
        ),
        prompt
    ]
)

print(response.text)
```
Model routing: Use a low-latency, low-cost model as a classifier that routes each query to an appropriate model based on task complexity. This pattern is used in production: the open-source Gemini CLI uses Flash-Lite to classify task complexity and routes tasks to Flash or Pro accordingly.
```python
from google import genai

client = genai.Client()

# Labels the classifier chooses between
FLASH_MODEL = 'flash'
PRO_MODEL = 'pro'

CLASSIFIER_SYSTEM_PROMPT = f"""
You are a specialized Task Routing AI. Your sole function is to analyze the user's request and classify its complexity. Choose between `{FLASH_MODEL}` (SIMPLE) or `{PRO_MODEL}` (COMPLEX).
1. `{FLASH_MODEL}`: A fast, efficient model for simple, well-defined tasks.
2. `{PRO_MODEL}`: A powerful, advanced model for complex, open-ended, or multi-step tasks.

A task is COMPLEX if it meets ONE OR MORE of the following criteria:
1. High Operational Complexity (Est. 4+ Steps/Tool Calls)
2. Strategic Planning and Conceptual Design
3. High Ambiguity or Large Scope
4. Deep Debugging and Root Cause Analysis

A task is SIMPLE if it is highly specific, bounded, and has Low Operational Complexity (Est. 1-3 tool calls).
"""

user_input = "I'm getting an error 'Cannot read property 'map' of undefined' when I click the save button. Can you fix it?"

# JSON schema that restricts model_choice to the two labels above
response_schema = {
    "type": "object",
    "properties": {
        "reasoning": {
            "type": "string",
            "description": "A brief, step-by-step explanation for the model choice, referencing the rubric."
        },
        "model_choice": {
            "type": "string",
            "enum": [FLASH_MODEL, PRO_MODEL]
        }
    },
    "required": ["reasoning", "model_choice"]
}

response = client.models.generate_content(
    model="gemini-3.1-flash-lite",
    contents=user_input,
    config={
        "system_instruction": CLASSIFIER_SYSTEM_PROMPT,
        "response_mime_type": "application/json",
        "response_json_schema": response_schema
    },
)

print(response.text)
```
Thinking: For tasks that require step-by-step reasoning, configure thinking to improve accuracy. The model then spends additional compute on internal reasoning before it generates the final output:
```python
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3.1-flash-lite",
    contents="How does AI work?",
    config=types.GenerateContentConfig(
        # Request a higher level of internal reasoning before answering
        thinking_config=types.ThinkingConfig(thinking_level="high")
    ),
)

print(response.text)
```
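The model-routing use case above stops at printing the classifier's JSON. The following is a minimal sketch of how that output could be turned into a routing decision. The helper `pick_model`, the `MODEL_IDS` mapping, the placeholder model IDs, and the choice to fall back to the stronger model on malformed output are all assumptions made for illustration; none of them come from the SDK or from this page.

```python
import json
from typing import Optional

# Hypothetical mapping from classifier labels to concrete model IDs.
# The values are placeholders; substitute the model IDs you actually use.
MODEL_IDS = {
    "flash": "your-flash-model-id",
    "pro": "your-pro-model-id",
}

# Assumption: when the classifier output is unusable, prefer the stronger model.
DEFAULT_LABEL = "pro"


def pick_model(classifier_json: Optional[str]) -> str:
    """Return the model ID for a classifier response.

    Falls back to the default label when the response is missing,
    is not valid JSON, or carries an unexpected `model_choice`.
    """
    try:
        parsed = json.loads(classifier_json)
    except (json.JSONDecodeError, TypeError):
        return MODEL_IDS[DEFAULT_LABEL]

    if not isinstance(parsed, dict):
        return MODEL_IDS[DEFAULT_LABEL]

    label = parsed.get("model_choice")
    if not isinstance(label, str):
        return MODEL_IDS[DEFAULT_LABEL]

    return MODEL_IDS.get(label, MODEL_IDS[DEFAULT_LABEL])
```

With the routing example's `response` in hand, something like `model_id = pick_model(response.text)` would select the model for the follow-up `generate_content` call. Falling back to the stronger model trades cost for safety; a cost-sensitive deployment might reasonably choose the opposite default.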