Gemini 3.1 Flash-Lite Preview

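Our most cost-efficient multimodal model, offering the fastest performance for high-frequency, lightweight tasks. Gemini 3.1 Flash-Lite is best suited to high-volume agentic tasks, simple data extraction, and extremely low-latency applications where budget and speed are the primary constraints.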

gemini-3.1-flash-lite-preview

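Property: Description

Model code: gemini-3.1-flash-lite-preview

Supported data types
  Inputs: Text, image, video, audio, and PDF
  Output: Text

Token limits[*]
  Input token limit: 1,048,576
  Output token limit: 65,536

Capabilities
  Audio generation: Not supported
  Batch API: Supported
  Caching: Supported
  Code execution: Supported
  Computer use: Not supported
  File search: Supported
  Function calling: Supported
  Grounding with Google Maps: Not supported
  Image generation: Not supported
  Live API: Not supported
  Search grounding: Supported
  Structured outputs: Supported
  Thinking: Supported
  URL context: Supported

Versions
  See the model version patterns for more details.
  • Preview: gemini-3.1-flash-lite-preview

Latest update: March 2026
Knowledge cutoff: January 2025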
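
As a rough illustration of how the token limits listed above might be enforced on the client side, the sketch below checks a request against them before it is sent. The function name `check_token_budget` is hypothetical, and the input token count is assumed to come from elsewhere (for example, the SDK's `client.models.count_tokens` call). Nothing in this block contacts the API.

```python
# Limits copied from the model card above.
INPUT_TOKEN_LIMIT = 1_048_576
OUTPUT_TOKEN_LIMIT = 65_536


def check_token_budget(input_tokens: int, max_output_tokens: int) -> int:
    """Hypothetical helper: reject oversized inputs and clamp the output budget.

    Returns a max_output_tokens value that does not exceed the model's
    output token limit. Raises ValueError if the input is too large.
    """
    if input_tokens > INPUT_TOKEN_LIMIT:
        raise ValueError(
            f"Input has {input_tokens} tokens; the limit is {INPUT_TOKEN_LIMIT}."
        )
    return min(max_output_tokens, OUTPUT_TOKEN_LIMIT)


# A 1,000-token prompt with an overly generous output request gets clamped.
clamped = check_token_budget(input_tokens=1_000, max_output_tokens=100_000)
print(clamped)
```

Failing fast on an oversized input avoids paying for a round trip that would be rejected anyway. Whether to clamp the output budget or raise an error instead is a design choice for the caller.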

Developer guide

Gemini 3.1 Flash-Lite is best at handling simple tasks at scale. Here are some of the use cases it suits best:

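  • Translation: Fast, inexpensive translation at scale, such as processing chat messages, reviews, and support tickets in bulk. You can use a system instruction to constrain the output to the translated text only, with no extra commentary: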

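    from google import genai
    
    # Assumes the google-genai SDK is installed and an API key is set in the
    # GEMINI_API_KEY environment variable; later snippets reuse this client.
    client = genai.Client()
    
    text = "Hey, are you down to grab some pizza later? I'm starving!"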
    
    response = client.models.generate_content(
        model="gemini-3.1-flash-lite-preview",
        config={
            "system_instruction": "Only output the translated text"
        },
        contents=f"Translate the following text to German: {text}"
    )
    
    print(response.text)
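
To apply the same pattern to many messages, one option is to build the request arguments separately from the call itself. The following is a minimal sketch under that assumption: `build_translation_request` is a hypothetical helper (not part of the SDK), and the sample messages are made up for illustration. It only constructs the keyword arguments, so it runs without network access.

```python
def build_translation_request(
    text: str,
    target_language: str,
    model: str = "gemini-3.1-flash-lite-preview",
) -> dict:
    """Hypothetical helper: build keyword arguments for generate_content."""
    return {
        "model": model,
        # Same system instruction as in the example above.
        "config": {"system_instruction": "Only output the translated text"},
        "contents": f"Translate the following text to {target_language}: {text}",
    }


# Made-up sample messages standing in for chat messages or reviews.
messages = [
    "Where is my order?",
    "The product arrived broken.",
]

requests = [build_translation_request(m, "German") for m in messages]

for kwargs in requests:
    print(kwargs["contents"])
    # With a configured client, each request could then be sent as:
    # response = client.models.generate_content(**kwargs)
```

Keeping request construction separate from the API call makes the prompts easy to inspect and test, and the same list of arguments could later be fed to a loop, a thread pool, or a batch job.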
    
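  • Transcription: Process recordings, voice notes, or any audio content that needs a text transcript, without spinning up a separate speech-to-text pipeline. Multimodal input is supported, so you can pass an audio file directly for transcription: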

    # URL = "https://storage.googleapis.com/generativeai-downloads/data/State_of_the_Union_Address_30_January_1961.mp3"
    
    # Upload the audio file to the GenAI File API
    uploaded_file = client.files.upload(file='sample.mp3')
    
    prompt = 'Generate a transcript of the audio.'
    
    response = client.models.generate_content(
        model="gemini-3.1-flash-lite-preview",
        contents=[prompt, uploaded_file]
    )
    
    print(response.text)
    
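  • Lightweight agentic tasks and data extraction: Supports entity extraction, classification, and lightweight data-processing pipelines with structured JSON output. For example, extracting structured data from e-commerce customer reviews: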

    from pydantic import BaseModel, Field
    
    prompt = "Analyze the user review and determine the aspect, sentiment score, summary quote, and return risk"
    input_text = "The boots look amazing and the leather is high quality, but they run way too small. I'm sending them back."
    
    class ReviewAnalysis(BaseModel):
        aspect: str = Field(description="The feature mentioned (e.g., Price, Comfort, Style, Shipping)")
        summary_quote: str = Field(description="The specific phrase from the review about this aspect")
        sentiment_score: int = Field(description="1 to 5 (1=worst, 5=best)")
        is_return_risk: bool = Field(description="True if the user mentions returning the item")
    
    response = client.models.generate_content(
        model="gemini-3.1-flash-lite-preview",
        contents=[prompt, input_text],
        config={
            "response_mime_type": "application/json",
            "response_json_schema": ReviewAnalysis.model_json_schema(),
        },
    )
    
    print(response.text)
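
Because the response arrives as a JSON string, downstream code usually needs to parse and sanity-check it before use. The sketch below shows one way to do that with only the standard library. `ParsedReview`, `parse_review`, and the sample JSON are hypothetical: the JSON is hand-written to match the schema above and is not real model output. If Pydantic is already in use, `ReviewAnalysis.model_validate_json(response.text)` would serve a similar purpose.

```python
import json
from dataclasses import dataclass


@dataclass
class ParsedReview:
    """Plain-Python mirror of the ReviewAnalysis schema (hypothetical)."""
    aspect: str
    summary_quote: str
    sentiment_score: int
    is_return_risk: bool


def parse_review(raw_json: str) -> ParsedReview:
    """Parse a JSON string and check the fields the schema describes."""
    data = json.loads(raw_json)
    review = ParsedReview(
        aspect=str(data["aspect"]),
        summary_quote=str(data["summary_quote"]),
        sentiment_score=int(data["sentiment_score"]),
        is_return_risk=bool(data["is_return_risk"]),
    )
    # The schema describes the score as 1 to 5, so enforce that here.
    if not 1 <= review.sentiment_score <= 5:
        raise ValueError(f"sentiment_score out of range: {review.sentiment_score}")
    return review


# Hand-written stand-in for response.text, for illustration only.
sample_response_text = (
    '{"aspect": "Sizing", "summary_quote": "they run way too small", '
    '"sentiment_score": 2, "is_return_risk": true}'
)

review = parse_review(sample_response_text)
print(review)
```

Validating the score range after parsing catches values that satisfy the JSON type but not the range given in the field description.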
    
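  • Document processing and summarization: Parse PDFs and return concise summaries, for example to build a document-processing pipeline or to quickly triage incoming files: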

    import httpx
    from google.genai import types
    
    # Download a sample PDF document
    doc_url = "https://storage.googleapis.com/generativeai-downloads/data/med_gemini.pdf"
    doc_data = httpx.get(doc_url).content
    
    prompt = "Summarize this document"
    response = client.models.generate_content(
        model="gemini-3.1-flash-lite-preview",
        contents=[
            types.Part.from_bytes(
                data=doc_data,
                mime_type='application/pdf',
            ),
            prompt
        ]
    )
    
    print(response.text)
    
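  • Model routing: Use a low-latency, low-cost model as a classifier that routes each query to the appropriate model based on task complexity. This is a real pattern in production: the open-source Gemini CLI uses Flash-Lite to classify task complexity and routes the task to Flash or Pro accordingly.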

    FLASH_MODEL = 'flash'
    PRO_MODEL = 'pro'
    
    CLASSIFIER_SYSTEM_PROMPT = f"""
    You are a specialized Task Routing AI. Your sole function is to analyze the user's request and classify its complexity. Choose between `{FLASH_MODEL}` (SIMPLE) or `{PRO_MODEL}` (COMPLEX).
    1.  `{FLASH_MODEL}`: A fast, efficient model for simple, well-defined tasks.
    2.  `{PRO_MODEL}`: A powerful, advanced model for complex, open-ended, or multi-step tasks.
    
    A task is COMPLEX if it meets ONE OR MORE of the following criteria:
    1.  High Operational Complexity (Est. 4+ Steps/Tool Calls)
    2.  Strategic Planning and Conceptual Design
    3.  High Ambiguity or Large Scope
    4.  Deep Debugging and Root Cause Analysis
    
    A task is SIMPLE if it is highly specific, bounded, and has Low Operational Complexity (Est. 1-3 tool calls).
    """
    
    user_input = "I'm getting an error 'Cannot read property 'map' of undefined' when I click the save button. Can you fix it?"
    
    response_schema = {
      "type": "object",
      "properties": {
        "reasoning": {
          "type": "string",
          "description": "A brief, step-by-step explanation for the model choice, referencing the rubric."
        },
        "model_choice": {
          "type": "string",
          "enum": [FLASH_MODEL, PRO_MODEL]
        }
      },
      "required": ["reasoning", "model_choice"]
    }
    
    response = client.models.generate_content(
        model="gemini-3.1-flash-lite-preview",
        contents=user_input,
        config={
            "system_instruction": CLASSIFIER_SYSTEM_PROMPT,
            "response_mime_type": "application/json",
            "response_json_schema": response_schema
        },
    )
    
    print(response.text)
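
The classifier's JSON output still has to be turned into an actual routing decision. The following is a minimal sketch of that last step, assuming the response follows the schema above. `pick_model` and the entries in `MODEL_IDS` are hypothetical placeholders to be replaced with the model codes you actually route to, and the sample strings are hand-written, not real model output.

```python
import json

FLASH_MODEL = "flash"
PRO_MODEL = "pro"

# Hypothetical placeholders: substitute the real model codes you route to.
MODEL_IDS = {
    FLASH_MODEL: "<your-flash-model-id>",
    PRO_MODEL: "<your-pro-model-id>",
}


def pick_model(classifier_json: str, default: str = PRO_MODEL) -> str:
    """Map the classifier's JSON output to a model ID.

    Falls back to `default` when the output is malformed or the choice
    is not one of the known labels.
    """
    try:
        choice = json.loads(classifier_json).get("model_choice")
    except (json.JSONDecodeError, AttributeError, TypeError):
        choice = None
    if choice not in MODEL_IDS:
        choice = default
    return MODEL_IDS[choice]


# Hand-written stand-ins for response.text, for illustration only.
print(pick_model('{"reasoning": "Single bounded fix.", "model_choice": "flash"}'))
print(pick_model("not valid json"))
```

Falling back to the more capable model on malformed output is one possible design choice: it trades cost for safety. Defaulting to the cheaper model instead would be equally easy if cost matters more.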
    
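  • Thinking: For tasks that benefit from step-by-step reasoning, configure thinking to improve accuracy, so that the model spends additional compute on internal reasoning before generating its final output: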

    from google.genai import types
    
    response = client.models.generate_content(
        model="gemini-3.1-flash-lite-preview",
        contents="How does AI work?",
        config=types.GenerateContentConfig(
            thinking_config=types.ThinkingConfig(thinking_level="high")
        ),
    )
    
    print(response.text)