Generate structured output with the Gemini API

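Gemini generates unstructured text by default, but some applications require structured text. For these use cases, you can constrain Gemini to respond with JSON, a structured data format suitable for automated processing. You can also constrain the model to respond with one of the options specified in an enum.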

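Here are a few use cases that might require structured output from the model:

Build a database of companies by pulling company information out of newspaper articles.
Pull standardized information out of resumes.
Extract ingredients from recipes and display a link to a grocery website for each ingredient.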

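In your prompt, you can ask Gemini to produce JSON-formatted output, but the model is not guaranteed to produce JSON and nothing but JSON. For a more deterministic response, you can pass a specific JSON schema in the responseSchema field so that Gemini always responds with the expected structure. To learn more about working with schemas, see More about JSON schemas.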

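This guide shows you how to generate JSON with the generateContent method, either through the SDK of your choice or by calling the REST API directly. The examples use text-only input, but Gemini can also produce JSON responses to multimodal requests that include images, video, and audio.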
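For readers calling the REST API directly, the request can be sketched as a plain JSON body. The sketch below only builds the URL and body and sends nothing. The helper name buildGenerateContentRequest is hypothetical, and the v1beta endpoint layout is an assumption that should be checked against the current REST reference. The API key is passed separately and is omitted here.

```javascript
// Hypothetical helper: builds (but does not send) a generateContent REST request.
// Assumption: the v1beta endpoint layout below matches the current REST reference.
function buildGenerateContentRequest(modelName, promptText, responseSchema) {
  const url =
    "https://generativelanguage.googleapis.com/v1beta/models/" +
    encodeURIComponent(modelName) +
    ":generateContent";

  const body = {
    contents: [{ parts: [{ text: promptText }] }],
    generationConfig: { responseMimeType: "application/json" },
  };
  // The schema is optional; without it the model is only asked for JSON.
  if (responseSchema !== undefined) {
    body.generationConfig.responseSchema = responseSchema;
  }
  return { url, body };
}

// Type names are lowercase here, as in the schema examples in this guide.
const request = buildGenerateContentRequest(
  "gemini-1.5-flash",
  "List a few popular cookie recipes.",
  {
    type: "array",
    items: {
      type: "object",
      properties: { recipeName: { type: "string" } },
      required: ["recipeName"],
    },
  },
);
console.log(JSON.stringify(request.body, null, 2));
```

The body could then be sent with any HTTP client as a POST request with a JSON content type.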

Before you begin: Set up your project and API key

Before calling the Gemini API, you need to set up your project and configure your API key.

Get and secure your API key

You need an API key to call the Gemini API. If you don't already have one, create a key in Google AI Studio.

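It's strongly recommended that you do not check an API key into your version control system.

Instead, keep your API key in a secrets store such as Google Cloud Secret Manager.

All the snippets in this tutorial assume that your API key is available in the API_KEY environment variable, which the code reads as process.env.API_KEY.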
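The snippets in this guide read the key from process.env.API_KEY. As a minimal sketch, a small guard can fail early with a clear message when the variable is missing. The helper name requireApiKey is hypothetical and is not part of any SDK.

```javascript
// Hypothetical helper: reads the key from an environment-like object and
// throws a descriptive error when it is missing or blank.
function requireApiKey(env = process.env, name = "API_KEY") {
  const value = env[name];
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(`Missing ${name}: set it in the environment before running.`);
  }
  // Surrounding whitespace is a common copy-and-paste artifact, so trim it.
  return value.trim();
}
```

The returned value could be passed to the GoogleGenerativeAI constructor in place of process.env.API_KEY.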

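Generate JSON

When the model is configured to output JSON, it responds to any prompt with JSON-formatted output.

You can control the structure of the JSON response by supplying a schema. There are two ways to supply a schema to the model:

As text in the prompt
As a structured schema supplied through model configuration

Supply a schema as text in the prompt

The following example prompts the model to return cookie recipes in a specific JSON format.

Because the model gets the format specification from the text of the prompt, you have some flexibility in how you represent the specification. Any reasonable format for representing a JSON schema may work.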

// Requires the @google/generative-ai package; run as an ES module.
import { GoogleGenerativeAI } from "@google/generative-ai";

// The API key is read from the API_KEY environment variable.
const genAI = new GoogleGenerativeAI(process.env.API_KEY);

const model = genAI.getGenerativeModel({
  model: "gemini-1.5-flash",
});

// The schema is described in the prompt text itself.
const prompt = `List a few popular cookie recipes using this JSON schema:

Recipe = {'recipeName': string}
Return: Array<Recipe>`;

const result = await model.generateContent(prompt);
console.log(result.response.text());

The output might look like this:

[{"recipeName": "Chocolate Chip Cookies"}, {"recipeName": "Oatmeal Raisin Cookies"}, {"recipeName": "Snickerdoodles"}, {"recipeName": "Sugar Cookies"}, {"recipeName": "Peanut Butter Cookies"}]
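Because the schema in this approach lives only in the prompt, the reply is not guaranteed to be bare JSON. A minimal sketch of tolerant parsing follows, assuming the most likely deviation is a Markdown code fence wrapped around the JSON (an assumption about typical model behavior, not something this guide states). The helper name parseModelJson is hypothetical.

```javascript
// Hypothetical helper: tolerant parsing for prompt-only JSON requests.
// Returns { ok: true, value } on success or { ok: false, error } on failure.
function parseModelJson(text) {
  const fence = "`".repeat(3); // a Markdown code fence marker
  let candidate = text.trim();

  if (
    candidate.length >= 2 * fence.length &&
    candidate.startsWith(fence) &&
    candidate.endsWith(fence)
  ) {
    // Drop the opening fence line (which may carry a language tag such as
    // "json") and the closing fence.
    const firstNewline = candidate.indexOf("\n");
    candidate =
      firstNewline === -1
        ? ""
        : candidate.slice(firstNewline + 1, candidate.length - fence.length).trim();
  }

  try {
    return { ok: true, value: JSON.parse(candidate) };
  } catch (error) {
    return { ok: false, error: error.message };
  }
}
```

A caller could pass result.response.text() to this helper and retry or report an error when ok is false.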

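Supply a schema through model configuration

The following example does the following:

Instantiates a model configured through a schema to respond with JSON.
Prompts the model to return cookie recipes.

Compared with relying only on text in the prompt, this more formal way of declaring the JSON schema gives you more precise control over the output.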

// Requires the @google/generative-ai package; run as an ES module.
import { GoogleGenerativeAI, SchemaType } from "@google/generative-ai";

// The API key is read from the API_KEY environment variable.
const genAI = new GoogleGenerativeAI(process.env.API_KEY);

// An array of objects, each with a required string recipeName.
const schema = {
  description: "List of recipes",
  type: SchemaType.ARRAY,
  items: {
    type: SchemaType.OBJECT,
    properties: {
      recipeName: {
        type: SchemaType.STRING,
        description: "Name of the recipe",
        nullable: false,
      },
    },
    required: ["recipeName"],
  },
};

// The schema is attached to the model configuration, not to the prompt.
const model = genAI.getGenerativeModel({
  model: "gemini-1.5-pro",
  generationConfig: {
    responseMimeType: "application/json",
    responseSchema: schema,
  },
});

const result = await model.generateContent(
  "List a few popular cookie recipes.",
);
console.log(result.response.text());

The output might look like this:

[{"recipeName": "Chocolate Chip Cookies"}, {"recipeName": "Oatmeal Raisin Cookies"}, {"recipeName": "Snickerdoodles"}, {"recipeName": "Sugar Cookies"}, {"recipeName": "Peanut Butter Cookies"}]
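The response text is still a string, so the application has to parse it before use. The sketch below parses a sample like the one above and checks it against the shape declared in the schema (an array of objects, each with a string recipeName). The helper name findRecipeErrors is hypothetical. The check is a defensive assumption, for instance against a response that was cut short, rather than a step this guide requires.

```javascript
// Hypothetical helper: returns a list of problems found in the parsed output,
// or an empty list when it matches the expected shape.
function findRecipeErrors(parsed) {
  if (!Array.isArray(parsed)) {
    return ["response is not an array"];
  }
  const errors = [];
  parsed.forEach((item, index) => {
    if (item === null || typeof item !== "object" || Array.isArray(item)) {
      errors.push(`item ${index} is not an object`);
    } else if (typeof item.recipeName !== "string") {
      errors.push(`item ${index} has no string recipeName`);
    }
  });
  return errors;
}

// Stand-in for result.response.text().
const sample =
  '[{"recipeName": "Chocolate Chip Cookies"}, {"recipeName": "Snickerdoodles"}]';
const recipes = JSON.parse(sample);
const problems = findRecipeErrors(recipes);
if (problems.length === 0) {
  for (const recipe of recipes) {
    console.log(recipe.recipeName);
  }
}
```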

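More about JSON schemas

When you configure the model to return a JSON response, you can use a Schema object to define the shape of the JSON data. The Schema represents a select subset of the OpenAPI 3.0 Schema object.

Here's a pseudo-JSON representation of all the Schema fields: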

{
  "type": enum (Type),
  "format": string,
  "description": string,
  "nullable": boolean,
  "enum": [
    string
  ],
  "maxItems": string,
  "minItems": string,
  "properties": {
    string: {
      object (Schema)
    },
    ...
  },
  "required": [
    string
  ],
  "propertyOrdering": [
    string
  ],
  "items": {
    object (Schema)
  }
}

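The Type of the schema must be one of the OpenAPI data types. Only a subset of fields is valid for each Type. The following list maps each Type to the fields that are valid for that type:

string -> enum, format
integer -> format
number -> format
boolean
array -> minItems, maxItems, items
object -> properties, required, propertyOrdering, nullable

Here are some example schemas showing valid type-and-field combinations: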

{ "type": "string", "enum": ["a", "b", "c"] }

{ "type": "string", "format": "date-time" }

{ "type": "integer", "format": "int64" }

{ "type": "number", "format": "double" }

{ "type": "boolean" }

{ "type": "array", "minItems": 3, "maxItems": 3, "items": { "type": ... } }

{ "type": "object",
  "properties": {
    "a": { "type": ... },
    "b": { "type": ... },
    "c": { "type": ... }
  },
  "nullable": true,
  "required": ["c"],
  "propertyOrdering": ["c", "b", "a"]
}
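The type-to-field mapping above can be turned into a small local check that runs before a schema is sent. This is a sketch only: the helper name findMisplacedFields is hypothetical, and treating description and nullable as valid on every type is an assumption, made because the recipe example in this guide sets both on a string property.

```javascript
// Fields that apply to each type, transcribed from the list above.
const FIELDS_BY_TYPE = {
  string: ["enum", "format"],
  integer: ["format"],
  number: ["format"],
  boolean: [],
  array: ["minItems", "maxItems", "items"],
  object: ["properties", "required", "propertyOrdering", "nullable"],
};
// Assumption: these fields are accepted on any type.
const ALWAYS_ALLOWED = ["type", "description", "nullable"];

// Hypothetical helper: lists fields that do not apply to a schema's type,
// recursing into array items and object properties.
function findMisplacedFields(schema, path = "$") {
  const typeName = String(schema.type).toLowerCase();
  const allowed = FIELDS_BY_TYPE[typeName];
  if (allowed === undefined) {
    return [`${path}: unknown type "${schema.type}"`];
  }
  const found = [];
  for (const key of Object.keys(schema)) {
    if (!ALWAYS_ALLOWED.includes(key) && !allowed.includes(key)) {
      found.push(`${path}: "${key}" does not apply to type "${typeName}"`);
    }
  }
  if (typeName === "array" && schema.items) {
    found.push(...findMisplacedFields(schema.items, `${path}.items`));
  }
  if (typeName === "object" && schema.properties) {
    for (const [name, child] of Object.entries(schema.properties)) {
      found.push(...findMisplacedFields(child, `${path}.properties.${name}`));
    }
  }
  return found;
}
```

An empty result means every field matches the list; it does not guarantee that the API accepts the schema.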

To learn more about the schema fields used in the Gemini API, see the Schema reference.

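Property ordering

When you work with JSON schemas in the Gemini API, the order of properties matters. By default, the API orders properties alphabetically and does not preserve the order in which they are defined (although the Google Gen AI SDKs may preserve this order). If you provide examples to a model that has a schema configured, and the property order in the examples does not match the property order in the schema, the output could be rambling or unexpected.

To ensure a consistent, predictable ordering of properties, you can use the optional propertyOrdering[] field.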

"propertyOrdering": ["recipe_name", "ingredients"]

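propertyOrdering[], which is not a standard field in the OpenAPI specification, is an array of strings that determines the order of properties in the response. By specifying the order of properties and then providing examples with properties in that same order, you can potentially improve the quality of results.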
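One way to keep examples aligned with the schema is to reorder each example's keys from the same propertyOrdering array before placing it in the prompt. The sketch below assumes a recipe schema with recipe_name and ingredients properties. The helper name orderLikeSchema and the schema contents are hypothetical.

```javascript
// Hypothetical helper: returns a copy of an example object whose keys follow
// the given propertyOrdering. Keys the ordering does not mention are kept
// and placed after the ordered ones.
function orderLikeSchema(example, propertyOrdering) {
  const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
  const ordered = {};
  for (const key of propertyOrdering) {
    if (hasOwn(example, key)) {
      ordered[key] = example[key];
    }
  }
  for (const key of Object.keys(example)) {
    if (!hasOwn(ordered, key)) {
      ordered[key] = example[key];
    }
  }
  return ordered;
}

// Illustrative schema; only propertyOrdering is used by the helper.
const recipeSchema = {
  type: "object",
  properties: {
    ingredients: { type: "array", items: { type: "string" } },
    recipe_name: { type: "string" },
  },
  required: ["recipe_name"],
  propertyOrdering: ["recipe_name", "ingredients"],
};

const fewShotExample = orderLikeSchema(
  { ingredients: ["flour", "sugar"], recipe_name: "Sugar Cookies" },
  recipeSchema.propertyOrdering,
);
console.log(JSON.stringify(fewShotExample));
```

This relies on JavaScript keeping insertion order for non-numeric string keys, so it would not suit property names that look like integers.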