文字生成

Gemini API 可根據各種輸入內容 (包括文字、圖片、影片和音訊) 產生文字輸出內容。本指南說明如何使用文字和圖片輸入內容來產生文字。也涵蓋串流、聊天和系統指示。

文字輸入法

使用 Gemini API 產生文字最簡單的方法,就是為模型提供單一純文字輸入內容,如以下範例所示:

from google import genai

client = genai.Client(api_key="GEMINI_API_KEY")

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=["How does AI work?"]
)
print(response.text)
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: "GEMINI_API_KEY" });

async function main() {
  const response = await ai.models.generateContent({
    model: "gemini-2.0-flash",
    contents: "How does AI work?",
  });
  console.log(response.text);
}

await main();
// import packages here

func main() {
  ctx := context.Background()
  client, err := genai.NewClient(ctx, option.WithAPIKey(os.Getenv("GEMINI_API_KEY")))
  if err != nil {
    log.Fatal(err)
  }
  defer client.Close()

  model := client.GenerativeModel("gemini-2.0-flash")
  resp, err := model.GenerateContent(ctx, genai.Text("How does AI work?"))
  if err != nil {
    log.Fatal(err)
  }
  printResponse(resp) // helper function for printing content parts
}
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "contents": [
      {
        "parts": [
          {
            "text": "How does AI work?"
          }
        ]
      }
    ]
  }'

圖片輸入

Gemini API 支援結合文字和媒體檔案的多模態輸入內容。以下範例說明如何根據文字和圖片輸入內容產生文字:

from PIL import Image
from google import genai

client = genai.Client(api_key="GEMINI_API_KEY")

image = Image.open("/path/to/organ.png")
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=[image, "Tell me about this instrument"]
)
print(response.text)
import {
  GoogleGenAI,
  createUserContent,
  createPartFromUri,
} from "@google/genai";

const ai = new GoogleGenAI({ apiKey: "GEMINI_API_KEY" });

async function main() {
  const image = await ai.files.upload({
    file: "/path/to/organ.png",
  });
  const response = await ai.models.generateContent({
    model: "gemini-2.0-flash",
    contents: [
      createUserContent([
        "Tell me about this instrument",
        createPartFromUri(image.uri, image.mimeType),
      ]),
    ],
  });
  console.log(response.text);
}

await main();
model := client.GenerativeModel("gemini-2.0-flash")

imgData, err := os.ReadFile(filepath.Join(testDataDir, "organ.jpg"))
if err != nil {
  log.Fatal(err)
}

resp, err := model.GenerateContent(ctx,
  genai.Text("Tell me about this instrument"),
  genai.ImageData("jpeg", imgData))
if err != nil {
  log.Fatal(err)
}

printResponse(resp)
# Use a temporary file to hold the base64 encoded image data
TEMP_B64=$(mktemp)
trap 'rm -f "$TEMP_B64"' EXIT
base64 $B64FLAGS $IMG_PATH > "$TEMP_B64"

# Use a temporary file to hold the JSON payload
TEMP_JSON=$(mktemp)
trap 'rm -f "$TEMP_JSON"' EXIT

cat > "$TEMP_JSON" << EOF
{
  "contents": [
    {
      "parts": [
        {
          "text": "Tell me about this instrument"
        },
        {
          "inline_data": {
            "mime_type": "image/jpeg",
            "data": "$(cat "$TEMP_B64")"
          }
        }
      ]
    }
  ]
}
EOF

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -X POST \
  -d "@$TEMP_JSON"

串流輸出

根據預設,模型會在完成整個文字產生程序後傳回回應。您可以使用串流功能,在 GenerateContentResponse 產生時傳回其例項,以便加快互動速度。

from google import genai

client = genai.Client(api_key="GEMINI_API_KEY")

response = client.models.generate_content_stream(
    model="gemini-2.0-flash",
    contents=["Explain how AI works"]
)
for chunk in response:
    print(chunk.text, end="")
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: "GEMINI_API_KEY" });

async function main() {
  const response = await ai.models.generateContentStream({
    model: "gemini-2.0-flash",
    contents: "Explain how AI works",
  });

  for await (const chunk of response) {
    console.log(chunk.text);
  }
}

await main();
model := client.GenerativeModel("gemini-1.5-flash")
iter := model.GenerateContentStream(ctx, genai.Text("Write a story about a magic backpack."))
for {
  resp, err := iter.Next()
  if err == iterator.Done {
    break
  }
  if err != nil {
    log.Fatal(err)
  }
  printResponse(resp)
}
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:streamGenerateContent?alt=sse&key=${GEMINI_API_KEY}" \
  -H 'Content-Type: application/json' \
  --no-buffer \
  -d '{
    "contents": [
      {
        "parts": [
          {
            "text": "Explain how AI works"
          }
        ]
      }
    ]
  }'

多轉折對話

您可以使用 Gemini SDK 將多輪問題和回覆收集到聊天室中。使用者可透過即時通訊格式逐步取得答案,並獲得多重問題的協助。這個即時通訊 SDK 實作會提供介面,用於追蹤對話記錄,但在幕後,它會使用相同的 generateContent 方法建立回應。

以下程式碼範例顯示基本即時通訊實作方式:

from google import genai

client = genai.Client(api_key="GEMINI_API_KEY")
chat = client.chats.create(model="gemini-2.0-flash")

response = chat.send_message("I have 2 dogs in my house.")
print(response.text)

response = chat.send_message("How many paws are in my house?")
print(response.text)

for message in chat.get_history():
    print(f'role - {message.role}',end=": ")
    print(message.parts[0].text)
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: "GEMINI_API_KEY" });

async function main() {
  const chat = ai.chats.create({
    model: "gemini-2.0-flash",
    history: [
      {
        role: "user",
        parts: [{ text: "Hello" }],
      },
      {
        role: "model",
        parts: [{ text: "Great to meet you. What would you like to know?" }],
      },
    ],
  });

  const response1 = await chat.sendMessage({
    message: "I have 2 dogs in my house.",
  });
  console.log("Chat response 1:", response1.text);

  const response2 = await chat.sendMessage({
    message: "How many paws are in my house?",
  });
  console.log("Chat response 2:", response2.text);
}

await main();
model := client.GenerativeModel("gemini-1.5-flash")
cs := model.StartChat()

cs.History = []*genai.Content{
  {
    Parts: []genai.Part{
      genai.Text("Hello, I have 2 dogs in my house."),
    },
    Role: "user",
  },
  {
    Parts: []genai.Part{
      genai.Text("Great to meet you. What would you like to know?"),
    },
    Role: "model",
  },
}

res, err := cs.SendMessage(ctx, genai.Text("How many paws are in my house?"))
if err != nil {
  log.Fatal(err)
}
printResponse(res)
curl https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GEMINI_API_KEY \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [
          {
            "text": "Hello"
          }
        ]
      },
      {
        "role": "model",
        "parts": [
          {
            "text": "Great to meet you. What would you like to know?"
          }
        ]
      },
      {
        "role": "user",
        "parts": [
          {
            "text": "I have two dogs in my house. How many paws are in my house?"
          }
        ]
      }
    ]
  }'

你也可以使用串流直播搭配即時通訊,如以下範例所示:

from google import genai

client = genai.Client(api_key="GEMINI_API_KEY")
chat = client.chats.create(model="gemini-2.0-flash")

response = chat.send_message_stream("I have 2 dogs in my house.")
for chunk in response:
    print(chunk.text, end="")

response = chat.send_message_stream("How many paws are in my house?")
for chunk in response:
    print(chunk.text, end="")

for message in chat.get_history():
    print(f'role - {message.role}', end=": ")
    print(message.parts[0].text)
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: "GEMINI_API_KEY" });

async function main() {
  const chat = ai.chats.create({
    model: "gemini-2.0-flash",
    history: [
      {
        role: "user",
        parts: [{ text: "Hello" }],
      },
      {
        role: "model",
        parts: [{ text: "Great to meet you. What would you like to know?" }],
      },
    ],
  });

  const stream1 = await chat.sendMessageStream({
    message: "I have 2 dogs in my house.",
  });
  for await (const chunk of stream1) {
    console.log(chunk.text);
    console.log("_".repeat(80));
  }

  const stream2 = await chat.sendMessageStream({
    message: "How many paws are in my house?",
  });
  for await (const chunk of stream2) {
    console.log(chunk.text);
    console.log("_".repeat(80));
  }
}

await main();
model := client.GenerativeModel("gemini-1.5-flash")
cs := model.StartChat()

cs.History = []*genai.Content{
  {
    Parts: []genai.Part{
      genai.Text("Hello, I have 2 dogs in my house."),
    },
    Role: "user",
  },
  {
    Parts: []genai.Part{
      genai.Text("Great to meet you. What would you like to know?"),
    },
    Role: "model",
  },
}

iter := cs.SendMessageStream(ctx, genai.Text("How many paws are in my house?"))
for {
  resp, err := iter.Next()
  if err == iterator.Done {
    break
  }
  if err != nil {
    log.Fatal(err)
  }
  printResponse(resp)
}
curl https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:streamGenerateContent?alt=sse&key=$GEMINI_API_KEY \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [
          {
            "text": "Hello"
          }
        ]
      },
      {
        "role": "model",
        "parts": [
          {
            "text": "Great to meet you. What would you like to know?"
          }
        ]
      },
      {
        "role": "user",
        "parts": [
          {
            "text": "I have two dogs in my house. How many paws are in my house?"
          }
        ]
      }
    ]
  }'

設定參數

您傳送至模型的每個提示都含有參數,用來控制模型生成回覆的方式,您可以設定這些參數,也可以讓模型使用預設選項。

以下範例說明如何設定模型參數:

from google import genai
from google.genai import types

client = genai.Client(api_key="GEMINI_API_KEY")

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=["Explain how AI works"],
    config=types.GenerateContentConfig(
        max_output_tokens=500,
        temperature=0.1
    )
)
print(response.text)
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: "GEMINI_API_KEY" });

async function main() {
  const response = await ai.models.generateContent({
    model: "gemini-2.0-flash",
    contents: "Explain how AI works",
    config: {
      maxOutputTokens: 500,
      temperature: 0.1,
    },
  });
  console.log(response.text);
}

await main();
model := client.GenerativeModel("gemini-1.5-pro-latest")
model.SetTemperature(0.9)
model.SetTopP(0.5)
model.SetTopK(20)
model.SetMaxOutputTokens(100)
model.SystemInstruction = genai.NewUserContent(genai.Text("You are Yoda from Star Wars."))
model.ResponseMIMEType = "application/json"
resp, err := model.GenerateContent(ctx, genai.Text("What is the average size of a swallow?"))
if err != nil {
  log.Fatal(err)
}
printResponse(resp)
curl https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GEMINI_API_KEY \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "contents": [
      {
        "parts": [
          {
            "text": "Explain how AI works"
          }
        ]
      }
    ],
    "generationConfig": {
      "stopSequences": [
        "Title"
      ],
      "temperature": 1.0,
      "maxOutputTokens": 800,
      "topP": 0.8,
      "topK": 10
    }
  }'

以下列舉一些可設定的模型參數。(命名慣例會因程式設計語言而異)。

  • stopSequences:指定會停止產生輸出的字元序列集合 (最多 5 個)。如果指定,API 會在 stop_sequence 首次出現時停止。這個停止序列不會包含在回應中。
  • temperature:控制輸出的隨機性。如要取得更有創意的回覆,請使用較高的值;如要取得較確定的回覆,請使用較低的值。值的範圍為 [0.0, 2.0]。
  • maxOutputTokens:設定候選項中可納入的符記數量上限。
  • topP:變更模型選取輸出符記的方式。模型會按照可能性最高到最低的順序選取符記,直到所選符記的機率總和等於 topP 值。預設的 topP 值為 0.95。
  • topK:變更模型選取輸出符記的方式。如果 topK 設為 1,代表所選詞元是模型詞彙表的所有詞元中可能性最高者。如果 topK 設為 3,則代表模型會依據溫度參數,從可能性最高的 3 個詞元中選取下一個詞元。接著進一步根據 topP 篩選符記,最後依 temperature 選出最終符記。

系統指示

系統指令可讓您根據特定用途調整模型的行為。提供系統指示時,您會向模型提供額外脈絡資訊,協助模型瞭解任務並生成更符合需求的回覆。模型應在與使用者互動時全程遵守系統指示,讓您能單獨指定產品層級行為,不受使用者提供的提示影響。

您可以在初始化模型時設定系統指令:

from google import genai
from google.genai import types

client = genai.Client(api_key="GEMINI_API_KEY")

response = client.models.generate_content(
    model="gemini-2.0-flash",
    config=types.GenerateContentConfig(
        system_instruction="You are a cat. Your name is Neko."),
    contents="Hello there"
)

print(response.text)
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: "GEMINI_API_KEY" });

async function main() {
  const response = await ai.models.generateContent({
    model: "gemini-2.0-flash",
    contents: "Hello there",
    config: {
      systemInstruction: "You are a cat. Your name is Neko.",
    },
  });
  console.log(response.text);
}

await main();
// import packages here

func main() {
  ctx := context.Background()
  client, err := genai.NewClient(ctx, option.WithAPIKey(os.Getenv("GEMINI_API_KEY")))
  if err != nil {
    log.Fatal(err)
  }
  defer client.Close()

  model := client.GenerativeModel("gemini-2.0-flash")
  model.SystemInstruction = &genai.Content{
    Parts: []genai.Part{genai.Text(`
      You are a cat. Your name is Neko.
    `)},
  }
  resp, err := model.GenerateContent(ctx, genai.Text("Hello there"))
  if err != nil {
    log.Fatal(err)
  }
  printResponse(resp) // helper function for printing content parts
}
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "system_instruction": {
      "parts": [
        {
          "text": "You are a cat. Your name is Neko."
        }
      ]
    },
    "contents": [
      {
        "parts": [
          {
            "text": "Hello there"
          }
        ]
      }
    ]
  }'

接著,您可以照常向模型傳送要求。

支援的模型

整個 Gemini 系列模型都支援文字生成功能。如要進一步瞭解模型及其功能,請參閱「模型」。

提示訣竅

對於基本文字生成用途,提示可能不需要包含任何輸出範例、系統操作說明或格式資訊。這是零樣本方法。在某些用途中,單拍少拍提示可能會產生更符合使用者期待的輸出內容。在某些情況下,您可能還需要提供系統指示,協助模型瞭解任務或遵循特定指引。

後續步驟