Gemini 思考

Gemini 3 和 2.5 系列模型採用內部「思考過程」，大幅提升推論和多步驟規劃能力，因此非常適合處理複雜工作，例如程式設計、高等數學和資料分析。

本指南說明如何使用 Gemini API，運用 Gemini 的思考能力。

生成有思考過程的內容

使用思考模型發起要求，與任何其他內容生成要求類似。主要差異在於 model 欄位中指定了支援思考的其中一個模型，如下列文字生成範例所示：

Python

from google import genai

client = genai.Client()
prompt = "Explain the concept of Occam's Razor and provide a simple, everyday example."
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents=prompt
)

print(response.text)

JavaScript

import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({});

async function main() {
  const prompt = "Explain the concept of Occam's Razor and provide a simple, everyday example.";

  const response = await ai.models.generateContent({
    model: "gemini-3-flash-preview",
    contents: prompt,
  });

  console.log(response.text);
}

main();

Go

package main

import (
  "context"
  "fmt"
  "log"
  "os"
  "google.golang.org/genai"
)

func main() {
  ctx := context.Background()
  client, err := genai.NewClient(ctx, nil)
  if err != nil {
      log.Fatal(err)
  }

  prompt := "Explain the concept of Occam's Razor and provide a simple, everyday example."
  model := "gemini-3-flash-preview"

  resp, _ := client.Models.GenerateContent(ctx, model, genai.Text(prompt), nil)

  fmt.Println(resp.Text())
}

REST

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-flash-preview:generateContent" \
 -H "x-goog-api-key: $GEMINI_API_KEY" \
 -H 'Content-Type: application/json' \
 -X POST \
 -d '{
   "contents": [
     {
       "parts": [
         {
           "text": "Explain the concept of Occam'\''s Razor and provide a simple, everyday example."
         }
       ]
     }
   ]
 }'
 ```

想法摘要

想法摘要是模型原始想法的摘要版本，可深入瞭解模型的內部推論過程。請注意，思考程度和預算適用於模型的原始想法，而非想法摘要。

如要在要求設定中將 includeThoughts 設為 true，請啟用想法摘要。接著，您可以透過 response 參數的 parts 進行疊代，並檢查 thought 布林值，存取摘要。

以下範例說明如何啟用及擷取想法摘要 (不含串流)，並在回應中傳回單一最終想法摘要：

Python

from google import genai
from google.genai import types

client = genai.Client()
prompt = "What is the sum of the first 50 prime numbers?"
response = client.models.generate_content(
  model="gemini-3-flash-preview",
  contents=prompt,
  config=types.GenerateContentConfig(
    thinking_config=types.ThinkingConfig(
      include_thoughts=True
    )
  )
)

for part in response.candidates[0].content.parts:
  if not part.text:
    continue
  if part.thought:
    print("Thought summary:")
    print(part.text)
    print()
  else:
    print("Answer:")
    print(part.text)
    print()

JavaScript

import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({});

async function main() {
  const response = await ai.models.generateContent({
    model: "gemini-3-flash-preview",
    contents: "What is the sum of the first 50 prime numbers?",
    config: {
      thinkingConfig: {
        includeThoughts: true,
      },
    },
  });

  for (const part of response.candidates[0].content.parts) {
    if (!part.text) {
      continue;
    }
    else if (part.thought) {
      console.log("Thoughts summary:");
      console.log(part.text);
    }
    else {
      console.log("Answer:");
      console.log(part.text);
    }
  }
}

main();

Go

package main

import (
  "context"
  "fmt"
  "google.golang.org/genai"
  "os"
)

func main() {
  ctx := context.Background()
  client, err := genai.NewClient(ctx, nil)
  if err != nil {
      log.Fatal(err)
  }

  contents := genai.Text("What is the sum of the first 50 prime numbers?")
  model := "gemini-3-flash-preview"
  resp, _ := client.Models.GenerateContent(ctx, model, contents, &genai.GenerateContentConfig{
    ThinkingConfig: &genai.ThinkingConfig{
      IncludeThoughts: true,
    },
  })

  for _, part := range resp.Candidates[0].Content.Parts {
    if part.Text != "" {
      if part.Thought {
        fmt.Println("Thoughts Summary:")
        fmt.Println(part.Text)
      } else {
        fmt.Println("Answer:")
        fmt.Println(part.Text)
      }
    }
  }
}

以下是使用串流思考的範例，會在生成期間傳回滾動式增量摘要：

Python

from google import genai
from google.genai import types

client = genai.Client()

prompt = """
Alice, Bob, and Carol each live in a different house on the same street: red, green, and blue.
The person who lives in the red house owns a cat.
Bob does not live in the green house.
Carol owns a dog.
The green house is to the left of the red house.
Alice does not own a cat.
Who lives in each house, and what pet do they own?
"""

thoughts = ""
answer = ""

for chunk in client.models.generate_content_stream(
    model="gemini-3-flash-preview",
    contents=prompt,
    config=types.GenerateContentConfig(
      thinking_config=types.ThinkingConfig(
        include_thoughts=True
      )
    )
):
  for part in chunk.candidates[0].content.parts:
    if not part.text:
      continue
    elif part.thought:
      if not thoughts:
        print("Thoughts summary:")
      print(part.text)
      thoughts += part.text
    else:
      if not answer:
        print("Answer:")
      print(part.text)
      answer += part.text

JavaScript

import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({});

const prompt = `Alice, Bob, and Carol each live in a different house on the same
street: red, green, and blue. The person who lives in the red house owns a cat.
Bob does not live in the green house. Carol owns a dog. The green house is to
the left of the red house. Alice does not own a cat. Who lives in each house,
and what pet do they own?`;

let thoughts = "";
let answer = "";

async function main() {
  const response = await ai.models.generateContentStream({
    model: "gemini-3-flash-preview",
    contents: prompt,
    config: {
      thinkingConfig: {
        includeThoughts: true,
      },
    },
  });

  for await (const chunk of response) {
    for (const part of chunk.candidates[0].content.parts) {
      if (!part.text) {
        continue;
      } else if (part.thought) {
        if (!thoughts) {
          console.log("Thoughts summary:");
        }
        console.log(part.text);
        thoughts = thoughts + part.text;
      } else {
        if (!answer) {
          console.log("Answer:");
        }
        console.log(part.text);
        answer = answer + part.text;
      }
    }
  }
}

await main();

Go

package main

import (
  "context"
  "fmt"
  "log"
  "os"
  "google.golang.org/genai"
)

const prompt = `
Alice, Bob, and Carol each live in a different house on the same street: red, green, and blue.
The person who lives in the red house owns a cat.
Bob does not live in the green house.
Carol owns a dog.
The green house is to the left of the red house.
Alice does not own a cat.
Who lives in each house, and what pet do they own?
`

func main() {
  ctx := context.Background()
  client, err := genai.NewClient(ctx, nil)
  if err != nil {
      log.Fatal(err)
  }

  contents := genai.Text(prompt)
  model := "gemini-3-flash-preview"

  resp := client.Models.GenerateContentStream(ctx, model, contents, &genai.GenerateContentConfig{
    ThinkingConfig: &genai.ThinkingConfig{
      IncludeThoughts: true,
    },
  })

  for chunk := range resp {
    for _, part := range chunk.Candidates[0].Content.Parts {
      if len(part.Text) == 0 {
        continue
      }

      if part.Thought {
        fmt.Printf("Thought: %s\n", part.Text)
      } else {
        fmt.Printf("Answer: %s\n", part.Text)
      }
    }
  }
}

控制思考

Gemini 模型預設會進行動態思考，根據使用者要求的複雜程度自動調整推理量。不過，如果您有特定的延遲限制，或需要模型進行比平常更深入的推理，可以視需要使用參數來控制思考行為。

思考程度 (Gemini 3)

建議搭配 Gemini 3 模型和後續版本使用 thinkingLevel 參數，藉此控制推論行為。

下表詳細列出各模型類型的 thinkingLevel 設定：

思考程度	Gemini 3 Pro	Gemini 3 Flash	說明
`minimal`	不支援	支援	與大多數查詢的「不思考」設定相符。模型可能會以極簡思維處理複雜的程式碼工作。盡量減少聊天或高處理量應用程式的延遲。請注意，`minimal` 無法保證系統會停止思考。
`low`	支援	支援	盡量縮短延遲時間並降低成本。最適合簡單的指令遵循、即時通訊或高總處理量應用程式。
`medium`	不支援	支援	適合用於大多數工作。
`high`	支援 (預設、動態)	支援 (預設、動態)	盡可能深入推理。模型可能需要較長時間才能輸出第一個 (非思考) 輸出權杖，但輸出內容會經過更仔細的推理。

以下範例說明如何設定思考層級。

Python

from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Provide a list of 3 famous physicists and their key contributions",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_level="low")
    ),
)

print(response.text)

JavaScript

import { GoogleGenAI, ThinkingLevel } from "@google/genai";

const ai = new GoogleGenAI({});

async function main() {
  const response = await ai.models.generateContent({
    model: "gemini-3-flash-preview",
    contents: "Provide a list of 3 famous physicists and their key contributions",
    config: {
      thinkingConfig: {
        thinkingLevel: ThinkingLevel.LOW,
      },
    },
  });

  console.log(response.text);
}

main();

Go

package main

import (
  "context"
  "fmt"
  "google.golang.org/genai"
  "os"
)

func main() {
  ctx := context.Background()
  client, err := genai.NewClient(ctx, nil)
  if err != nil {
      log.Fatal(err)
  }

  thinkingLevelVal := "low"

  contents := genai.Text("Provide a list of 3 famous physicists and their key contributions")
  model := "gemini-3-flash-preview"
  resp, _ := client.Models.GenerateContent(ctx, model, contents, &genai.GenerateContentConfig{
    ThinkingConfig: &genai.ThinkingConfig{
      ThinkingLevel: &thinkingLevelVal,
    },
  })

fmt.Println(resp.Text())
}

REST

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-flash-preview:generateContent" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
  "contents": [
    {
      "parts": [
        {
          "text": "Provide a list of 3 famous physicists and their key contributions"
        }
      ]
    }
  ],
  "generationConfig": {
    "thinkingConfig": {
          "thinkingLevel": "low"
    }
  }
}'

你無法停用 Gemini 3 Pro 的思考功能。Gemini 3 Flash 也不支援完全關閉思考功能，但 minimal 設定表示模型可能不會思考 (但仍有可能)。如未指定思考程度，Gemini 會使用 Gemini 3 模型預設的動態思考程度 "high"。

Gemini 2.5 系列模型不支援 thinkingLevel，請改用 thinkingBudget。

思考預算

Gemini 2.5 系列推出的 thinkingBudget 參數，可引導模型使用特定數量的思考詞元進行推論。

以下是各模型類型的thinkingBudget設定詳細資料。如要停用思考功能，請將 thinkingBudget 設為 0。將 thinkingBudget 設為 -1 可啟用動態思考，也就是模型會根據要求的複雜度調整預算。

型號	預設設定 (未設定思考預算)	範圍	停用思考	開啟動態思考
2.5 Pro	動態思考	`128` 至 `32768`	不適用：無法停用思考功能	`thinkingBudget = -1` (預設)
2.5 Flash	動態思考	`0` 至 `24576`	`thinkingBudget = 0`	`thinkingBudget = -1` (預設)
2.5 Flash 預先發布版	動態思考	`0` 至 `24576`	`thinkingBudget = 0`	`thinkingBudget = -1` (預設)
2.5 Flash Lite	模型不會思考	`512` 至 `24576`	`thinkingBudget = 0`	`thinkingBudget = -1`
2.5 Flash Lite 預先發布版	模型不會思考	`512` 至 `24576`	`thinkingBudget = 0`	`thinkingBudget = -1`
Robotics-ER 1.5 預先發布版	動態思考	`0` 至 `24576`	`thinkingBudget = 0`	`thinkingBudget = -1` (預設)
2.5 Flash Live Native Audio Preview (09-2025)	動態思考	`0` 至 `24576`	`thinkingBudget = 0`	`thinkingBudget = -1` (預設)

Python

from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Provide a list of 3 famous physicists and their key contributions",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=1024)
        # Turn off thinking:
        # thinking_config=types.ThinkingConfig(thinking_budget=0)
        # Turn on dynamic thinking:
        # thinking_config=types.ThinkingConfig(thinking_budget=-1)
    ),
)

print(response.text)

JavaScript

import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({});

async function main() {
  const response = await ai.models.generateContent({
    model: "gemini-3-flash-preview",
    contents: "Provide a list of 3 famous physicists and their key contributions",
    config: {
      thinkingConfig: {
        thinkingBudget: 1024,
        // Turn off thinking:
        // thinkingBudget: 0
        // Turn on dynamic thinking:
        // thinkingBudget: -1
      },
    },
  });

  console.log(response.text);
}

main();

Go

package main

import (
  "context"
  "fmt"
  "google.golang.org/genai"
  "os"
)

func main() {
  ctx := context.Background()
  client, err := genai.NewClient(ctx, nil)
  if err != nil {
      log.Fatal(err)
  }

  thinkingBudgetVal := int32(1024)

  contents := genai.Text("Provide a list of 3 famous physicists and their key contributions")
  model := "gemini-3-flash-preview"
  resp, _ := client.Models.GenerateContent(ctx, model, contents, &genai.GenerateContentConfig{
    ThinkingConfig: &genai.ThinkingConfig{
      ThinkingBudget: &thinkingBudgetVal,
      // Turn off thinking:
      // ThinkingBudget: int32(0),
      // Turn on dynamic thinking:
      // ThinkingBudget: int32(-1),
    },
  })

fmt.Println(resp.Text())
}

REST

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-flash-preview:generateContent" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
  "contents": [
    {
      "parts": [
        {
          "text": "Provide a list of 3 famous physicists and their key contributions"
        }
      ]
    }
  ],
  "generationConfig": {
    "thinkingConfig": {
          "thinkingBudget": 1024
    }
  }
}'

視提示而定，模型可能會超出或未用完權杖預算。

思想簽名

Gemini API 是無狀態的，因此模型會獨立處理每個 API 要求，且無法存取多輪互動中先前回合的思考脈絡。

如要讓 Gemini 在多輪互動中維持思考脈絡，Gemini 會傳回思考簽章，這是模型內部思考過程的加密表示法。

啟用思考功能，且要求包含函式呼叫 (具體來說是函式宣告) 時，Gemini 2.5 模型會傳回思考簽章。
Gemini 3 模型可能會傳回所有類型部分的思考簽章。建議您一律將所有簽章傳回，但函式呼叫簽章必須傳回。詳情請參閱「想法簽章」頁面。

注意： 即使 Gemini Flash 3 設為 minimal，仍須流通思想簽章。

Google GenAI SDK 會自動處理思維簽章的回傳作業。只有在修改對話記錄或使用 REST API 時，才需要手動管理想法簽章。

使用函式呼叫時，還需注意以下限制：

簽章會與其他部分一起傳回，例如函式呼叫或文字部分。將整個回覆的所有部分，在後續回合中傳回模型。
請勿將簽章與其他部分串連在一起。
請勿將有簽名的部分與沒有簽名的部分合併。

定價

開啟思考功能後，回覆價格會是輸出詞元和思考詞元的總和。您可以從 thoughtsTokenCount 欄位取得產生的思考權杖總數。

Python

# ...
print("Thoughts tokens:",response.usage_metadata.thoughts_token_count)
print("Output tokens:",response.usage_metadata.candidates_token_count)

JavaScript

// ...
console.log(`Thoughts tokens: ${response.usageMetadata.thoughtsTokenCount}`);
console.log(`Output tokens: ${response.usageMetadata.candidatesTokenCount}`);

Go

// ...
usageMetadata, err := json.MarshalIndent(response.UsageMetadata, "", "  ")
if err != nil {
  log.Fatal(err)
}
fmt.Println("Thoughts tokens:", string(usageMetadata.thoughts_token_count))
fmt.Println("Output tokens:", string(usageMetadata.candidates_token_count))

思考模型會生成完整想法，提升最終回覆的品質，然後輸出摘要，深入瞭解思考過程。因此，即使 API 只會輸出摘要，但計費依據仍是模型生成摘要時所需的所有思考權杖。

如要進一步瞭解權杖，請參閱權杖計數指南。

最佳做法

本節提供一些指引，說明如何有效運用思考模型。一如往常，請遵循提示指南和最佳做法，獲得最佳結果。

偵錯和引導

查看推理過程：如果思考模型未提供預期回覆，請仔細分析 Gemini 的思考摘要。你可以查看系統如何分解工作並得出結論，然後根據這些資訊修正結果。
在推理過程中提供指引：如果希望輸出內容特別長，建議在提示中提供指引，限制模型思考的量。這樣一來，就能為回覆保留更多權杖輸出。

工作複雜度

簡單工作 (可關閉思考)：對於不需要複雜推理的簡單要求 (例如擷取事實或分類)，不需要思考。例如：
- 「DeepMind 是在哪裡成立的？」
- 「這封電子郵件是要安排會議，還是只是提供資訊？」
中等工作 (預設/需要思考)：許多常見要求需要逐步處理或深入瞭解。Gemini 可彈性運用思考能力處理下列工作：
- 以光合作用和成長過程做類比。
- 比較電動車和油電混合車的異同。
困難任務 (最高思維能力)：如要解決複雜的數學問題或編碼工作等高難度挑戰，建議設定較高的思維預算。這類工作需要模型充分運用推理和規劃能力，通常在提供答案前會經過許多內部步驟。例如：
- 解決 2025 年 AIME 的問題 1：找出所有大於 9 的整數底數 b，使 17_b 是 97_b 的除數，並計算這些底數的總和。
- 編寫 Python 程式碼，建立可顯示即時股市資料的網路應用程式，包括使用者驗證。盡可能提高效率。

支援的模型、工具和功能

所有 3 和 2.5 系列模型都支援思考功能。您可以在模型總覽頁面中查看所有模型功能。

思考模型可搭配所有 Gemini 工具和功能使用。這可讓模型與外部系統互動、執行程式碼或存取即時資訊，並將結果納入推論和最終回覆。

如需使用工具搭配思考型模型的範例，請參閱思考型模型食譜。

後續步驟

如需涵蓋範圍，請參閱 OpenAI 相容性指南。