The Interactions API is now generally available. We recommend using this API to access all of the latest features and models.

Thinking

Gemini 3 and 2.5 models use a "thinking process" that significantly improves their reasoning and multi-step planning, which makes them highly effective for complex tasks such as coding, advanced mathematics, and data analysis.

When you use a thinking model, Gemini reasons internally before it responds. The Interactions API exposes this reasoning through `thought` steps: dedicated steps that appear in chronological order alongside function calls, user input, and model output in the `steps` array.

Every thought step has two fields:

Field	Required	Description
`signature`	✅ Yes	An encoded representation of the model's internal reasoning state. Always present, even when the model uses minimal reasoning.
`summary`	❌ No	An array of content (text and/or images) that summarizes the reasoning. May be empty depending on the `thinking_summaries` configuration, on whether the model reasoned enough, or on the content type (for example, image content may not have a text summary).

Note: The Interactions API handles thoughts and signatures differently from the generateContent API.
- In the generateContent API, there is no dedicated thought block. As a result, signatures are metadata attached to any part, for example inside a `functionCall` or on the final part of the response.
- In the Interactions API, thoughts are first-class output in the form of dedicated `thought` steps. As a result, signatures are confined to two known locations: `thought` steps and built-in tool steps (for example, `google_search_call` / `google_search_result`). They never appear in user input, model output, or standard function calls.

Interacting with thinking models

Starting an interaction with a thinking model is similar to any other interaction request. Specify a model that supports thinking in the `model` field.

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input="Explain the concept of Occam's Razor and provide a simple, everyday example."
)
print(interaction.output_text)

JavaScript

import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: "gemini-3.5-flash",
    input: "Explain the concept of Occam's Razor and provide a simple, everyday example."
});
console.log(interaction.output_text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gemini-3.5-flash",
    "input": "Explain the concept of Occam'\''s Razor and provide a simple example."
  }'

Thought summaries

Thought summaries provide insight into the model's internal reasoning process. By default, only the final output is returned. You can enable thought summaries with `thinking_summaries`, as follows:

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input="What is the sum of the first 50 prime numbers?",
    generation_config={
        "thinking_summaries": "auto"
    }
)

for step in interaction.steps:
    if step.type == "thought":
        print("Thought summary:")
        if step.summary:
            for content_block in step.summary:
                if content_block.type == "text":
                    print(content_block.text)
        print()
    elif step.type == "model_output":
        for content_block in step.content:
            if content_block.type == "text":
                print("Answer:")
                print(content_block.text)
                print()

JavaScript

import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: "gemini-3.5-flash",
    input: "What is the sum of the first 50 prime numbers?",
    generation_config: {
        thinking_summaries: "auto"
    }
});

for (const step of interaction.steps) {
    if (step.type === "thought") {
        console.log("Thought summary:");
        if (step.summary) {
            for (const contentBlock of step.summary) {
                if (contentBlock.type === "text") console.log(contentBlock.text);
            }
        }
    } else if (step.type === "model_output") {
        for (const contentBlock of step.content) {
            if (contentBlock.type === "text") {
                console.log("Answer:");
                console.log(contentBlock.text);
            }
        }
    }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gemini-3.5-flash",
    "input": "What is the sum of the first 50 prime numbers?",
    "generation_config": {
      "thinking_summaries": "auto"
    }
  }'

A thought block may contain only a signature, with no summary, in the following cases:

- Simple requests, where the model did not reason enough to generate a summary.
- `thinking_summaries: "none"`, which explicitly disables summaries.
- Some content types, such as images, which may not have a text summary.

Your code should always handle thought blocks whose `summary` is empty or missing.
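The defensive handling described here can be sketched as a small helper. This is a minimal illustration, not part of the SDK: `summary_text` is a hypothetical name, and the steps are modeled as plain stand-in objects with the `type`, `signature`, and `summary` attributes used in the samples above.

```python
from types import SimpleNamespace


def summary_text(step):
    """Return the joined text of a thought step's summary, or "" if there is none.

    Hypothetical helper: it tolerates a missing `summary` attribute, a None or
    empty summary, and non-text blocks (such as images).
    """
    blocks = getattr(step, "summary", None) or []
    texts = [
        block.text
        for block in blocks
        if getattr(block, "type", None) == "text" and getattr(block, "text", None)
    ]
    return "\n".join(texts)


# Stand-in objects shaped like thought steps; a real response comes from the API.
with_summary = SimpleNamespace(
    type="thought",
    signature="abc",
    summary=[SimpleNamespace(type="text", text="Checking the clues.")],
)
signature_only = SimpleNamespace(type="thought", signature="abc", summary=None)
image_only = SimpleNamespace(
    type="thought", signature="abc", summary=[SimpleNamespace(type="image")]
)

for step in (with_summary, signature_only, image_only):
    print(repr(summary_text(step)))
```

Only the first stand-in yields text; the other two produce an empty string instead of raising, so calling code can treat "no summary" as a normal case.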

Streaming with thinking

Use streaming to receive thought summaries incrementally during generation. Thought blocks are delivered using Server-Sent Events (SSE) with two distinct delta types:

Delta type	Contains	When sent
`thought_summary`	Text or image summary content	One or more deltas that carry incremental summary content
`thought_signature`	The encoded signature	The final delta before `step.stop`

Python

from google import genai

client = genai.Client()

prompt = """
Alice, Bob, and Carol each live in a different house on the same street: red, green, and blue.
Alice does not live in the red house.
Bob does not live in the green house.
Carol does not live in the red or green house.
Which house does each person live in?
"""

thoughts = ""
answer = ""

stream = client.interactions.create(
    model="gemini-3.5-flash",
    input=prompt,
    generation_config={
        "thinking_summaries": "auto"
    },
    stream=True
)

for event in stream:
    if event.event_type == "step.delta":
        if event.delta.type == "thought_summary":
            if not thoughts:
                print("Thinking...")
            # Summary content may be an image; fall back to "" when there is no text.
            summary_text = getattr(event.delta.content, "text", None) or ""
            print(f"[Thought] {summary_text}", end="")
            thoughts += summary_text
        elif event.delta.type == "text" and event.delta.text:
            if not answer:
                print("\nAnswer:")
            print(event.delta.text, end="")
            answer += event.delta.text

JavaScript

import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});

const prompt = `Alice, Bob, and Carol each live in a different house on the same
street: red, green, and blue. Alice does not live in the red house.
Bob does not live in the green house.
Carol does not live in the red or green house.
Which house does each person live in?`;

let thoughts = "";
let answer = "";

const stream = await client.interactions.create({
    model: "gemini-3.5-flash",
    input: prompt,
    generation_config: {
        thinking_summaries: "auto"
    },
    stream: true
});

for await (const event of stream) {
    if (event.event_type === "step.delta") {
        if (event.delta.type === "thought_summary") {
            if (!thoughts) console.log("Thinking...");
            const text = event.delta.content?.text || "";
            process.stdout.write(`[Thought] ${text}`);
            thoughts += text;
        } else if (event.delta.type === "text" && event.delta.text) {
            if (!answer) console.log("\nAnswer:");
            process.stdout.write(event.delta.text);
            answer += event.delta.text;
        }
    }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  --no-buffer \
  -d '{
    "model": "gemini-3.5-flash",
    "input": "Alice, Bob, and Carol each live in a different house on the same street: red, green, and blue. Alice does not live in the red house. Bob does not live in the green house. Carol does not live in the red or green house. Which house does each person live in?",
    "generation_config": {
      "thinking_summaries": "auto"
    },
    "stream": true
  }'

The streamed response uses Server-Sent Events (SSE) and consists of steps and events, for example:

event: interaction.created
data: {"interaction":{"id":"v1_xxx","status":"in_progress","object":"interaction","model":"gemini-3.5-flash"},"event_type":"interaction.created"}

event: step.start
data: {"index":0,"step":{"signature":"","summary":[{"text":"**Evaluating the clues**\n\nI'm considering...","type":"text"}],"type":"thought"},"event_type":"step.start"}

event: step.delta
data: {"index":0,"delta":{"signature":"EpoGCpcGAXLI2nx/...","type":"thought_signature"},"event_type":"step.delta"}

event: step.stop
data: {"index":0,"event_type":"step.stop"}

event: step.start
data: {"index":1,"step":{"content":[{"text":"Based on the clues provided, here","type":"text"}],"type":"model_output"},"event_type":"step.start"}

event: step.delta
data: {"index":1,"delta":{"text":" is the answer to your question...","type":"text"},"event_type":"step.delta"}

event: step.stop
data: {"index":1,"event_type":"step.stop"}

event: interaction.completed
data: {"interaction":{"id":"v1_xxx","status":"completed","usage":{"total_tokens":530,"total_input_tokens":62,"total_output_tokens":171,"total_thought_tokens":297}},"event_type":"interaction.completed"}

event: done
data: [DONE]
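To make the event sequence above concrete, the following sketch assembles the thought summary, signature, and answer from raw SSE text without the SDK. It is an illustration under stated assumptions: `collect_stream` is a hypothetical name, each event is assumed to carry a single `data:` line, and the `thought_summary` delta is assumed to have the shape `{"content": {"type": "text", "text": ...}}`, inferred from the Python sample above. Note that `step.start` may already carry the first chunk of content, as it does in the sample stream.

```python
import json


def collect_stream(raw_sse):
    """Assemble thoughts, signature, answer, and usage from raw SSE text."""
    result = {"thoughts": "", "signature": "", "answer": "", "usage": None}
    for line in raw_sse.splitlines():
        if not line.startswith("data:"):
            continue  # skip "event:" lines and blank separators
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        event = json.loads(payload)
        kind = event.get("event_type")
        if kind == "step.start":
            step = event["step"]
            if step.get("type") == "thought":
                blocks, key = step.get("summary"), "thoughts"
            elif step.get("type") == "model_output":
                blocks, key = step.get("content"), "answer"
            else:
                blocks, key = [], None  # other step types are ignored here
            for block in blocks or []:
                if block.get("type") == "text":
                    result[key] += block.get("text", "")
        elif kind == "step.delta":
            delta = event["delta"]
            if delta.get("type") == "thought_summary":
                # Assumed delta shape; image content has no "text" key.
                result["thoughts"] += (delta.get("content") or {}).get("text", "")
            elif delta.get("type") == "thought_signature":
                result["signature"] = delta.get("signature", "")
            elif delta.get("type") == "text":
                result["answer"] += delta.get("text", "")
        elif kind == "interaction.completed":
            result["usage"] = event["interaction"].get("usage")
    return result


# A shortened, hand-written stream modeled on the sample above.
SAMPLE = "\n".join([
    'event: step.start',
    'data: {"index":0,"step":{"signature":"","summary":[{"text":"Evaluating the clues. ","type":"text"}],"type":"thought"},"event_type":"step.start"}',
    '',
    'event: step.delta',
    'data: {"index":0,"delta":{"content":{"text":"Carol must be blue.","type":"text"},"type":"thought_summary"},"event_type":"step.delta"}',
    '',
    'event: step.delta',
    'data: {"index":0,"delta":{"signature":"EpoG...","type":"thought_signature"},"event_type":"step.delta"}',
    '',
    'event: step.start',
    'data: {"index":1,"step":{"content":[{"text":"Based on the clues, ","type":"text"}],"type":"model_output"},"event_type":"step.start"}',
    '',
    'event: step.delta',
    'data: {"index":1,"delta":{"text":"Carol lives in the blue house.","type":"text"},"event_type":"step.delta"}',
    '',
    'event: interaction.completed',
    'data: {"interaction":{"id":"v1_xxx","status":"completed","usage":{"total_thought_tokens":297}},"event_type":"interaction.completed"}',
    '',
    'event: done',
    'data: [DONE]',
])

result = collect_stream(SAMPLE)
print(result["thoughts"])
print(result["answer"])
```

A production client would normally rely on the SDK's stream iterator or a full SSE parser, since the SSE format also allows multi-line `data:` fields that this sketch does not handle.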

Controlling thinking

Gemini models use dynamic thinking by default, automatically adjusting the reasoning effort to the complexity of the request. You can control this behavior with the `thinking_level` parameter.

Model	Default thinking	Supported levels
gemini-3.1-pro-preview	On (high)	low, medium, high
gemini-3.1-flash-lite-image	On (minimal)	low, high
gemini-3-flash-preview	On (high)	minimal, low, medium, high
gemini-3-pro-preview	On (high)	low, high
gemini-3.5-flash	On (medium)	minimal, low, medium, high
gemini-2.5-pro	On	low, medium, high
gemini-2.5-flash	On	low, medium, high
gemini-2.5-flash-lite	Off	low, medium, high

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3.5-flash",
    input="Provide a list of 3 famous physicists and their key contributions",
    generation_config={
        "thinking_level": "low"
    }
)
print(interaction.output_text)

JavaScript

import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: "gemini-3.5-flash",
    input: "Provide a list of 3 famous physicists and their key contributions",
    generation_config: {
        thinking_level: "low"
    }
});
console.log(interaction.output_text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gemini-3.5-flash",
    "input": "Provide a list of 3 famous physicists and their key contributions",
    "generation_config": {
      "thinking_level": "low"
    }
  }'
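Because the supported levels differ by model, a client-side check can catch an unsupported `thinking_level` before the request is sent. The sketch below is one possible approach, not an SDK feature: `SUPPORTED_LEVELS` and `thinking_config` are hypothetical names, the entries are copied from a few rows of the table above, and the string values other than `"low"` (which appears in the samples) are assumed to match the level names in that table.

```python
# Supported levels copied from the table above; the literal strings other than
# "low" are assumptions. Extend or correct this mapping as models change.
SUPPORTED_LEVELS = {
    "gemini-3.1-pro-preview": ("low", "medium", "high"),
    "gemini-3-flash-preview": ("minimal", "low", "medium", "high"),
    "gemini-3-pro-preview": ("low", "high"),
    "gemini-3.5-flash": ("minimal", "low", "medium", "high"),
}


def thinking_config(model, level):
    """Build a generation_config dict, rejecting levels the mapping does not list."""
    allowed = SUPPORTED_LEVELS.get(model)
    if allowed is None:
        raise KeyError(f"No entry for model {model!r}")
    if level not in allowed:
        raise ValueError(f"{model} supports {', '.join(allowed)}; got {level!r}")
    return {"thinking_level": level}


print(thinking_config("gemini-3.5-flash", "low"))  # {'thinking_level': 'low'}
```

The returned dictionary can be passed as `generation_config` in the calls shown above. Requesting `"medium"` for `gemini-3-pro-preview` raises a `ValueError` locally instead of failing at the API.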

Thought signatures

Thought signatures are encoded representations of the model's internal reasoning. They are required to maintain reasoning continuity across multi-turn conversations.

The Interactions API makes thought signatures much easier to manage than the generateContent API does.

Stateful mode (recommended)

By default, when you use the Interactions API in stateful mode (by setting `store: true` and passing `previous_interaction_id` on subsequent turns), the server manages the conversation state automatically, including all thought blocks and signatures. In this mode you don't need to do anything about signatures; they are handled entirely on the server side.
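A two-turn stateful exchange might look like the sketch below. It is written as a function that takes the client as an argument so it stays self-contained. The helper name `two_turn_conversation` is hypothetical, and passing `store` and `previous_interaction_id` as keyword arguments to `client.interactions.create` is an assumption based on the field names in the paragraph above; the `id` attribute is assumed from the `"id"` field in the streaming sample.

```python
def two_turn_conversation(client, model, first_prompt, follow_up):
    """Run two stateful turns; the server keeps the thoughts and signatures."""
    first = client.interactions.create(
        model=model,
        input=first_prompt,
        store=True,  # assumed keyword for the `store: true` request field
    )
    second = client.interactions.create(
        model=model,
        input=follow_up,
        previous_interaction_id=first.id,  # links this turn to the stored one
        store=True,
    )
    return first, second
```

With a real client, this could be called as `two_turn_conversation(genai.Client(), "gemini-3.5-flash", "...", "...")`. Note that no thought step or signature is handled anywhere in the function, which is the point of stateful mode.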

Stateless mode

If you manage the conversation state yourself (stateless mode) and send the full input and output history with each request:

- You must resend all `thought` blocks on every turn, exactly as you received them from the model.
- Do not remove thought blocks from the history or modify them, because they contain the signatures that the model needs to continue its reasoning.
- When you switch models within a session, still resend the previous model's thought blocks. The backend handles compatibility.
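One way to guard against accidentally dropping or editing thought blocks is a check that runs before each stateless request. The sketch below assumes the history is kept as a list of step dictionaries; `missing_thoughts` is a hypothetical helper, and the `user_input` dictionary is only a placeholder, since the exact request shape for stateless history is not shown here.

```python
def missing_thoughts(received_steps, outgoing_history):
    """Return thought steps that are absent from, or altered in, the outgoing history."""
    return [
        step
        for step in received_steps
        if step.get("type") == "thought" and step not in outgoing_history
    ]


# Steps as received from the model (shapes follow the streaming sample).
received = [
    {"type": "thought", "signature": "EpoG...", "summary": []},
    {
        "type": "model_output",
        "content": [{"type": "text", "text": "Carol lives in the blue house."}],
    },
]

# Hypothetical placeholder for however your application represents a user turn.
next_user_turn = {
    "type": "user_input",
    "content": [{"type": "text", "text": "And Bob?"}],
}

good_history = [*received, next_user_turn]
bad_history = [received[1], next_user_turn]  # thought step stripped

print(len(missing_thoughts(received, good_history)))  # 0
print(len(missing_thoughts(received, bad_history)))   # 1
```

Because the comparison is by equality, a thought step whose signature or summary was edited is reported as missing too.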

Pricing

When thinking is enabled, response pricing is based on the sum of output tokens and thinking tokens. You can read the total number of generated thinking tokens from the `total_thought_tokens` field.

Python

print("Thoughts tokens:", interaction.usage.total_thought_tokens)
print("Output tokens:", interaction.usage.total_output_tokens)

JavaScript

console.log(`Thoughts tokens: ${interaction.usage.total_thought_tokens}`);
console.log(`Output tokens: ${interaction.usage.total_output_tokens}`);

Thinking models generate full thoughts to improve the quality of the final response, and then output summaries to provide insight into the thought process. Pricing is based on the full thought tokens that the model needs to generate, even though the API returns only the summaries.
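As a worked example of that sum, the sketch below uses the usage figures from the streaming sample earlier on this page. The helper name `billed_response_tokens` is hypothetical, and it only counts tokens; per-token prices are not covered here.

```python
def billed_response_tokens(usage):
    """Tokens counted toward response pricing: output tokens plus thought tokens."""
    return usage["total_output_tokens"] + usage["total_thought_tokens"]


# Usage block taken from the `interaction.completed` event in the streaming sample.
usage = {
    "total_tokens": 530,
    "total_input_tokens": 62,
    "total_output_tokens": 171,
    "total_thought_tokens": 297,
}

response_tokens = billed_response_tokens(usage)
print(response_tokens)  # 468
print(usage["total_input_tokens"] + response_tokens)  # 530
```

In this sample, thought tokens (297) account for more of the response total than the visible output (171), and the input, output, and thought counts add up to the reported `total_tokens`.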

To learn more about tokens, see the Token counting guide.

Best practices

Use thinking models effectively by following these guidelines:

- Review the reasoning: Analyze thought summaries to understand failures and improve your prompts.
- Control the thinking budget: Instruct the model to think less when outputs are long, to save tokens.
- Simple tasks: Use minimal or low thinking for fact retrieval or classification (for example, "Where was DeepMind founded?").
- Medium-complexity tasks: Use the default thinking level for comparing concepts or for creative reasoning (for example, comparing electric cars and hybrid cars).
- Complex tasks: Use the highest thinking level for advanced coding, math, or multi-step planning (for example, solving AIME math problems).

Next steps

- Text generation: Basic text responses
- Function calling: Connect to tools
- Gemini 3 guide: Model-specific features