Gemini Deep Research พร้อมให้บริการในเวอร์ชันพรีวิวแล้วตอนนี้ โดยมีฟีเจอร์การวางแผนร่วมกัน การแสดงภาพข้อมูล การรองรับ MCP และอื่นๆ

Google uses AI technology to translate content into your preferred language. AI translations can contain errors.

คู่มือนักพัฒนาซอฟต์แวร์ Gemini 3

หมายเหตุ: หน้านี้เวอร์ชันนี้ครอบคลุม Interactions API ใหม่ ซึ่งปัจจุบันอยู่ในเวอร์ชันเบต้า
สำหรับการติดตั้งใช้งานจริงที่เสถียร เราขอแนะนำให้คุณใช้ generateContent API ต่อไป คุณใช้ปุ่มเปิด/ปิดในหน้านี้เพื่อสลับระหว่างเวอร์ชันได้

Gemini 3 เป็นตระกูลโมเดลที่ชาญฉลาดที่สุดของเราในปัจจุบัน ซึ่งสร้างขึ้นบนพื้นฐานของ การให้เหตุผลที่ล้ำสมัย โดยออกแบบมาเพื่อทำให้ไอเดียใดๆ เป็นจริงได้ด้วยการ เชี่ยวชาญเวิร์กโฟลว์แบบ Agent การเขียนโค้ดแบบอัตโนมัติ และงานแบบ Multimodal ที่ซับซ้อน คู่มือนี้ครอบคลุมฟีเจอร์หลักของกลุ่มผลิตภัณฑ์โมเดล Gemini 3 และวิธีใช้ประโยชน์จากโมเดลนี้ให้ได้มากที่สุด

สำรวจคอลเล็กชันแอป Gemini 3 เพื่อดูว่าโมเดลจัดการการให้เหตุผลขั้นสูง การเขียนโค้ดอัตโนมัติ และงานมัลติโมดัลที่ซับซ้อนได้อย่างไร

เริ่มต้นใช้งานด้วยโค้ดเพียงไม่กี่บรรทัด

Python

# This will only work for SDK newer than 2.0.0
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3.1-pro-preview",
    input="Find the race condition in this multi-threaded C++ snippet: [code here]",
)

print(interaction.steps[-1].content[0].text)

JavaScript

// This will only work for SDK newer than 2.0.0
import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});

async function run() {
  const interaction = await client.interactions.create({
    model: "gemini-3.1-pro-preview",
    input: "Find the race condition in this multi-threaded C++ snippet: [code here]",
  });

  console.log(interaction.steps.at(-1).content[0].text);
}

run();

REST

# Specifies the API revision to avoid breaking changes when they become default
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -H "Api-Revision: 2026-05-20" \
  -d '{
    "model": "gemini-3.1-pro-preview",
    "input": "Find the race condition in this multi-threaded C++ snippet: [code here]"
  }'

พบกับซีรีส์ Gemini 3

Gemini 3.1 Pro เหมาะที่สุดสำหรับงานที่ซับซ้อนซึ่ง ต้องใช้ความรู้เกี่ยวกับโลกในวงกว้างและการให้เหตุผลขั้นสูงในรูปแบบต่างๆ

Gemini 3 Flash เป็นโมเดลซีรีส์ 3 ล่าสุดของเราที่มีความสามารถอันชาญฉลาดระดับ Pro ใน ความเร็วและราคาของ Flash

Nano Banana Pro (หรือที่เรียกว่ารูปภาพ Gemini 3 Pro) คือโมเดลการสร้างรูปภาพคุณภาพสูงสุดของเรา และ Nano Banana 2 (หรือที่เรียกว่ารูปภาพ Gemini 3.1 Flash) คือโมเดลที่มีปริมาณมาก ประสิทธิภาพสูง และมีราคาต่ำกว่า

Gemini 3.1 Flash-Lite เป็นโมเดลที่ใช้งานได้จริงของเราซึ่งสร้างขึ้นเพื่อโมเดลที่มีประสิทธิภาพด้านต้นทุนและ งานที่มีปริมาณมาก

ปัจจุบันโมเดล Gemini 3 ทั้งหมดอยู่ในการแสดงตัวอย่าง

รหัสโมเดล	หน้าต่างบริบท (เข้า / ออก)	การตัดข้อมูล	การกำหนดราคา (อินพุต / เอาต์พุต)*
gemini-3.1-flash-lite-preview	1M / 64k	มกราคม 2025	$0.25 (ข้อความ, รูปภาพ, วิดีโอ), $0.50 (เสียง) / $1.50
gemini-3.1-flash-image-preview	128k / 32k	มกราคม 2025	$0.25 (อินพุตข้อความ) / $0.067 (เอาต์พุตรูปภาพ)**
gemini-3.1-pro-preview	1M / 64k	มกราคม 2025	$2 / $12 (<200,000 โทเค็น) $4 / $18 (>200,000 โทเค็น)
gemini-3-flash-preview	1M / 64k	มกราคม 2025	$0.50 / $3
gemini-3-pro-image-preview	65,000 / 32,000	มกราคม 2025	$2 (ป้อนข้อความ) / $0.134 (เอาต์พุตรูปภาพ)**

* ราคาต่อโทเค็น 1 ล้านรายการ เว้นแต่จะระบุไว้เป็นอย่างอื่น ** ราคาของรูปภาพจะแตกต่างกันไปตามความละเอียด ดูรายละเอียดได้ที่หน้าการกำหนดราคา

ดูขีดจำกัด ราคา และข้อมูลเพิ่มเติมโดยละเอียดได้ที่หน้าโมเดล

ฟีเจอร์ใหม่ของ API ใน Gemini 3

Gemini 3 มีพารามิเตอร์ใหม่ที่ออกแบบมาเพื่อให้ผู้พัฒนาควบคุมเวลาในการตอบสนอง ต้นทุน และความเที่ยงตรงของโมเดลแบบมัลติโมดัลได้มากขึ้น

ระดับการคิด

โมเดลชุด Gemini 3 ใช้การคิดแบบไดนามิกโดยค่าเริ่มต้นเพื่อหาเหตุผลผ่าน พรอมต์ คุณสามารถใช้พารามิเตอร์ thinking_level ซึ่งควบคุมความลึกสูงสุดของกระบวนการให้เหตุผลภายในของโมเดลก่อนที่จะสร้างคำตอบ Gemini 3 จะถือว่าระดับเหล่านี้เป็นค่าเผื่อสัมพัทธ์สำหรับการคิด มากกว่าการรับประกันโทเค็นที่เข้มงวด

หากไม่ได้ระบุ thinking_level ไว้ Gemini 3 จะใช้ high เป็นค่าเริ่มต้น หากต้องการให้โมเดลตอบกลับเร็วขึ้นและมีเวลาในการตอบสนองที่ต่ำลงเมื่อไม่จำเป็นต้องใช้การให้เหตุผลที่ซับซ้อน คุณสามารถจำกัดระดับการคิดของโมเดลไว้ที่ low ได้

ระดับการคิด	Gemini 3.1 Pro	Gemini 3.1 Flash-Lite	Gemini 3 Flash	คำอธิบาย
`minimal`	สิ่งที่ทำไม่ได้	รองรับ (ค่าเริ่มต้น)	สิ่งที่ทำได้	ตรงกับการตั้งค่า "ไม่ต้องคิด" สำหรับคำค้นหาส่วนใหญ่ โมเดลอาจคิดน้อยมากสำหรับงานการเขียนโค้ดที่ซับซ้อน ลดเวลาในการตอบสนองสำหรับแอปพลิเคชันแชทหรือแอปพลิเคชันที่มีการส่งข้อความปริมาณมาก โปรดทราบว่า `minimal` ไม่รับประกันว่าการคิดจะหยุดทำงาน
`low`	สิ่งที่ทำได้	สิ่งที่ทำได้	สิ่งที่ทำได้	ลดเวลาในการตอบสนองและค่าใช้จ่าย เหมาะที่สุดสำหรับการทำตามคำสั่งง่ายๆ แชท หรือแอปพลิเคชันที่มีปริมาณงานสูง
`medium`	สิ่งที่ทำได้	สิ่งที่ทำได้	สิ่งที่ทำได้	การคิดที่สมดุลสำหรับงานส่วนใหญ่
`high`	รองรับ (ค่าเริ่มต้น, ไดนามิก)	รองรับ (ไดนามิก)	รองรับ (ค่าเริ่มต้น, ไดนามิก)	เพิ่มความลึกในการให้เหตุผลสูงสุด โมเดลอาจใช้เวลานานขึ้นอย่างมากในการ สร้างโทเค็นเอาต์พุตแรก (ไม่ใช่การคิด) แต่เอาต์พุตจะได้รับการพิจารณาอย่างรอบคอบมากขึ้น

Python

# This will only work for SDK newer than 2.0.0
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3.1-pro-preview",
    input="How does AI work?",
    generation_config={"thinking_level": "low"},
)

print(interaction.steps[-1].content[0].text)

JavaScript

// This will only work for SDK newer than 2.0.0
import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: "gemini-3.1-pro-preview",
    input: "How does AI work?",
    generation_config: {
      thinking_level: "low",
    },
  });

console.log(interaction.steps.at(-1).content[0].text);

REST

# Specifies the API revision to avoid breaking changes when they become default
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -H "Api-Revision: 2026-05-20" \
  -d '{
    "model": "gemini-3.1-pro-preview",
    "input": "How does AI work?",
    "generation_config": {
      "thinking_level": "low"
    }
  }'

อุณหภูมิ

สำหรับโมเดล Gemini 3 ทั้งหมด เราขอแนะนำอย่างยิ่งให้คงพารามิเตอร์อุณหภูมิ ไว้ที่ค่าเริ่มต้น 1.0

แม้ว่าโมเดลก่อนหน้ามักจะได้รับประโยชน์จากการปรับอุณหภูมิเพื่อควบคุมความคิดสร้างสรรค์เทียบกับความแน่นอน แต่ความสามารถในการให้เหตุผลของ Gemini 3 ได้รับการเพิ่มประสิทธิภาพสำหรับการตั้งค่าเริ่มต้น การเปลี่ยนอุณหภูมิ (ตั้งค่าต่ำกว่า 1.0) อาจ ทำให้เกิดลักษณะการทำงานที่ไม่คาดคิด เช่น การวนซ้ำหรือประสิทธิภาพลดลง โดยเฉพาะในงานทางคณิตศาสตร์หรือการให้เหตุผลที่ซับซ้อน

ลายเซ็นความคิด

โมเดล Gemini 3 ใช้ลายเซ็นความคิดเพื่อรักษาบริบทการให้เหตุผลในการเรียก API ลายเซ็นเหล่านี้เป็นการแสดงที่เข้ารหัสของกระบวนการคิดภายในของโมเดล

โหมดเก็บสถานะ (แนะนํา): เมื่อใช้ Interactions API ในโหมดเก็บสถานะ (ระบุ previous_interaction_id) เซิร์ฟเวอร์จะจัดการประวัติการสนทนาและลายเซ็นความคิดโดยอัตโนมัติ
โหมดไม่เก็บสถานะ: หากจัดการประวัติการสนทนาด้วยตนเอง คุณต้องใส่บล็อกความคิดพร้อมลายเซ็นในคำขอที่ตามมาเพื่อตรวจสอบความถูกต้อง

ดูข้อมูลโดยละเอียดได้ที่หน้าลายเซ็นความคิด

เอาต์พุตที่มีโครงสร้างด้วยเครื่องมือ

โมเดล Gemini 3 ช่วยให้คุณรวมเอาต์พุตที่มีโครงสร้างเข้ากับเครื่องมือในตัวได้ ซึ่งรวมถึง การเชื่อมต่อแหล่งข้อมูลกับ Google Search, บริบท URL, การเรียกใช้โค้ด และการเรียกใช้ฟังก์ชัน

Python

# This will only work for SDK newer than 2.0.0
from google import genai
from pydantic import BaseModel, Field
from typing import List

class MatchResult(BaseModel):
    winner: str = Field(description="The name of the winner.")
    final_match_score: str = Field(description="The final match score.")
    scorers: List[str] = Field(description="The name of the scorer.")

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3.1-pro-preview",
    input="Search for all details for the latest Euro.",
    tools=[
        {"type": "google_search"},
        {"type": "url_context"}
    ],
    response_format={
        "type": "text",
        "mime_type": "application/json",
        "schema": MatchResult.model_json_schema()
    },
)

result = MatchResult.model_validate_json(interaction.steps[-1].content[0].text)
print(result)

JavaScript

// This will only work for SDK newer than 2.0.0
import { GoogleGenAI } from "@google/genai";
import * as z from "zod";

const matchJsonSchema = {
  type: "object",
  properties: {
    winner: { type: "string", description: "The name of the winner." },
    final_match_score: { type: "string", description: "The final score." },
    scorers: {
      type: "array",
      items: { type: "string" },
      description: "The name of the scorer."
    }
  },
  required: ["winner", "final_match_score", "scorers"]
};

const matchSchema = z.fromJSONSchema(matchJsonSchema);

const client = new GoogleGenAI({});

async function run() {
  const interaction = await client.interactions.create({
    model: "gemini-3.1-pro-preview",
    input: "Search for all details for the latest Euro.",
    tools: [
      { type: "google_search" },
      { type: "url_context" }
    ],
    response_format: {
        type: "text",
        mime_type: "application/json",
        schema: matchJsonSchema
    },
  });

  const match = matchSchema.parse(JSON.parse(interaction.steps.at(-1).content[0].text));
  console.log(match);
}

run();

REST

# Specifies the API revision to avoid breaking changes when they become default
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -H "Api-Revision: 2026-05-20" \
  -d '{
    "model": "gemini-3.1-pro-preview",
    "input": "Search for all details for the latest Euro.",
    "tools": [
      {"type": "google_search"},
      {"type": "url_context"}
    ],
    "response_format": {
        "type": "text",
        "mime_type": "application/json",
        "schema": {
            "type": "object",
            "properties": {
                "winner": {"type": "string", "description": "The name of the winner."},
                "final_match_score": {"type": "string", "description": "The final score."},
                "scorers": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "The name of the scorer."
                }
            },
            "required": ["winner", "final_match_score", "scorers"]
        }
    }
  }'

การสร้างรูปภาพ

Gemini 3.1 Flash สำหรับรูปภาพและ Gemini 3 Pro สำหรับรูปภาพช่วยให้คุณสร้างและแก้ไขรูปภาพ จากพรอมต์ข้อความได้ โดยจะใช้การให้เหตุผลเพื่อ "คิด" ตามพรอมต์ และสามารถดึงข้อมูลแบบเรียลไทม์ เช่น พยากรณ์อากาศหรือแผนภูมิหุ้น ก่อนที่จะใช้การอ้างอิงจาก Google Search ก่อนสร้างรูปภาพที่มีความเที่ยงตรงสูง

ความสามารถใหม่และที่ได้รับการปรับปรุง

การแสดงข้อความและ 4K: สร้างข้อความและไดอะแกรมที่คมชัดและอ่านได้ด้วยความละเอียดสูงสุด 2K และ 4K
การสร้างที่อิงตามข้อมูล: ใช้google_searchเครื่องมือเพื่อยืนยันข้อเท็จจริงและ สร้างภาพตามข้อมูลในโลกแห่งความเป็นจริง การอ้างอิงด้วย Google Image Search พร้อมใช้งานสำหรับ Gemini 3.1 Flash Image
การแก้ไขโดยใช้การสนทนา: การแก้ไขรูปภาพแบบหลายรอบโดยเพียงแค่ขอให้ เปลี่ยนแปลง (เช่น "เปลี่ยนพื้นหลังเป็นพระอาทิตย์ตก") เวิร์กโฟลว์นี้ใช้ลายเซ็นความคิดเพื่อรักษาบริบทภาพระหว่างการเลี้ยว

ดูรายละเอียดทั้งหมดเกี่ยวกับสัดส่วนการแสดงผล เวิร์กโฟลว์การแก้ไข และตัวเลือกการกำหนดค่าได้ในคู่มือการสร้างรูปภาพ

Python

# This will only work for SDK newer than 2.0.0
from google import genai
import base64

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3-pro-image-preview",
    input="Generate an infographic of the current weather in Tokyo.",
    tools=[{"type": "google_search"}],
    response_format={
        "type": "image",
        "aspect_ratio": "16:9",
        "image_size": "4K"
    }
)

from PIL import Image
import io

image_blocks = [content_block for content_block in interaction.steps[-1].content if content_block.type == "image"]
if image_blocks:
    image_data = base64.b64decode(image_blocks[0].data)
    image = Image.open(io.BytesIO(image_data))
    image.save('weather_tokyo.png')
    image.show()

JavaScript

// This will only work for SDK newer than 2.0.0
import { GoogleGenAI } from "@google/genai";
import * as fs from "node:fs";

const client = new GoogleGenAI({});

async function run() {
  const interaction = await client.interactions.create({
    model: "gemini-3-pro-image-preview",
    input: "Generate a visualization of the current weather in Tokyo.",
    tools: [{ type: "google_search" }],
    response_format: {
      type: "image",
      aspect_ratio: "16:9",
      image_size: "4K"
    }
  });

  for (const contentBlock of interaction.steps.at(-1).content) {
    if (contentBlock.type === "image") {
      const buffer = Buffer.from(contentBlock.data, "base64");
      fs.writeFileSync("weather_tokyo.png", buffer);
    }
  }
}

run();

REST

# Specifies the API revision to avoid breaking changes when they become default
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -H "Api-Revision: 2026-05-20" \
  -d '{
    "model": "gemini-3-pro-image-preview",
    "input": "Generate a visualization of the current weather in Tokyo.",
    "tools": [{"type": "google_search"}],
    "response_format": {
        "type": "image",
        "aspect_ratio": "16:9",
        "image_size": "4K"
    }
  }'

ตัวอย่างคำตอบ

สภาพอากาศ โตเกียว

การรันโค้ดด้วยรูปภาพ

Gemini 3 Flash สามารถมองเห็นเป็นกระบวนการสืบสวนที่ใช้งานได้จริง ไม่ใช่แค่การมองแบบคงที่ การผสานการให้เหตุผลเข้ากับการเรียกใช้โค้ดทำให้โมเดลวางแผน จากนั้นเขียนและเรียกใช้โค้ด Python เพื่อซูมเข้า ครอบตัด ใส่คำอธิบายประกอบ หรือปรับแต่งรูปภาพทีละขั้นตอนเพื่ออ้างอิงคำตอบด้วยภาพ

กรณีการใช้งาน

ซูมและตรวจสอบ: โมเดลจะตรวจหาโดยนัยเมื่อรายละเอียดมีขนาดเล็กเกินไป (เช่น การอ่านมาตรวัดหรือหมายเลขซีเรียลที่อยู่ไกล) และเขียนโค้ดเพื่อครอบตัด และตรวจสอบพื้นที่อีกครั้งที่ความละเอียดสูงขึ้น
คณิตศาสตร์และการลงจุดภาพ: โมเดลสามารถทำการคำนวณหลายขั้นตอนโดยใช้โค้ด (เช่น การรวมรายการในใบเสร็จ หรือการสร้างแผนภูมิ Matplotlib จากข้อมูลที่ดึงมา)
การอธิบายประกอบรูปภาพ: โมเดลสามารถวาดลูกศร กรอบล้อม หรือคำอธิบายประกอบอื่นๆ ลงในรูปภาพโดยตรงเพื่อตอบคำถามเชิงพื้นที่ เช่น "ควรวาง รายการนี้ไว้ที่ไหน"

หากต้องการเปิดใช้การคิดเชิงภาพ ให้กำหนดค่าการเรียกใช้โค้ดเป็นเครื่องมือ โมเดลจะใช้ โค้ดเพื่อปรับแต่งรูปภาพโดยอัตโนมัติเมื่อจำเป็น

Python

# This will only work for SDK newer than 2.0.0
from google import genai
from google.genai import types
import requests
from PIL import Image
import io
import base64

image_path = "https://goo.gle/instrument-img"
image_bytes = requests.get(image_path).content
image = types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg")

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input=[
        image,
        "Zoom into the expression pedals and tell me how many pedals are there?"
    ],
    tools=[{"type": "code_execution"}],
)

from IPython.display import display
from PIL import Image
import io

for step in interaction.steps:
    if step.type == "model_output":
        for content_block in step.content:
            if content_block.type == "text":
                print(content_block.text)
            elif content_block.type == "image":
                 display(Image.open(io.BytesIO(base64.b64decode(content_block.data))))
    elif step.type == "code_execution_call":
        print(step.code)
    elif step.type == "code_execution_result":
        print(step.output)

JavaScript

// This will only work for SDK newer than 2.0.0
import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({});

async function main() {
  const imageUrl = "https://goo.gle/instrument-img";
  const response = await fetch(imageUrl);
  const imageArrayBuffer = await response.arrayBuffer();
  const base64ImageData = Buffer.from(imageArrayBuffer).toString("base64");

  const interaction = await client.interactions.create({
    model: "gemini-3-flash-preview",
    input: [
      {
        type: "image",
        mime_type: "image/jpeg",
        data: base64ImageData,
      },
      {
        type: "text",
        text: "Zoom into the expression pedals and tell me how many pedals are there?",
      },
    ],
    tools: [{ type: "code_execution" }],
  });

  for (const step of interaction.steps) {
    if (step.type === "model_output") {
      for (const contentBlock of step.content) {
        if (contentBlock.type === "text") {
          console.log("Text:", contentBlock.text);
        }
      }
    } else if (step.type === "code_execution_call") {
      console.log("Code:", step.code);
    } else if (step.type === "code_execution_result") {
      console.log("Output:", step.output);
    }
  }
}

main();

REST

IMG_URL="https://goo.gle/instrument-img"
MODEL="gemini-3-flash-preview"

MIME_TYPE=$(curl -sIL "$IMG_URL" | grep -i '^content-type:' | awk -F ': ' '{print $2}' | sed 's/\r$//' | head -n 1)
if [[ -z "$MIME_TYPE" || ! "$MIME_TYPE" == image/* ]]; then
  MIME_TYPE="image/jpeg"
fi

if [[ "$(uname)" == "Darwin" ]]; then
  IMAGE_B64=$(curl -sL "$IMG_URL" | base64 -b 0)
elif [[ "$(base64 --version 2>&1)" = *"FreeBSD"* ]]; then
  IMAGE_B64=$(curl -sL "$IMG_URL" | base64)
else
  IMAGE_B64=$(curl -sL "$IMG_URL" | base64 -w0)
fi

# Specifies the API revision to avoid breaking changes when they become default
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
    -H "x-goog-api-key: $GEMINI_API_KEY" \
    -H 'Content-Type: application/json' \
    -H "Api-Revision: 2026-05-20" \
    -d '{
      "model": "'$MODEL'",
      "input": [
            {
              "type": "image",
              "mime_type":"'"$MIME_TYPE"'",
              "data": "'"$IMAGE_B64"'"
            },
            {"type": "text", "text": "Zoom into the expression pedals and tell me how many pedals are there?"}
      ],
      "tools": [{"type": "code_execution"}]
    }'

ดูรายละเอียดเพิ่มเติมเกี่ยวกับการเรียกใช้โค้ดด้วยรูปภาพได้ที่การเรียกใช้โค้ด

การตอบกลับฟังก์ชันหลายรูปแบบ

การเรียกใช้ฟังก์ชันแบบมัลติโมดัล ช่วยให้ผู้ใช้ได้รับคำตอบของฟังก์ชันที่มี ออบเจ็กต์มัลติโมดัล ซึ่งช่วยให้ใช้ความสามารถในการเรียกใช้ฟังก์ชัน ของโมเดลได้ดียิ่งขึ้น การเรียกใช้ฟังก์ชันมาตรฐานรองรับเฉพาะการตอบกลับฟังก์ชันที่เป็นข้อความ เท่านั้น

Python

# This will only work for SDK newer than 2.0.0
from google import genai
import requests
import base64

client = genai.Client()

# 1. Define the tool
get_image_tool = {
    "type": "function",
    "name": "get_image",
    "description": "Retrieves the image file reference for a specific order item.",
    "parameters": {
        "type": "object",
        "properties": {
            "item_name": {
                "type": "string",
                "description": "The name or description of the item ordered (e.g., 'instrument')."
            }
        },
        "required": ["item_name"],
    },
}

# 2. Send the request with tools
interaction_1 = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Show me the instrument I ordered last month.",
    tools=[get_image_tool],
)

# 3. Find the function call step
fc_step = next(s for s in interaction_1.steps if s.type == "function_call")
print(f"Tool Call: {fc_step.name}({fc_step.arguments})")

# Execute tool (fetch image)
image_path = "https://goo.gle/instrument-img"
image_bytes = requests.get(image_path).content
image_b64 = base64.b64encode(image_bytes).decode("utf-8")

# 4. Send multimodal function result back
interaction_2 = client.interactions.create(
    model="gemini-3-flash-preview",
    previous_interaction_id=interaction_1.id,
    input=[{
        "type": "function_result",
        "name": fc_step.name,
        "call_id": fc_step.id,
        "result": [
            {"type": "text", "text": "instrument.jpg"},
            {
                "type": "image",
                "mime_type": "image/jpeg",
                "data": image_b64,
            }
        ]
    }],
    tools=[get_image_tool]
)

model_output_step = next(s for s in interaction_2.steps if s.type == "model_output")
print(f"\nFinal model response: {model_output_step.content[0].text}")

JavaScript

// This will only work for SDK newer than 2.0.0
import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

// 1. Define the tool
const getImageTool = {
    type: 'function',
    name: 'get_image',
    description: 'Retrieves the image file reference for a specific order item.',
    parameters: {
        type: 'object',
        properties: {
            item_name: {
                type: 'string',
                description: "The name or description of the item ordered (e.g., 'instrument').",
            },
        },
        required: ['item_name'],
    },
};

// 2. Send the request with tools
const interaction1 = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: 'Use the get_image tool to show me the instrument I ordered last month.',
    tools: [getImageTool],
});

// 3. Find the function call step
const fcStep = interaction1.steps.find(s => s.type === 'function_call');
console.log(`Tool Call: ${fcStep.name}(${JSON.stringify(fcStep.arguments)})`);

// Execute tool (fetch image)
const imageUrl = 'https://goo.gle/instrument-img';
const response = await fetch(imageUrl);
const imageArrayBuffer = await response.arrayBuffer();
const base64ImageData = Buffer.from(imageArrayBuffer).toString('base64');

// 4. Send multimodal function result back
const interaction2 = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    previous_interaction_id: interaction1.id,
    input: [{
        type: 'function_result',
        name: fcStep.name,
        call_id: fcStep.id,
        result: [
            { type: 'text', text: 'instrument.jpg' },
            {
                type: 'image',
                mime_type: 'image/jpeg',
                data: base64ImageData,
            }
        ]
    }],
    tools: [getImageTool]
});

console.log(`\nFinal model response: ${interaction2.steps.at(-1).content[0].text}`);

REST

IMG_URL="https://goo.gle/instrument-img"

MIME_TYPE=$(curl -sIL "$IMG_URL" | grep -i '^content-type:' | awk -F ': ' '{print $2}' | sed 's/\r$//' | head -n 1)
if [[ -z "$MIME_TYPE" || ! "$MIME_TYPE" == image/* ]]; then
  MIME_TYPE="image/jpeg"
fi

# Check for macOS
if [[ "$(uname)" == "Darwin" ]]; then
  IMAGE_B64=$(curl -sL "$IMG_URL" | base64 -b 0)
elif [[ "$(base64 --version 2>&1)" = *"FreeBSD"* ]]; then
  IMAGE_B64=$(curl -sL "$IMG_URL" | base64)
else
  IMAGE_B64=$(curl -sL "$IMG_URL" | base64 -w0)
fi

# 1. First interaction (triggers function call)
# curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
#   -H "x-goog-api-key: $GEMINI_API_KEY" \
#   -H 'Content-Type: application/json' \
#   -H "Api-Revision: 2026-05-20" \
#   -d '{ "model": "gemini-3-flash-preview", "input": "Show me the instrument I ordered last month.", "tools": [...] }'

# 2. Send multimodal function result back (Replace INTERACTION_ID and CALL_ID)
# Specifies the API revision to avoid breaking changes when they become default
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -H "Api-Revision: 2026-05-20" \
  -d '{
    "model": "gemini-3-flash-preview",
    "previous_interaction_id": "INTERACTION_ID",
    "input": [{
      "type": "function_result",
      "name": "get_image",
      "call_id": "CALL_ID",
      "result": [
        { "type": "text", "text": "instrument.jpg" },
        {
          "type": "image",
          "mime_type": "'"$MIME_TYPE"'",
          "data": "'"$IMAGE_B64"'"
        }
      ]
    }]
  }'

รวมเครื่องมือในตัวและการเรียกใช้ฟังก์ชัน

Gemini 3 อนุญาตให้ใช้เครื่องมือในตัว (เช่น Google Search, บริบท URL และอื่นๆ) และเครื่องมือการเรียกใช้ฟังก์ชันที่กำหนดเองในการเรียก API เดียวกัน ซึ่งช่วยให้เวิร์กโฟลว์มีความซับซ้อนมากขึ้น

Python

# This will only work for SDK newer than 2.0.0
from google import genai
from google.genai import types

client = genai.Client()

getWeather = {
    "type": "function",
    "name": "getWeather",
    "description": "Gets the weather for a requested city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "The city and state, e.g. Utqiaġvik, Alaska",
            },
        },
        "required": ["city"],
    },
}

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input="What is the northernmost city in the United States? What's the weather like there today?",
    tools=[
        {"type": "google_search"},
        getWeather
    ],
)

# Find the function call step
fc_step = next((s for s in interaction.steps if s.type == "function_call"), None)

if fc_step:
    # Simulate a function result
    result = {"response": "Very cold. 22 degrees Fahrenheit."}

    final_interaction = client.interactions.create(
        model="gemini-3-flash-preview",
        input=[
            {"type": "function_result", "name": fc_step.name, "call_id": fc_step.id, "result": result}
        ],
        tools=[
            {"type": "google_search"},
            getWeather
        ],
        previous_interaction_id=interaction.id,
    )

    print(final_interaction.steps[-1].content[0].text)

JavaScript

// This will only work for SDK newer than 2.0.0
import { GoogleGenAI, Type } from '@google/genai';

const client = new GoogleGenAI({});

const getWeatherDeclaration = {
  type: 'function',
  name: 'getWeather',
  description: 'Gets the weather for a requested city.',
  parameters: {
    type: Type.OBJECT,
    properties: {
      city: {
        type: Type.STRING,
        description: 'The city and state, e.g. Utqiaġvik, Alaska',
      },
    },
    required: ['city'],
  },
};

const interaction = await client.interactions.create({
  model: 'gemini-3-flash-preview',
  input: "What is the northernmost city in the United States? What's the weather like there today?",
  tools: [
    { type: "google_search" },
    getWeatherDeclaration
  ],
});

// Find the function call step
const fcStep = interaction.steps.find(s => s.type === 'function_call');

if (fcStep) {
  const result = { response: "Very cold. 22 degrees Fahrenheit." };

  const finalInteraction = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: [
      { type: 'function_result', name: fcStep.name, call_id: fcStep.id, result: result }
    ],
    tools: [
      { type: "google_search" },
      getWeatherDeclaration
    ],
    previous_interaction_id: interaction.id,
  });

  console.log(finalInteraction.steps.at(-1).content[0].text);
}

การย้ายข้อมูลจาก Gemini 2.5

Gemini 3 เป็นตระกูลโมเดลที่มีความสามารถมากที่สุดของเราในปัจจุบัน และมีการปรับปรุงทีละขั้น เมื่อเทียบกับ Gemini 2.5 เมื่อย้ายข้อมูล ให้พิจารณาสิ่งต่อไปนี้

การคิด: หากก่อนหน้านี้คุณใช้วิศวกรรมพรอมต์ที่ซับซ้อน (เช่น เชนออฟทอท) เพื่อบังคับให้ Gemini 2.5 ให้เหตุผล ให้ลองใช้ Gemini 3 กับ thinking_level: "high" และพรอมต์ที่เรียบง่าย
การตั้งค่าอุณหภูมิ: หากโค้ดที่มีอยู่ตั้งค่าอุณหภูมิอย่างชัดเจน (โดยเฉพาะค่าต่ำสำหรับเอาต์พุตที่แน่นอน) เราขอแนะนำให้นำพารามิเตอร์นี้ออก และใช้ค่าเริ่มต้นของ Gemini 3 ซึ่งคือ 1.0 เพื่อหลีกเลี่ยงปัญหาการวนซ้ำที่อาจเกิดขึ้น หรือประสิทธิภาพที่ลดลงในงานที่ซับซ้อน
การทำความเข้าใจ PDF และเอกสาร: หากคุณอาศัยลักษณะการทำงานที่เฉพาะเจาะจงสำหรับการแยกวิเคราะห์เอกสารที่มีข้อมูลหนาแน่น ให้ทดสอบการตั้งค่า media_resolution_high ใหม่ เพื่อให้มั่นใจว่าข้อมูลจะยังคงถูกต้องต่อไป
การใช้โทเค็น: การย้ายข้อมูลไปใช้ค่าเริ่มต้นของ Gemini 3 อาจเพิ่มการใช้โทเค็นสำหรับ PDF แต่ลดการใช้โทเค็นสำหรับวิดีโอ หากคำขอเกิน หน้าต่างบริบทเนื่องจากความละเอียดเริ่มต้นสูงขึ้น เราขอแนะนำให้ ลดความละเอียดของสื่ออย่างชัดเจน
การแบ่งกลุ่มรูปภาพ: Gemini 3 Pro หรือ Gemini 3 Flash ไม่รองรับความสามารถในการแบ่งกลุ่มรูปภาพ (การแสดงผลมาสก์ระดับพิกเซลสำหรับออบเจ็กต์) สำหรับภาระงานที่ต้องใช้การแบ่งกลุ่มรูปภาพในตัว เราขอแนะนำให้ใช้ Gemini 2.5 Flash ต่อไปโดยปิดการคิด หรือใช้ Gemini Robotics-ER 1.6
การใช้คอมพิวเตอร์: Gemini 3 Pro และ Gemini 3 Flash รองรับการใช้คอมพิวเตอร์ คุณไม่จำเป็นต้องใช้โมเดลแยกต่างหากเพื่อเข้าถึงเครื่องมือการใช้งานคอมพิวเตอร์ ซึ่งต่างจากซีรีส์ 2.5
การรองรับเครื่องมือ: ตอนนี้โมเดล Gemini 3 รองรับการรวมเครื่องมือในตัวเข้ากับการเรียกฟังก์ชันแล้ว นอกจากนี้ โมเดล Gemini 3 ยังรองรับการอ้างอิงจาก Maps ด้วย

ความเข้ากันได้กับ OpenAI

สำหรับผู้ใช้ที่ใช้เลเยอร์ความเข้ากันได้ของ OpenAI ระบบจะแมปพารามิเตอร์มาตรฐาน (reasoning_effort ของ OpenAI) กับ พารามิเตอร์ที่เทียบเท่าของ Gemini (thinking_level) โดยอัตโนมัติ

แนวทางปฏิบัติแนะนำในการเขียนพรอมต์

Gemini 3 เป็นโมเดลการให้เหตุผล ซึ่งจะเปลี่ยนวิธีที่คุณควรใช้พรอมต์

วิธีการที่ชัดเจน: ระบุพรอมต์อินพุตให้กระชับ Gemini 3 ตอบกลับ คำสั่งที่ชัดเจนและตรงไปตรงมาได้ดีที่สุด ซึ่งอาจวิเคราะห์เทคนิควิศวกรรมพรอมต์ (Prompt Engineering) ที่ซับซ้อนหรือมีรายละเอียดมากเกินไปที่ใช้กับโมเดลรุ่นเก่ามากเกินไป
ความละเอียดของเอาต์พุต: โดยค่าเริ่มต้น Gemini 3 จะมีความละเอียดน้อยกว่าและต้องการ ให้คำตอบที่ตรงประเด็นและมีประสิทธิภาพ หาก Use Case ของคุณต้องใช้บุคลิกที่ สนทนาหรือ "ช่างพูด" มากขึ้น คุณต้องชี้นำโมเดลอย่างชัดเจนใน พรอมต์ (เช่น "อธิบายสิ่งนี้ในฐานะผู้ช่วยที่เป็นมิตรและช่างพูด")
การจัดการบริบท: เมื่อทำงานกับชุดข้อมูลขนาดใหญ่ (เช่น หนังสือทั้งเล่ม ฐานโค้ด หรือวิดีโอยาว) ให้วางคำสั่งหรือคำถามที่เฉพาะเจาะจงไว้ท้ายพรอมต์หลังจากบริบทข้อมูล ยึดการให้เหตุผลของโมเดลกับข้อมูลที่ให้ไว้โดยเริ่มคำถามด้วยวลี เช่น "จากข้อมูลก่อนหน้า..."

ดูข้อมูลเพิ่มเติมเกี่ยวกับกลยุทธ์การออกแบบพรอมต์ได้ในคู่มือวิศวกรรมพรอมต์ (Prompt Engineering)

คำถามที่พบบ่อย

การตัดข้อมูลความรู้สำหรับ Gemini 3 คือวันที่เท่าใด โมเดล Gemini 3 มี ข้อมูลล่าสุด ณ เดือนมกราคม 2025 ดูข้อมูลล่าสุดได้ที่เครื่องมือการอ้างอิงการค้นหา
ข้อจำกัดของหน้าต่างบริบทคืออะไร โมเดล Gemini 3 รองรับหน้าต่างบริบทของอินพุตขนาด 1 ล้าน โทเค็นและเอาต์พุตสูงสุด 64,000 โทเค็น
Gemini 3 มีแพ็กเกจฟรีไหม Gemini 3 Flash gemini-3-flash-preview มีระดับฟรีใน Gemini API คุณสามารถทดลองใช้ Gemini 3.1 Pro และ 3 Flash ได้โดยไม่มีค่าใช้จ่ายใน Google AI Studio แต่จะไม่มีระดับฟรีสำหรับ gemini-3.1-pro-preview ใน Gemini API
thinking_budget โค้ดเก่าของฉันจะยังใช้งานได้ไหม ได้ thinking_budget ยังคงรองรับการใช้งานกับเวอร์ชันก่อนหน้า แต่เราขอแนะนำให้ย้ายข้อมูลไปยัง thinking_level เพื่อให้ได้ประสิทธิภาพที่คาดการณ์ได้มากขึ้น แต่อย่าใช้ทั้ง 2 อย่างในคำขอเดียวกัน
Gemini 3 รองรับ Batch API ไหม ได้ Gemini 3 รองรับ Batch API
ระบบรองรับการแคชบริบทไหม ได้ Gemini 3 รองรับการแคชบริบท
Gemini 3 รองรับเครื่องมือใดบ้าง Gemini 3 รองรับ Google Search การเชื่อมต่อแหล่งข้อมูลกับ Google Maps การค้นหาไฟล์ การดำเนินการโค้ด และ บริบท URL นอกจากนี้ยังรองรับการเรียกใช้ฟังก์ชันมาตรฐานสำหรับ เครื่องมือที่กำหนดเองของคุณเอง และใช้ร่วมกับเครื่องมือในตัว
gemini-3.1-pro-preview-customtools คืออะไร หากคุณใช้ gemini-3.1-pro-preview และโมเดลไม่สนใจเครื่องมือที่กำหนดเองของคุณเพื่อใช้คำสั่ง bash ให้ลองใช้โมเดล gemini-3.1-pro-preview-customtools แทน ดูข้อมูลเพิ่มเติมได้[ที่นี่][customtools-model]