Combine built-in tools and function calling

Gemini lets you combine built-in tools, such as google_search, with function calling (also known as custom tools) in a single generation by preserving and exposing the context history of tool calls. Combining built-in and custom tools enables complex agentic workflows in which, for example, the model grounds itself in real-time web data before calling your specific business logic.

The following example combines the built-in google_search tool with a custom function, getWeather:

Python

from google import genai
from google.genai import types

client = genai.Client()

getWeather = {
    "name": "getWeather",
    "description": "Gets the weather for a requested city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "The city and state, e.g. Utqiaġvik, Alaska",
            },
        },
        "required": ["city"],
    },
}

# Turn 1: Initial request with Google Search (built-in) and getWeather (custom) tools enabled
response = client.models.generate_content(
    model="gemini-3.5-flash",
    contents="What is the northernmost city in the United States? What's the weather like there today?",
    config=types.GenerateContentConfig(
        tools=[
            types.Tool(
                google_search=types.GoogleSearch(),  # Built-in tool
                function_declarations=[getWeather]       # Custom tool
            ),
        ],
        tool_config=types.ToolConfig(
            include_server_side_tool_invocations=True
        )
    ),
)
function_call_id = None
for part in response.candidates[0].content.parts:
    if part.function_call:
        print(f"Function call: {part.function_call.name} (ID: {part.function_call.id})")
        function_call_id = part.function_call.id

# Turn 2: Manually build history to circulate both tool and function context
history = [
    types.Content(
        role="user",
        parts=[types.Part(text="What is the northernmost city in the United States? What's the weather like there today?")]
    ),
    # Response from Turn 1 includes tool_call, tool_response, and thought_signatures
    response.candidates[0].content,
    # Return the function_response
    types.Content(
        role="user",
        parts=[types.Part(
            function_response=types.FunctionResponse(
                name="getWeather",
                response={"response": "Very cold. 22 degrees Fahrenheit."},
                id=function_call_id # Match the ID from the function_call
            )
        )]
    )
]

response_2 = client.models.generate_content(
    model="gemini-3.5-flash",
    contents=history,
    config=types.GenerateContentConfig(
        tools=[
            types.Tool(
                google_search=types.GoogleSearch(),
                function_declarations=[getWeather]
            ),
        ],
        # This flag needs to be enabled for built-in tool context circulation and tool combination
        tool_config=types.ToolConfig(
            include_server_side_tool_invocations=True
        )
    ),
)

for part in response_2.candidates[0].content.parts:
    if part.text:
        print(part.text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const getWeather = {
    name: "getWeather",
    description: "Get the weather in a given location",
    parameters: {
        type: "OBJECT",
        properties: {
            location: {
                type: "STRING",
                description: "The city and state, e.g. San Francisco, CA"
            }
        },
        required: ["location"]
    }
};

async function run() {
    const model = "gemini-3.5-flash";

    const tools = [
      { googleSearch: {} },
      { functionDeclarations: [getWeather] }
    ];
    // This flag needs to be enabled for built-in tool context circulation and tool combination
    const toolConfig = { includeServerSideToolInvocations: true };

    // Turn 1: Initial request with Google Search (built-in) and getWeather (custom) tools enabled
    // The @google/genai SDK exposes generateContent on client.models; tools go inside config.
    const response1 = await client.models.generateContent({
        model: model,
        contents: [{role: "user", parts: [{text: "What is the northernmost city in the United States? What's the weather like there today?"}]}],
        config: { tools: tools, toolConfig: toolConfig },
    });

    for (const part of response1.candidates[0].content.parts) {
        if (part.functionCall) {
            console.log(`Function call: ${part.functionCall.name} (ID: ${part.functionCall.id})`);
        }
    }

    const functionCallId = response1.candidates[0].content.parts.find(p => p.functionCall)?.functionCall?.id;

    // Turn 2: Manually build history to circulate both tool and function context
    const history = [
        {
            role: "user",
            parts:[{text: "What is the northernmost city in the United States? What's the weather like there today?"}]
        },
        // Response from Turn 1 includes tool_call, tool_response, and thought_signatures
        response1.candidates[0].content,
        // Return the function_response
        {
            role: "user",
            parts: [{
                functionResponse: {
                    name: "getWeather",
                    response: {response: "Very cold. 22 degrees Fahrenheit."},
                    id: functionCallId // Match the ID from the function_call
                }
            }]
        }
    ];

    const response2 = await client.models.generateContent({
        model: model,
        contents: history,
        config: { tools: tools, toolConfig: toolConfig },
    });

    for (const part of response2.candidates[0].content.parts) {
        if (part.text) {
            console.log(part.text);
        }
    }
}

run();

Go

package main

import (
    "context"
    "fmt"
    "log"
    "os"

    "github.com/google/generative-ai-go/genai"
    "google.golang.org/api/option"
)

func main() {
    ctx := context.Background()
    client, err := genai.NewClient(ctx, option.WithAPIKey(os.Getenv("GEMINI_API_KEY")))
    if err != nil {
        log.Fatal(err)
    }
    defer client.Close()

    getWeather := &genai.FunctionDeclaration{
        Name:        "getWeather",
        Description: "Get the weather in a given location",
        Parameters: &genai.Schema{
            Type: genai.Object,
            Properties: map[string]*genai.Schema{
                "location": {
                    Type:        genai.String,
                    Description: "The city and state, e.g. San Francisco, CA",
                },
            },
            Required: []string{"location"},
        },
    }

    model := client.GenerativeModel("gemini-3.5-flash")
    model.Tools = []*genai.Tool{
        {GoogleSearch: &genai.GoogleSearch{}}, // Built-in tool
        {FunctionDeclarations: []*genai.FunctionDeclaration{getWeather}}, // Custom tool
    }
    ist := true
    model.ToolConfig = &genai.ToolConfig{
        IncludeServerSideToolInvocations: &ist, // This flag needs to be enabled for built-in tool context circulation and tool combination
    }

    chat := model.StartChat()

    // Turn 1: Initial request with Google Search (built-in) and getWeather (custom) tools enabled
    prompt := genai.Text("What is the northernmost city in the United States? What's the weather like there today?")
    resp1, err := chat.SendMessage(ctx, prompt)
    if err != nil {
        log.Fatalf("SendMessage failed: %v", err)
    }

    if resp1 == nil || len(resp1.Candidates) == 0 || resp1.Candidates[0].Content == nil {
        log.Fatal("empty response from model")
    }

    var functionCallID string
    for _, part := range resp1.Candidates[0].Content.Parts {
        switch p := part.(type) {
        case genai.FunctionCall:
            fmt.Printf("Function call: %s (ID: %s)\n", p.Name, p.ID)
            if p.Name == "getWeather" {
                functionCallID = p.ID
            }
        }
    }

    if functionCallID == "" {
        log.Fatal("no getWeather function call in response")
    }

    // Turn 2: Provide function result back to model.
    // Chat history automatically includes tool_call, tool_response, and thought_signatures from Turn 1.
    fr := genai.FunctionResponse{
        Name: "getWeather",
        ID:   functionCallID,
        Response: map[string]any{
            "response": "Very cold. 22 degrees Fahrenheit.",
        },
    }

    resp2, err := chat.SendMessage(ctx, fr)
    if err != nil {
        log.Fatalf("SendMessage for turn 2 failed: %v", err)
    }

    if resp2 == nil || len(resp2.Candidates) == 0 || resp2.Candidates[0].Content == nil {
        log.Fatal("empty response from model in turn 2")
    }

    for _, part := range resp2.Candidates[0].Content.Parts {
        if txt, ok := part.(genai.Text); ok {
            fmt.Println(string(txt))
        }
    }
}

REST

# Turn 1: Initial request with Google Search (built-in) and getWeather (custom) tools enabled
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-3.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
  "contents": [{
    "role": "user",
    "parts": [{
      "text": "What is the northernmost city in the United States? What'\''s the weather like there today?"
    }]
  }],
  "tools": [{
    "googleSearch": {}
  }, {
    "functionDeclarations": [{
      "name": "getWeather",
      "description": "Get the weather in a given location",
      "parameters": {
          "type": "OBJECT",
          "properties": {
              "location": {
                  "type": "STRING",
                  "description": "The city and state, e.g. San Francisco, CA"
              }
          },
          "required": ["location"]
      }
    }]
  }],
  "toolConfig": {
    "includeServerSideToolInvocations": true
  }
}'

# Turn 2: Manually build history to circulate both tool and function context
# The following request assumes you have captured candidates[0].content from Turn 1 response,
# and extracted function_call.id for getWeather.
# Replace FUNCTION_CALL_ID and insert candidate content from turn 1.
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-3.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
  "contents": [
    {
      "role": "user",
      "parts": [{"text": "What is the northernmost city in the United States? What'\''s the weather like there today?"}]
    },
    YOUR_CANDIDATE_CONTENT_FROM_TURN_1_RESPONSE,
    {
      "role": "user",
      "parts": [{
        "functionResponse": {
          "name": "getWeather",
          "id": "FUNCTION_CALL_ID",
          "response": {"response": "Very cold. 22 degrees Fahrenheit."}
        }
      }]
    }
  ],
  "tools": [{
    "googleSearch": {}
  }, {
    "functionDeclarations": [{
      "name": "getWeather",
      "description": "Get the weather in a given location",
      "parameters": {
          "type": "OBJECT",
          "properties": {
              "location": {
                  "type": "STRING",
                  "description": "The city and state, e.g. San Francisco, CA"
              }
          },
          "required": ["location"]
      }
    }]
  }],
  "toolConfig": {
    "includeServerSideToolInvocations": true
  }
}'

How it works

Gemini 3 models use tool context circulation to enable combinations of built-in and custom tools. Tool context circulation preserves and exposes the context of built-in tools and shares it with custom tools within the same call.

Enable tool combination

To enable tool context circulation, you must set the include_server_side_tool_invocations flag to true.
Include function_declarations along with the built-in tools you want to use to trigger the combination behavior.
- If you don't include function_declarations, tool context circulation still applies to the built-in tools, as long as the flag is set.

Parts returned by the API

In a single response, the API returns the toolCall and toolResponse parts for a built-in tool call. For a function (custom tool) call, the API returns the functionCall part, and the user supplies the matching functionResponse part in the next turn.

toolCall and toolResponse: the API returns these parts to preserve the context of tools that run on the server side, along with the result of their execution, for the next turn.
functionCall and functionResponse: the API sends the function call to the user to fulfill, and the user sends the result back in the function response (these parts are standard for all function calling in the Gemini API, not only for tool combination).
(Code Execution tool only) executableCode and codeExecutionResult: when the Code Execution tool is used, the API returns executableCode (the code generated by the model that is meant to be executed) and codeExecutionResult (the result of running that code) instead of functionCall and functionResponse.

On every turn, you must return all parts to the model, including all the fields they contain, to preserve context and enable tool combinations.
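
The rule above can be sketched with plain dictionaries shaped like the REST payloads on this page. This is a minimal illustration, not part of any SDK: the helper name build_next_turn and its parameters are hypothetical, and it only shows the idea of passing the model's content back untouched.

```python
from copy import deepcopy
from typing import Any

Content = dict[str, Any]


def build_next_turn(
    user_prompt: str,
    model_content: Content,
    function_response_parts: list[dict[str, Any]],
) -> list[Content]:
    """Assemble the `contents` list for the follow-up request.

    The model content is copied whole, so every part and every field
    (ids, tool types, thought signatures) goes back to the API unchanged.
    """
    return [
        {"role": "user", "parts": [{"text": user_prompt}]},
        deepcopy(model_content),  # hypothetical helper: never filter or rewrite these parts
        {"role": "user", "parts": function_response_parts},
    ]
```

The copy is defensive: it keeps later mutations of the original response object from leaking into the request, while leaving the parts themselves identical.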

Critical fields in returned parts

Some parts returned by the API include the id, tool_type, and thought_signature fields. These fields are critical for preserving tool context (and therefore for tool combinations). In subsequent requests, you must return all parts exactly as they appear in the response.

id: a unique identifier that maps a call to its response. id is set on all function call responses, regardless of tool context circulation. In the function response, you must provide the same id that the API provides in the function call. Built-in tools automatically share the id between the tool call and the tool response.
- Found in all tool-related parts: toolCall, toolResponse, functionCall, functionResponse, executableCode, codeExecutionResult
tool_type: identifies the specific tool being used, either the built-in tool (for example, URL_CONTEXT) or the function name (for example, getWeather).
- Found in toolCall and toolResponse.
thought_signature: the actual encrypted context embedded in each part returned by the API. Context cannot be reconstructed without thought signatures. If you don't return the thought signatures for all parts on every turn, the model returns an error.
- Found in all parts.
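
As a rough sketch of how the id field ties a function call to its response, the following code answers each functionCall part with a functionResponse that reuses the same id. It works on plain dictionaries in the REST field naming used on this page; the names answer_function_calls, parts_missing_signature, and handlers are hypothetical, and the signature check simply assumes, as stated above, that every returned part carries a thought signature.

```python
from typing import Any, Callable

Part = dict[str, Any]


def answer_function_calls(
    model_parts: list[Part],
    handlers: dict[str, Callable[..., Any]],
) -> list[Part]:
    """Run each functionCall locally and build the matching functionResponse parts."""
    responses: list[Part] = []
    for part in model_parts:
        call = part.get("functionCall")
        if call is None:
            continue  # other part kinds need no client-side execution here
        result = handlers[call["name"]](**call.get("args", {}))
        responses.append({
            "functionResponse": {
                "name": call["name"],
                "id": call["id"],  # reuse the id from the functionCall
                "response": {"response": result},
            }
        })
    return responses


def parts_missing_signature(model_parts: list[Part]) -> list[int]:
    """Return the indexes of parts that carry no thoughtSignature field."""
    return [i for i, p in enumerate(model_parts) if not p.get("thoughtSignature")]
```

A non-empty result from parts_missing_signature before sending the next turn would suggest that parts were altered or dropped somewhere in the application.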

Tool-specific data

Some built-in tools return user-visible data arguments that are specific to the tool type.

Tool	User-visible tool call arguments (if any)	User-visible tool response (if any)
GOOGLE_SEARCH	`queries`	`search_suggestions`
GOOGLE_MAPS	`queries`	`places`, `google_maps_widget_context_token`
URL_CONTEXT	`urls`: URLs to browse	`urls_metadata`; `retrieved_url`: URLs browsed; `url_retrieval_status`: browse status
FILE_SEARCH	None	None
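
To show how the user-visible arguments in the table might be surfaced, for example in a log or UI, here is a small sketch over plain dictionaries. The function name visible_tool_args is hypothetical, and the code assumes that `queries` and `urls` arrive as lists of strings inside the toolCall `args` object, as the `queries` field does in the request example on this page.

```python
from typing import Any


def visible_tool_args(parts: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Group the user-visible toolCall arguments by tool type."""
    seen: dict[str, list[str]] = {}
    for part in parts:
        call = part.get("toolCall")
        if not call:
            continue  # only toolCall parts carry these arguments
        args = call.get("args", {})
        # Assumption: both fields, when present, are lists of strings.
        values = list(args.get("queries", [])) + list(args.get("urls", []))
        seen.setdefault(call["toolType"], []).extend(values)
    return seen
```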

Example tool combination request structure

The following request structure corresponds to the prompt "What is the northernmost city in the United States? What's the weather like there today?". It combines three tools: the built-in Gemini tools google_search and code_execution, and the custom function getWeather.

{
  "model": "models/gemini-3.5-flash",
  "contents": [{
    "parts": [{
      "text": "What is the northernmost city in the United States? What's the weather like there today?"
    }],
    "role": "user"
  }, {
    "parts": [{
      "thoughtSignature": "...",
      "toolCall": {
        "toolType": "GOOGLE_SEARCH_WEB",
        "args": {
          "queries": ["northernmost city in the United States"]
        },
        "id": "a7b3k9p2"
      }
    }, {
      "thoughtSignature": "...",
      "toolResponse": {
        "toolType": "GOOGLE_SEARCH_WEB",
        "response": {
          "search_suggestions": "..."
        },
        "id": "a7b3k9p2"
      }
    }, {
      "functionCall": {
        "name": "getWeather",
        "args": {
          "city": "Utqiaġvik, Alaska"
        },
        "id": "m4q8z1v6"
      },
      "thoughtSignature": "..."
    }],
    "role": "model"
  }, {
    "parts": [{
      "functionResponse": {
        "name": "getWeather",
        "response": {
          "response": "Very cold. 22 degrees Fahrenheit."
        },
        "id": "m4q8z1v6"
      }
    }],
    "role": "user"
  }],
  "tools": [{
    "functionDeclarations": [{
      "name": "getWeather"
    }]
  }, {
    "googleSearch": {
    }
  }, {
    "codeExecution": {
    }
  }],
  "toolConfig": {
    "includeServerSideToolInvocations": true
  }
}

Tokens and pricing

Note that toolCall and toolResponse parts in requests count toward prompt_token_count. Because these intermediate tool steps are now visible and returned to you, they are part of the conversation history. This applies only to requests, not to responses.

The Google Search tool is an exception to this rule. Google Search already applies its own pricing model at the query level, so tokens are not charged twice (see the Pricing page for details).

For more information, see the Tokens page.

Limitations

The default mode is VALIDATED (AUTO mode is not supported) when the include_server_side_tool_invocations flag is enabled.
Built-in tools such as google_search rely on location and current-time information, so if your system_instruction or function_declaration.description contains conflicting location or time information, tool combination might not work correctly.

Supported tools

Standard tool context circulation applies to server-side (built-in) tools. Code Execution is also a server-side tool, but it has its own built-in solution for context circulation. Computer Use and function calling are client-side tools, and they also have built-in solutions for context circulation.

Tool	Execution side	Context circulation support
Google Search	Server-side	Supported
Google Maps	Server-side	Supported
URL Context	Server-side	Supported
File Search	Server-side	Supported
Code Execution	Server-side	Supported (built in, uses `executableCode` and `codeExecutionResult` parts)
Computer Use	Client-side	Supported (built in, uses `functionCall` and `functionResponse` parts)
Custom functions	Client-side	Supported (built in, uses `functionCall` and `functionResponse` parts)
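
The table can be condensed into a small lookup that tells an application which parts it has to act on. The sketch below is illustrative only: the names execution_side and count_by_side are hypothetical, and the part keys follow the REST field naming used in the request example on this page.

```python
from collections import Counter
from typing import Any, Optional

# Part keys grouped by where the tool runs, following the table above.
SERVER_SIDE_KEYS = ("toolCall", "toolResponse", "executableCode", "codeExecutionResult")
CLIENT_SIDE_KEYS = ("functionCall", "functionResponse")


def execution_side(part: dict[str, Any]) -> Optional[str]:
    """Return "server", "client", or None for parts unrelated to tools."""
    if any(key in part for key in SERVER_SIDE_KEYS):
        return "server"
    if any(key in part for key in CLIENT_SIDE_KEYS):
        return "client"
    return None


def count_by_side(parts: list[dict[str, Any]]) -> Counter:
    """Count tool-related parts per execution side, skipping plain text parts."""
    return Counter(side for side in map(execution_side, parts) if side)
```

Only parts classified as "client" require work from the application; parts classified as "server" are returned to the model unchanged.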

What's next

Learn more about function calling in the Gemini API.
Explore the supported tools: