Questa guida ti aiuta a eseguire la migrazione dall'API generateContent all'API Interactions.
L'API Interactions è l'interfaccia standard per la creazione con Gemini. È ottimizzata per i workflow agentici, la gestione dello stato lato server e le conversazioni multimodali e multi-turno complesse, pur supportando completamente le semplici richieste single-turn senza stato. Sebbene generateContent rimanga completamente supportata, consigliamo l'API Interactions per tutti i nuovi sviluppi.
Perché migrare?
L'API Interactions offre un modo più strutturato e potente per creare con Gemini:
- Gestione della cronologia lato server: flussi multi-turno semplificati tramite
previous_interaction_id. Il server abilita lo stato per impostazione predefinita (store=true), ma puoi attivare il comportamento senza stato impostandostore=false. - Passaggi di esecuzione osservabili: i passaggi digitati semplificano il debug dei flussi complessi e il rendering dell'interfaccia utente per gli eventi intermedi (come pensieri o widget di ricerca).
- Progettata per i workflow agentici: supporto nativo per l'utilizzo, l'orchestrazione e i flussi di ragionamento complessi degli strumenti multi-step tramite passaggi di esecuzione digitati.
- Attività in background e di lunga durata: supporta l'offload di operazioni che richiedono molto tempo, come Deep Think e Deep Research, ai processi in background utilizzando
background=true. - Accesso a nuovi modelli e funzionalità: in futuro, i nuovi modelli oltre alla famiglia principale, insieme a nuove capacità agentiche e strumenti, verranno lanciati esclusivamente sull'API Interactions.
generateContentcontinuerà a essere completamente supportata per i casi d'uso esistenti.
Input/output di base
Questa sezione mostra come eseguire la migrazione di una semplice richiesta di generazione di testo.
Prima (generateContent)
L'API generateContent è senza stato e restituisce la risposta direttamente. La struttura della risposta racchiude l'output in un elenco di candidates, ognuno contenente content con un elenco di parts da analizzare.
Python
from google import genai
client = genai.Client()
response = client.models.generate_content(
model="gemini-2.5-flash", contents="Tell me a joke."
)
print(response.text)
JavaScript
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({});
const response = await ai.models.generateContent({
model: "gemini-2.5-flash",
contents: "Tell me a joke.",
});
console.log(response.text);
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{
"parts": [{
"text": "Tell me a joke."
}]
}]
}'
# Response
{
"candidates": [
{
"content": {
"parts": [
{
"text": "Why did the chicken cross the road? To get to the other side!"
}
],
"role": "model"
},
"finishReason": "STOP",
"index": 0
}
],
"usageMetadata": {
"promptTokenCount": 4,
"candidatesTokenCount": 12,
"totalTokenCount": 16
}
}
Dopo (API Interactions)
L'API Interactions restituisce una risorsa di interazione archiviata con una sequenza temporale steps. Anziché attraversare candidati e parti, esamina l'array steps per trovare il tipo di output desiderato.
Python
from google import genai
client = genai.Client()
# The input can be a simple string shorthand
interaction = client.interactions.create(
model="gemini-3-flash-preview", input="Tell me a joke."
)
# Inspect the steps manually
for step in interaction.steps:
if step.type == "model_output":
print(step.content[0].text)
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
let interaction = await client.interactions.create({
model: 'gemini-3-flash-preview',
input: 'Tell me a joke.'
});
// Manual inspection
const modelStep = interaction.steps.find(s => s.type === 'model_output');
console.log(modelStep.content[0].text);
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3-flash-preview",
"input": "Tell me a joke."
}'
# Response
{
"id": "int_123",
"status": "completed",
"steps": [
{
"type": "user_input",
"status": "done",
"content": [
{
"type": "text",
"text": "Tell me a joke."
}
]
},
{
"type": "model_output",
"status": "done",
"content": [
{
"type": "text",
"text": "Why did the chicken cross the road?"
}
]
}
]
}
Conversazioni a più turni
L'API Interactions archivia le interazioni per impostazione predefinita, consentendo la gestione dello stato lato server per le conversazioni multi-turno.
Prima (generateContent)
In generateContent, devi gestire manualmente la cronologia delle conversazioni utilizzando l'array contents o un helper di chat lato client.
Python
Utilizzo dell'helper di chat (opzione consigliata)
from google import genai
client = genai.Client()
chat = client.chats.create(model="gemini-2.5-flash")
response1 = chat.send_message("Hi, my name is Phil.")
print(response1.text)
response2 = chat.send_message("What is my name?")
print(response2.text)
Gestione manuale della cronologia
from google import genai
from google.genai import types
client = genai.Client()
# The second turn requires sending the entire history
response = client.models.generate_content(
model="gemini-2.5-flash",
contents=[
types.Content(
role="user", parts=[types.Part.from_text("Hi, my name is Phil.")]
),
types.Content(
role="model",
parts=[types.Part.from_text("Hi Phil, how can I help you?")],
),
types.Content(
role="user", parts=[types.Part.from_text("What is my name?")]
),
],
)
print(response.text)
JavaScript
Utilizzo dell'helper di chat (opzione consigliata)
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const chat = client.chats.create({ model: 'gemini-2.5-flash' });
let response = await chat.sendMessage({ message: 'Hi, my name is Phil.' });
console.log(response.text);
response = await chat.sendMessage({ message: 'What is my name?' });
console.log(response.text);
Gestione manuale della cronologia
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
// The second turn requires sending the entire history
const response = await client.models.generateContent({
model: 'gemini-2.5-flash',
contents: [
{ role: 'user', parts: [{ text: 'Hi, my name is Phil.' }] },
{ role: 'model', parts: [{ text: 'Hi Phil, how can I help you?' }] },
{ role: 'user', parts: [{ text: 'What is my name?' }] }
]
});
console.log(response.text);
REST
# Request (the second turn requires sending the entire history)
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [
{"role": "user", "parts": [{"text": "Hi, my name is Phil."}]},
{"role": "model", "parts": [{"text": "Hi Phil, how can I help you?"}]},
{"role": "user", "parts": [{"text": "What is my name?"}]}
]
}'
# Response
{
"candidates": [
{
"content": {
"parts": [
{
"text": "Your name is Phil."
}
],
"role": "model"
},
"finishReason": "STOP",
"index": 0
}
]
}
Dopo (API Interactions)
L'API Interactions gestisce lo stato sul server. Per continuare una conversazione, fai riferimento a previous_interaction_id.
Python
from google import genai
client = genai.Client()
# First turn
interaction1 = client.interactions.create(
model="gemini-3-flash-preview", input="Hi, my name is Phil."
)
print(interaction1.steps[-1].content[0].text)
# Second turn (passing previous_interaction_id)
interaction2 = client.interactions.create(
model="gemini-3-flash-preview",
previous_interaction_id=interaction1.id,
input="What is my name?",
)
print(interaction2.steps[-1].content[0].text)
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
// First turn
let interaction = await client.interactions.create({
model: 'gemini-3-flash-preview',
input: 'Hi, my name is Phil.'
});
console.log(interaction.steps.at(-1).content[0].text);
// Second turn
interaction = await client.interactions.create({
model: 'gemini-3-flash-preview',
previous_interaction_id: interaction.id,
input: 'What is my name?'
});
console.log(interaction.steps.at(-1).content[0].text);
REST
# First Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3-flash-preview",
"input": "Hi, my name is Phil."
}'
# Second Request (using ID from first response)
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3-flash-preview",
"previous_interaction_id": "int_123",
"input": "What is my name?"
}'
# Response to Second Request
{
"id": "int_123",
"steps": [
{
"type": "user_input",
"status": "done",
"content": [{ "type": "text", "text": "Hi, my name is Phil." }]
},
{
"type": "model_output",
"status": "done",
"content": [{ "type": "text", "text": "Hello Phil! How can I help you today?" }]
},
{
"type": "user_input",
"status": "done",
"content": [{ "type": "text", "text": "What is my name?" }]
},
{
"type": "model_output",
"status": "done",
"content": [{ "type": "text", "text": "Your name is Phil." }]
}
]
}
Input multimodali
Entrambe le API supportano input multimodali (testo, immagini, video e così via).
Prima (generateContent)
In generateContent, passi un elenco di parts all'interno dell'array contents. La risposta restituisce l'output nelle parts del primo candidato.
Python
from google import genai
from google.genai import types
client = genai.Client()
with open("sample.jpg", "rb") as f:
image_bytes = f.read()
response = client.models.generate_content(
model="gemini-2.5-flash",
contents=[
types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg"),
"Describe this image.",
],
)
print(response.text)
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{
"parts": [
{
"inlineData": {
"mimeType": "image/jpeg",
"data": "..."
}
},
{
"text": "Describe this image."
}
]
}]
}'
# Response
{
"candidates": [
{
"content": {
"parts": [
{
"text": "This is a picture of a beautiful sunset."
}
],
"role": "model"
}
}
]
}
Dopo (API Interactions)
Nell'API Interactions, passi un array al campo input. Recupera i contenuti di output trovando il passaggio model_output nella sequenza temporale.
Python
from google import genai
client = genai.Client()
# Assuming you have an image file
with open("sample.jpg", "rb") as f:
image_bytes = f.read()
interaction = client.interactions.create(
model="gemini-3-flash-preview",
input=[
{
"type": "image",
"mime_type": "image/jpeg",
"data": image_bytes,
},
{"type": "text", "text": "Describe this image."},
],
)
for step in interaction.steps:
if step.type == "model_output":
print(step.content[0].text)
JavaScript
import { GoogleGenAI } from '@google/genai';
import * as fs from 'fs';
const client = new GoogleGenAI({});
const imageBytes = fs.readFileSync('sample.jpg').toString('base64');
const interaction = await client.interactions.create({
model: 'gemini-3-flash-preview',
input: [
{
type: 'image',
mime_type: 'image/jpeg',
data: imageBytes
},
{
type: 'text',
text: 'Describe this image.'
}
]
});
for (const step of interaction.steps) {
if (step.type === 'model_output') {
console.log(step.content[0].text);
}
}
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3-flash-preview",
"input": [
{
"type": "image",
"mime_type": "image/jpeg",
"data": "..."
},
{
"type": "text",
"text": "Describe this image."
}
]
}'
# Response
{
"id": "int_multimodal",
"steps": [
{
"type": "user_input",
"status": "done",
"content": [
{
"type": "image",
"mime_type": "image/jpeg",
"data": "..."
},
{
"type": "text",
"text": "Describe this image."
}
]
},
{
"type": "model_output",
"status": "done",
"content": [
{
"type": "text",
"text": "This is a picture of a beautiful sunset over the mountains."
}
]
}
]
}
Output strutturato
Per fare in modo che il modello restituisca JSON corrispondente a uno schema specifico, configura il formato della risposta.
Prima (generateContent)
In generateContent, configuri il formato di output utilizzando il campo response_format nidificato all'interno dell'oggetto generationConfig.
Python
from google import genai
from google.genai import types
from pydantic import BaseModel
client = genai.Client()
class Recipe(BaseModel):
recipe_name: str
ingredients: list[str]
response = client.models.generate_content(
model="gemini-2.5-flash",
contents="Give me a recipe for chocolate chip cookies.",
config=types.GenerateContentConfig(
response_format=[
{
"type": "text",
"mime_type": "application/json",
"schema": Recipe,
}
]
),
)
print(response.text)
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{
"parts": [{
"text": "Give me a recipe for chocolate chip cookies."
}]
}],
"generationConfig": {
"responseFormat": [
{
"type": "text",
"mimeType": "application/json",
"schema": {
"type": "OBJECT",
"properties": {
"recipe_name": { "type": "STRING" },
"ingredients": {
"type": "ARRAY",
"items": { "type": "STRING" }
}
},
"required": ["recipe_name", "ingredients"]
}
}
]
}
}'
# Response
{
"candidates": [
{
"content": {
"parts": [
{
"text": "{\n \"recipe_name\": \"Chocolate Chip Cookies\",\n \"ingredients\": [\n \"1 cup butter\",\n \"1 cup sugar\",\n \"2 cups flour\",\n \"1 cup chocolate chips\"\n ]\n}"
}
],
"role": "model"
}
}
]
}
Dopo (API Interactions)
Nell'API Interactions, i controlli del formato di output vengono spostati in un array response_format di primo livello.
Python
from google import genai
from pydantic import BaseModel
client = genai.Client()
class Recipe(BaseModel):
recipe_name: str
ingredients: list[str]
interaction = client.interactions.create(
model="gemini-3-flash-preview",
input="Give me a recipe for chocolate chip cookies.",
response_format=[
{
"type": "text",
"mime_type": "application/json",
"schema": Recipe,
}
],
)
for step in interaction.steps:
if step.type == "model_output":
print(step.content[0].text)
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const interaction = await client.interactions.create({
model: 'gemini-3-flash-preview',
input: 'Give me a recipe for chocolate chip cookies.',
response_format: [
{
type: 'text',
mime_type: 'application/json',
schema: {
type: 'object',
properties: {
recipe_name: { type: 'string' },
ingredients: {
type: 'array',
items: { type: 'string' }
}
},
required: ['recipe_name', 'ingredients']
}
}
]
});
for (const step of interaction.steps) {
if (step.type === 'model_output') {
console.log(step.content[0].text);
}
}
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3-flash-preview",
"input": "Give me a recipe for chocolate chip cookies.",
"response_format": [
{
"type": "text",
"mime_type": "application/json",
"schema": {
"type": "OBJECT",
"properties": {
"recipe_name": { "type": "STRING" },
"ingredients": {
"type": "ARRAY",
"items": { "type": "STRING" }
}
},
"required": ["recipe_name", "ingredients"]
}
}
]
}'
# Response
{
"id": "int_structured",
"steps": [
{
"type": "user_input",
"status": "done",
"content": [{ "type": "text", "text": "Give me a recipe for chocolate chip cookies." }]
},
{
"type": "model_output",
"status": "done",
"content": [
{
"type": "text",
"text": "{\n \"recipe_name\": \"Chocolate Chip Cookies\",\n \"ingredients\": [\n \"1 cup butter\",\n \"1 cup sugar\",\n \"2 cups flour\",\n \"1 cup chocolate chips\"\n ]\n}"
}
]
}
]
}
Generazione multimodale
Quando generi contenuti in modalità diverse dal testo (ad esempio immagini o audio), la differenza principale è il modo in cui la risposta struttura i contenuti multimediali generati.
Prima (generateContent)
In generateContent, la risposta restituisce i contenuti multimediali generati direttamente nelle parts del candidato, in genere come dati base64 in inlineData.
# Response structure concept
{
"candidates": [
{
"content": {
"parts": [
{
"text": "Here is your generated image:"
},
{
"inlineData": {
"mimeType": "image/jpeg",
"data": "...base64..."
}
}
]
}
}
]
}
Dopo (API Interactions)
Nell'API Interactions, i contenuti multimediali generati vengono visualizzati come elementi distinti all'interno dell'array content di un passaggio model_output nella sequenza temporale, mantenendo il flusso cronologico dell'interazione.
# Response structure concept
{
"id": "int_123",
"steps": [
{
"type": "model_output",
"status": "done",
"content": [
{
"type": "text",
"text": "Here is your generated image:"
},
{
"type": "image",
"mime_type": "image/jpeg",
"data": "...base64..." // Or a reference URL in future
}
]
}
]
}
In questo modo, l'analisi della risposta è coerente con la gestione degli input e degli output di testo: ogni elemento è un passaggio nella sequenza temporale.
Strumenti lato server
Gemini supporta strumenti lato server integrati come la fondatezza della Ricerca Google. La differenza principale è il modo in cui la risposta rappresenta l'esecuzione dello strumento.
Prima (generateContent)
In generateContent, gli strumenti lato server sono in gran parte opachi. Abiliti lo strumento e ottieni una risposta finale con un oggetto groundingMetadata separato. È fondamentale notare che le citazioni non sono in linea; groundingSupports utilizza gli indici dei caratteri per rimappare i segmenti di testo alle fonti web in groundingChunks.
Python
from google import genai
from google.genai import types
client = genai.Client()
response = client.models.generate_content(
model="gemini-2.5-flash",
contents="Who won Euro 2024?",
config=types.GenerateContentConfig(
tools=[{"google_search": {}}]
),
)
# Access search entry point (widget) and citations
metadata = response.candidates[0].grounding_metadata
if metadata.search_entry_point:
print(f"Search Entry Point: {metadata.search_entry_point.rendered_content}")
for support in metadata.grounding_supports:
print(f"Citation: {support.segment.text}")
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const response = await client.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Who won Euro 2024?',
config: {
tools: [{ google_search: {} }]
}
});
const metadata = response.candidates[0].groundingMetadata;
if (metadata.searchEntryPoint) {
console.log(`Search Entry Point: ${metadata.searchEntryPoint.renderedContent}`);
}
for (const support of metadata.groundingSupports) {
console.log(`Citation: ${support.segment.text}`);
}
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{
"parts": [{
"text": "Who won Euro 2024?"
}]
}],
"tools": [{
"googleSearchRetrieval": {}
}]
}'
# Response
{
"candidates": [
{
"content": {
"parts": [
{
"text": "Spain won Euro 2024, defeating England 2-1 in the final. This victory marks Spain's record fourth European Championship title."
}
],
"role": "model"
},
"groundingMetadata": {
"webSearchQueries": [
"UEFA Euro 2024 winner",
"who won euro 2024"
],
"searchEntryPoint": {
"renderedContent": "<!-- HTML and CSS for the search widget -->"
},
"groundingChunks": [
{"web": {"uri": "https://vertexaisearch.cloud.google.com.....", "title": "aljazeera.com"}},
{"web": {"uri": "https://vertexaisearch.cloud.google.com.....", "title": "uefa.com"}}
],
"groundingSupports": [
{
"segment": {"startIndex": 0, "endIndex": 85, "text": "Spain won Euro 2024, defeatin..."},
"groundingChunkIndices": [0]
},
{
"segment": {"startIndex": 86, "endIndex": 210, "text": "This victory marks Spain's..."},
"groundingChunkIndices": [0, 1]
}
]
}
}
]
}
Dopo (API Interactions)
Nell'API Interactions, gli strumenti lato server forniscono una trasparenza completa della sequenza temporale. L'API registra la chiamata e il risultato come steps di esecuzione distinti (google_search_call e google_search_result), esponendo esattamente i dati recuperati dal modello.
Inoltre, l'API restituisce le citazioni in linea. Anziché mappare gli indici da un oggetto di metadati separato, l'elemento di testo all'interno del passaggio model_output contiene il proprio array annotations che rimanda direttamente alla fonte.
Python
from google import genai
client = genai.Client()
interaction = client.interactions.create(
model="gemini-3-flash-preview",
input="Who won Euro 2024?",
tools=[{"type": "google_search"}],
)
for step in interaction.steps:
if step.type == "google_search_result":
print(f"Search Suggestions: {step.search_suggestions}")
elif step.type == "model_output":
print(f"Answer: {step.content[0].text}")
if step.content[0].annotations:
for anno in step.content[0].annotations:
print(f"Citation: {anno.title} ({anno.uri})")
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const interaction = await client.interactions.create({
model: 'gemini-3-flash-preview',
input: 'Who won Euro 2024?',
tools: [{ type: 'google_search' }]
});
for (const step of interaction.steps) {
if (step.type === 'google_search_result') {
console.log(`Search Suggestions: ${step.search_suggestions}`);
} else if (step.type === 'model_output') {
console.log(`Answer: ${step.content[0].text}`);
if (step.content[0].annotations) {
for (const anno of step.content[0].annotations) {
console.log(`Citation: ${anno.title} (${anno.uri})`);
}
}
}
}
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3-flash-preview",
"input": "Who won Euro 2024?",
"tools": [{"type": "google_search"}]
}'
# Response (showing grounding)
{
"id": "int_grounded",
"steps": [
{
"type": "user_input",
"status": "done",
"content": [{ "type": "text", "text": "Who won Euro 2024?" }]
},
{
"type": "google_search_call",
"status": "done",
"content": [{ "type": "text", "text": "UEFA Euro 2024 winner" }]
},
{
"type": "google_search_result",
"status": "done",
"content": [
{
"type": "text",
"text": "Spain won Euro 2024..."
}
]
},
{
"type": "model_output",
"status": "done",
"content": [
{
"type": "text",
"text": "Spain won Euro 2024, defeating England 2-1.",
"annotations": [
{
"start_index": 0,
"end_index": 42,
"uri": "https://vertexaisearch...",
"title": "aljazeera.com"
}
]
}
]
}
]
}
Chiamata di funzione
Anche la struttura delle chiamate di funzione e dei risultati è stata modificata per adattarsi allo schema Steps.
Prima (generateContent)
In generateContent, la risposta restituisce le chiamate di funzione all'interno dei candidati.
Python
from google import genai
from google.genai import types
client = genai.Client()
# Step 1: Send prompt with tools
response = client.models.generate_content(
model="gemini-2.5-flash",
contents="What's the weather in Boston?",
config=types.GenerateContentConfig(tools=[weather_tool]),
)
# Assume model returned function_call
function_call = response.candidates[0].content.parts[0].function_call
print(f"Requested tool: {function_call.name}")
# Step 2: Execute local function and send result back
result = "52°F and rain"
response = client.models.generate_content(
model="gemini-2.5-flash",
contents=[
types.Content(
role="user",
parts=[
types.Part.from_text(text="What's the weather in Boston?")
],
),
response.candidates[0].content, # Model turn with function call
types.Content(
role="user",
parts=[
types.Part.from_function_response(
name=function_call.name,
response={"result": result},
)
],
),
],
config=types.GenerateContentConfig(tools=[weather_tool]),
)
print(response.text)
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
// Step 1: Send prompt with tools
let response = await client.models.generateContent({
model: 'gemini-2.5-flash',
contents: "What's the weather in Boston?",
config: { tools: [weatherTool] }
});
const functionCall = response.candidates[0].content.parts[0].functionCall;
console.log(`Requested tool: ${functionCall.name}`);
// Step 2: Execute local function and send result back
const result = "52°F and rain";
response = await client.models.generateContent({
model: 'gemini-2.5-flash',
contents: [
{ role: 'user', parts: [{ text: "What's the weather in Boston?" }] },
response.candidates[0].content, // Model turn
{
role: 'user',
parts: [{
functionResponse: {
name: functionCall.name,
response: { result: result }
}
}]
}
],
config: { tools: [weatherTool] }
});
console.log(response.text);
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{
"parts": [{
"text": "What is the weather like in Boston, MA?"
}]
}],
"tools": [{
"functionDeclarations": [{
"name": "get_weather",
"description": "Get the current weather",
"parameters": {
"type": "OBJECT",
"properties": {
"location": {"type": "STRING"}
},
"required": ["location"]
}
}]
}]
}'
# Response
{
"candidates": [
{
"content": {
"parts": [
{
"functionCall": {
"name": "get_weather",
"args": { "location": "Boston, MA" }
}
}
],
"role": "model"
},
"finishReason": "STOP",
"index": 0
}
]
}
Dopo (API Interactions)
Le chiamate e i risultati degli strumenti sono ora passaggi distinti nella sequenza temporale.
Python
from google import genai
client = genai.Client()
weather_tool = {
"type": "function",
"name": "get_weather",
"description": "Gets weather",
"parameters": {
"type": "object",
"properties": {
"location": {"type": "string"}
},
},
}
interaction = client.interactions.create(
model="gemini-3-flash-preview",
input="What's the weather in Boston?",
tools=[weather_tool],
)
# Check if the model requested a tool call
for step in interaction.steps:
if step.type == "function_call":
print(f"Executing {step.name} for {step.arguments}")
# Execute your local function here...
result = "52°F and rain"
# Submit the result back as a step
interaction = client.interactions.create(
model="gemini-3-flash-preview",
previous_interaction_id=interaction.id,
input=[
{
"type": "function_result",
"call_id": step.id,
"name": step.name,
"result": [{"type": "text", "text": result}],
}
],
)
# Inspect steps for final response
for s in interaction.steps:
if s.type == "model_output":
print(s.content[0].text)
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const weatherTool = {
type: "function",
name: "get_weather",
description: "Get weather for a location",
parameters: {
type: "object",
properties: {
location: { type: "string" }
},
required: ["location"]
}
};
const interaction = await client.interactions.create({
model: 'gemini-3-flash-preview',
input: "What's the weather in Boston?",
tools: [weatherTool]
});
// Check if the model requested a tool call
for (const step of interaction.steps) {
if (step.type === 'function_call') {
console.log(`Executing ${step.name} for ${JSON.stringify(step.arguments)}`);
const result = "52°F and rain";
// Submit the result back as a step
const nextInteraction = await client.interactions.create({
model: 'gemini-3-flash-preview',
previous_interaction_id: interaction.id,
input: [
{
type: 'function_result',
call_id: step.id,
name: step.name,
result: [{ type: 'text', text: result }]
}
]
});
// Inspect steps for final response
for (const s of nextInteraction.steps) {
if (s.type === 'model_output') {
console.log(s.content[0].text);
}
}
}
}
REST
# Initial Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3-flash-preview",
"input": "What's the weather in Boston?",
"tools": [{
"type": "function",
"name": "get_weather",
"description": "Get weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": { "type": "string" }
},
"required": ["location"]
}
}]
}'
# Response (requires action)
{
"id": "int_001",
"status": "requires_action",
"steps": [
{
"type": "user_input",
"status": "done",
"content": [
{ "type": "text", "text": "What's the weather in Boston?" }
]
},
{
"type": "function_call",
"status": "waiting",
"id": "fc_1",
"name": "get_weather",
"arguments": { "location": "Boston, MA" }
}
]
}
# Submit Tool Result Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3-flash-preview",
"previous_interaction_id": "int_001",
"input": {
"type": "function_result",
"call_id": "fc_1",
"name": "get_weather",
"result": [
{ "type": "text", "text": "52°F with rain" }
]
}
}'
# Final Response
{
"id": "int_002",
"status": "completed",
"steps": [
{
"type": "function_result",
"call_id": "fc_1",
"name": "get_weather",
"result": [
{ "type": "text", "text": "52°F with rain" }
]
},
{
"type": "model_output",
"status": "done",
"content": [
{ "type": "text", "text": "It's 52°F with rain in Boston." }
]
}
]
}
Streaming
Una differenza fondamentale nello streaming è che l'API Interactions utilizza lo stesso endpoint con "stream": true nel corpo della richiesta, mentre l'API generateContent richiedeva la chiamata di un endpoint dedicato (:streamGenerateContent).
Inoltre, gli eventi di streaming ora utilizzano tipi specializzati per monitorare il ciclo di vita dell'interazione e monitorare i passaggi di esecuzione lungo la sequenza temporale.
Prima (generateContentStream)
Con generateContent, utilizzi uno stream di blocchi di risposta.
Python
response = client.models.generate_content_stream(
model="gemini-2.5-flash", contents="Tell me a story"
)
for chunk in response:
print(chunk.text, end="")
JavaScript
const responseStream = await client.models.generateContentStream({
model: 'gemini-2.5-flash',
contents: 'Tell me a story',
});
for await (const chunk of responseStream) {
process.stdout.write(chunk.text);
}
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:streamGenerateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{
"parts": [{
"text": "Tell me a story"
}]
}]
}'
# Response stream
event: content.start
data: {"event_type": "content.start", "index": 0, "content": {"type": "thought"}}
event: content.delta
data: {"event_type": "content.delta", "index": 0, "delta": {"type": "thought_summary", "text": "User wants an explanation."}}
event: content.stop
data: {"event_type": "content.stop", "index": 0}
event: content.start
data: {"event_type": "content.start", "index": 1, "content": {"type": "text"}}
event: content.delta
data: {"event_type": "content.delta", "index": 1, "delta": {"type": "text", "text": "Hello"}}
event: content.stop
data: {"event_type": "content.stop", "index": 1}
Dopo (API Interactions)
Nell'API Interactions, lo streaming utilizza gli eventi inviati dal server (SSE) e tipi delta specializzati per rappresentare i passaggi di esecuzione man mano che si verificano.
Python
from google import genai
client = genai.Client()
stream = client.interactions.create(
model="gemini-3-flash-preview",
input="Tell me a story",
stream=True,
)
for event in stream:
if event.event_type == "step.delta":
if event.delta.type == "text":
print(event.delta.text, end="", flush=True)
elif event.event_type == "interaction.complete":
print(f"\n\n--- Stream Finished ---")
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const stream = await client.interactions.create({
model: 'gemini-3-flash-preview',
input: 'Tell me a story',
stream: true,
});
for await (const event of stream) {
if (event.event_type === 'step.delta') {
if (event.delta.type === 'text' && 'text' in event.delta) {
process.stdout.write(event.delta.text);
}
} else if (event.event_type === 'interaction.complete') {
console.log('\n\n--- Stream Finished ---');
}
}
REST
# Example SSE stream output event: interaction.created data: {"type": "interaction.created", "interaction": {"id": "int_xyz", "status": "created"}} event: interaction.status_update data: {"type": "interaction.status_update", "status": "in_progress"}} event: step.start data: {"type": "step.start", "index": 0, "step": {"type": "thought"}} event: step.delta data: {"type": "step.delta", "index": 0, "delta": {"type": "thought", "text": "User wants an explanation."}} event: step.stop data: {"type": "step.stop", "index": 0, "status": "done"}} event: step.start data: {"type": "step.start", "index": 1, "step": {"type": "model_output"}} event: step.delta data: {"type": "step.delta", "index": 1, "delta": {"type": "text", "text": "Hello"}} event: step.stop data: {"type": "step.stop", "index": 1, "status": "done"}} event: interaction.complete data: {"type": "interaction.complete", "interaction": {"id": "int_xyz", "status": "completed", "usage": {"prompt_tokens": 10, "completion_tokens": 5, "total_tokens": 15}}} ```
Strumenti di streaming e chiamate di funzione
Il comportamento degli strumenti nello stream è cambiato in modo significativo da generateContent per fornire un controllo e una visibilità più granulari.
Prima (generateContent)
Con generateContent, le chiamate di funzione di streaming sono arrivate complete in un unico blocco. Non era possibile visualizzare gli argomenti generati in tempo reale, quindi il gestore controllava semplicemente la presenza di un oggetto functionCall completo.
Python
from google import genai
from google.genai import types
client = genai.Client()
stream = client.models.generate_content_stream(
model="gemini-2.5-flash",
contents="What's the weather in Boston?",
config=types.GenerateContentConfig(tools=[weather_tool]),
)
for chunk in stream:
# Function calls arrived complete — no partial arguments
if chunk.candidates[0].content.parts[0].function_call:
fc = chunk.candidates[0].content.parts[0].function_call
print(f"Call: {fc.name}({fc.args})")
elif chunk.text:
print(chunk.text, end="")
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const stream = await client.models.generateContentStream({
model: 'gemini-2.5-flash',
contents: "What's the weather in Boston?",
config: { tools: [weatherTool] }
});
for await (const chunk of stream) {
// Function calls arrived complete — no partial arguments
const part = chunk.candidates[0].content.parts[0];
if (part.functionCall) {
console.log(`Call: ${part.functionCall.name}(${JSON.stringify(part.functionCall.args)})`);
} else if (part.text) {
process.stdout.write(part.text);
}
}
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:streamGenerateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{"parts": [{"text": "What'\''s the weather in Boston?"}]}],
"tools": [{"functionDeclarations": [{"name": "get_weather", "parameters": {"type": "OBJECT", "properties": {"location": {"type": "STRING"}}}}]}]
}'
# Response stream — function call arrives complete in one chunk
{"candidates": [{"content": {"parts": [{"functionCall": {"name": "get_weather", "args": {"location": "Boston, MA"}}}]}}]}
Dopo (API Interactions)
L'API Interactions trasmette in streaming gli argomenti della chiamata di funzione carattere per carattere come eventi arguments. L'intero ciclo di vita dello strumento (pensiero, chiamata, risultato e output) si svolge come una serie di passaggi distinti.
Python
from google import genai
client = genai.Client()
stream = client.interactions.create(
model="gemini-3-flash-preview",
input="What's the weather in Boston?",
tools=[get_weather_tool],
stream=True,
)
for event in stream:
if event.event_type == "step.start":
if event.step.type == "function_call":
print(f"Calling: {event.step.name}")
elif event.event_type == "step.delta":
if event.delta.type == "arguments":
print(f" args: {event.delta.partial_arguments}")
elif event.delta.type == "text":
print(event.delta.text, end="")
elif event.event_type == "interaction.complete":
print("\n--- Done ---")
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const stream = await client.interactions.create({
model: 'gemini-3-flash-preview',
input: "What's the weather in Boston?",
tools: [getWeatherTool],
stream: true,
});
for await (const event of stream) {
if (event.event_type === 'step.start') {
if (event.step.type === 'function_call') {
console.log(`Calling: ${event.step.name}`);
}
} else if (event.event_type === 'step.delta') {
if (event.delta.type === 'arguments') {
console.log(` args: ${event.delta.partial_arguments}`);
} else if (event.delta.type === 'text') {
process.stdout.write(event.delta.text);
}
} else if (event.event_type === 'interaction.complete') {
console.log('\n--- Done ---');
}
}
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3-flash-preview",
"input": "What'\''s the weather in Boston?",
"tools": [{"type": "function", "name": "get_weather", "parameters": {"type": "object", "properties": {"location": {"type": "string"}}}}],
"stream": true
}'
# Response stream
// Interaction created
event: interaction.created
data: {"type": "interaction.created", "interaction": {"id": "int_xyz", "status": "created"}}
event: interaction.status_update
data: {"type": "interaction.status_update", "status": "in_progress"}
// ── Step 0: Thought ──────────────────────────────────
event: step.start
data: {"type": "step.start", "index": 0, "step": {"type": "thought"}}
event: step.delta
data: {"type": "step.delta", "index": 0, "delta": {"type": "thought", "text": "The user wants weather data for Boston. I'll call the get_weather tool."}}
event: step.stop
data: {"type": "step.stop", "index": 0, "status": "done"}
// ── Step 1: Function Call (arguments streamed) ───────
event: step.start
data: {"type": "step.start", "index": 1, "step": {"type": "function_call", "id": "fc_1", "name": "get_weather"}}
event: step.delta
data: {"type": "step.delta", "index": 1, "delta": {"type": "arguments", "partial_arguments": "{\"location\": \"Boston, MA\"}"}}
event: step.stop
data: {"type": "step.stop", "index": 1, "status": "waiting"}
// The interaction pauses — the model needs the tool result before continuing.
event: interaction.status_update
data: {"type": "interaction.status_update", "status": "requires_action"}
// ── (Client submits the tool result) ──────────────────
// The client calls interactions.create with the function_result as input
// and the previous interaction's ID, then resumes consuming the stream.
event: interaction.status_update
data: {"type": "interaction.status_update", "status": "in_progress"}
// ── Step 2: Function Result (echoed back, no deltas) ─
event: step.start
data: {"type": "step.start", "index": 2, "step": {"type": "function_result", "call_id": "fc_1", "name": "get_weather", "result": [{"type": "text", "text": "52°F, rain"}]}}
event: step.stop
data: {"type": "step.stop", "index": 2, "status": "done"}
// ── Step 3: Thought ──────────────────────────────────
event: step.start
data: {"type": "step.start", "index": 3, "step": {"type": "thought"}}
event: step.delta
data: {"type": "step.delta", "index": 3, "delta": {"type": "thought", "text": "Got weather data. Composing the final response."}}
event: step.stop
data: {"type": "step.stop", "index": 3, "status": "done"}
// ── Step 4: Model Output (text streamed) ─────────────
event: step.start
data: {"type": "step.start", "index": 4, "step": {"type": "model_output"}}
event: step.delta
data: {"type": "step.delta", "index": 4, "delta": {"type": "text", "text": "It's currently 52°F and rainy in Boston."}}
event: step.stop
data: {"type": "step.stop", "index": 4, "status": "done"}
// ── Interaction complete ─────────────────────────────
event: interaction.complete
data: {"type": "interaction.complete", "interaction": {"id": "int_xyz", "status": "completed", "usage": {"prompt_tokens": 256, "completion_tokens": 128, "total_tokens": 384}}}
Strumenti lato server nello stream
Gli strumenti lato server come la Ricerca Google si comportano in modo diverso dalle chiamate di funzione nello stream. La chiamata e il risultato arrivano completi nell'evento step.start senza delta, solo step.start seguito immediatamente da step.stop:
// Server-side tool call — payload arrives complete in step.start
event: step.start
data: {"type": "step.start", "index": 4, "step": {"type": "google_search_call", "id": "gs_2", "query": "Alphabet Q4 2025 earnings"}}
event: step.stop
data: {"type": "step.stop", "index": 4, "status": "done"}
// Server-side tool result — also complete in step.start
event: step.start
data: {"type": "step.start", "index": 5, "step": {"type": "google_search_result", "call_id": "gs_2", "rendered_content": "<div>Alphabet Q4 2025 Revenue: $105.6B</div>", "signature": "abc123..."}}
event: step.stop
data: {"type": "step.stop", "index": 5, "status": "done"}