Questa guida ti aiuta a eseguire la migrazione dall'API generateContent all'API Interactions.
L'API Interactions è l'interfaccia standard per la creazione con Gemini. È ottimizzato per i workflow agentici, la gestione dello stato lato server e le conversazioni multimodali e multi-turn complesse, pur supportando completamente le semplici richieste stateless a un solo turno. Sebbene generateContent rimanga completamente supportata, consigliamo l'API Interactions per tutti i nuovi sviluppi.
Perché eseguire la migrazione?
L'API Interactions offre un modo più strutturato e potente per creare con Gemini:
- Gestione della cronologia lato server: flussi multi-turn semplificati tramite
previous_interaction_id. Il server abilita lo stato per impostazione predefinita (store=true), ma puoi attivare il comportamento senza stato impostandostore=false. - Passaggi di esecuzione osservabili: i passaggi digitati semplificano il debug di flussi complessi e il rendering dell'interfaccia utente per eventi intermedi (come pensieri o widget di ricerca).
- Creato per i workflow agentici: supporto nativo per l'utilizzo di strumenti multi-step, l'orchestrazione e flussi di ragionamento complessi tramite passaggi di esecuzione digitati.
- Attività in background e a lunga esecuzione: supporta l'offload di operazioni che richiedono molto tempo, come Deep Think e Deep Research, a processi in background utilizzando
background=true.
Input/output di base
Questa sezione mostra come eseguire la migrazione di una semplice richiesta di generazione di testo.
Prima di (generateContent)
L'API generateContent è stateless e restituisce direttamente la risposta. La struttura della risposta racchiude l'output in un elenco di candidates, ognuno contenente content con un elenco di parts da analizzare.
Python
from google import genai
client = genai.Client()
response = client.models.generate_content(
model="gemini-2.5-flash", contents="Tell me a joke."
)
print(response.text)
JavaScript
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({});
const response = await ai.models.generateContent({
model: "gemini-2.5-flash",
contents: "Tell me a joke.",
});
console.log(response.text);
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{
"parts": [{
"text": "Tell me a joke."
}]
}]
}'
# Response
{
"candidates": [
{
"content": {
"parts": [
{
"text": "Why did the chicken cross the road? To get to the other side!"
}
],
"role": "model"
},
"finishReason": "STOP",
"index": 0
}
],
"usageMetadata": {
"promptTokenCount": 4,
"candidatesTokenCount": 12,
"totalTokenCount": 16
}
}
L'API Interactions restituisce una risorsa di interazione archiviata con una cronologia steps. Sebbene sia possibile esaminare manualmente l'array steps per trovare gli eventi intermedi, gli SDK Google GenAI forniscono proprietà di convenienza direttamente sull'oggetto Interaction restituito per accedere all'output finale.
La proprietà di convenienza più comune è .output_text (stringa), che estrae e unisce automaticamente i blocchi TextContent consecutivi alla fine della risposta del modello. Sebbene questa soluzione funzioni perfettamente per le risposte semplici,
non include i blocchi di testo precedenti separati da contenuti non testuali (come
pensieri, immagini, audio o chiamate di strumenti). Per le risposte multimodali complesse o intercalate, devi eseguire l'iterazione manualmente su steps.
Python
from google import genai
client = genai.Client()
interaction = client.interactions.create(
model="gemini-3.5-flash", input="Tell me a joke."
)
print(interaction.output_text)
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
let interaction = await client.interactions.create({
model: 'gemini-3.5-flash',
input: 'Tell me a joke.'
});
console.log(interaction.output_text);
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3.5-flash",
"input": "Tell me a joke."
}'
# Response
{
"id": "int_123",
"status": "completed",
"steps": [
{
"type": "user_input",
"status": "done",
"content": [
{
"type": "text",
"text": "Tell me a joke."
}
]
},
{
"type": "model_output",
"status": "done",
"content": [
{
"type": "text",
"text": "Why did the chicken cross the road?"
}
]
}
]
}
Conversazioni a più turni
L'API Interactions archivia le interazioni per impostazione predefinita, consentendo la gestione dello stato lato server per le conversazioni multi-turno.
Prima di (generateContent)
In generateContent, devi gestire manualmente la cronologia delle conversazioni utilizzando l'array contents o un helper di chat lato client.
Python
Utilizzare l'assistente della chat (opzione consigliata)
from google import genai
client = genai.Client()
chat = client.chats.create(model="gemini-2.5-flash")
response1 = chat.send_message("Hi, my name is Phil.")
print(response1.text)
response2 = chat.send_message("What is my name?")
print(response2.text)
Gestione manuale della cronologia
from google import genai
from google.genai import types
client = genai.Client()
response = client.models.generate_content(
model="gemini-2.5-flash",
contents=[
types.Content(
role="user", parts=[types.Part.from_text("Hi, my name is Phil.")]
),
types.Content(
role="model",
parts=[types.Part.from_text("Hi Phil, how can I help you?")],
),
types.Content(
role="user", parts=[types.Part.from_text("What is my name?")]
),
],
)
print(response.text)
JavaScript
Utilizzare l'assistente della chat (opzione consigliata)
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const chat = client.chats.create({ model: 'gemini-2.5-flash' });
let response = await chat.sendMessage({ message: 'Hi, my name is Phil.' });
console.log(response.text);
response = await chat.sendMessage({ message: 'What is my name?' });
console.log(response.text);
Gestione manuale della cronologia
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const response = await client.models.generateContent({
model: 'gemini-2.5-flash',
contents: [
{ role: 'user', parts: [{ text: 'Hi, my name is Phil.' }] },
{ role: 'model', parts: [{ text: 'Hi Phil, how can I help you?' }] },
{ role: 'user', parts: [{ text: 'What is my name?' }] }
]
});
console.log(response.text);
REST
# Request (the second turn requires sending the entire history)
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [
{"role": "user", "parts": [{"text": "Hi, my name is Phil."}]},
{"role": "model", "parts": [{"text": "Hi Phil, how can I help you?"}]},
{"role": "user", "parts": [{"text": "What is my name?"}]}
]
}'
# Response
{
"candidates": [
{
"content": {
"parts": [
{
"text": "Your name is Phil."
}
],
"role": "model"
},
"finishReason": "STOP",
"index": 0
}
]
}
After (API Interactions)
L'API Interactions gestisce lo stato sul server. Continui una conversazione facendo riferimento a previous_interaction_id.
Python
from google import genai
client = genai.Client()
interaction1 = client.interactions.create(
model="gemini-3.5-flash", input="Hi, my name is Phil."
)
print("Response 1:", interaction1.output_text)
interaction2 = client.interactions.create(
model="gemini-3.5-flash",
previous_interaction_id=interaction1.id,
input="What is my name?",
)
print("Response 2:", interaction2.output_text)
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
let interaction = await client.interactions.create({
model: 'gemini-3.5-flash',
input: 'Hi, my name is Phil.'
});
console.log("Response 1:", interaction.output_text);
interaction = await client.interactions.create({
model: 'gemini-3.5-flash',
previous_interaction_id: interaction.id,
input: 'What is my name?'
});
console.log("Response 2:", interaction.output_text);
REST
# First Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3.5-flash",
"input": "Hi, my name is Phil."
}'
# Second Request (using ID from first response)
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3.5-flash",
"previous_interaction_id": "int_123",
"input": "What is my name?"
}'
# Response to Second Request
{
"id": "int_123",
"steps": [
{
"type": "user_input",
"status": "done",
"content": [{ "type": "text", "text": "Hi, my name is Phil." }]
},
{
"type": "model_output",
"status": "done",
"content": [{ "type": "text", "text": "Hello Phil! How can I help you today?" }]
},
{
"type": "user_input",
"status": "done",
"content": [{ "type": "text", "text": "What is my name?" }]
},
{
"type": "model_output",
"status": "done",
"content": [{ "type": "text", "text": "Your name is Phil." }]
}
]
}
Input multimodali
Entrambe le API supportano input multimodali (testo, immagini, video e così via).
Prima di (generateContent)
In generateContent, trasmetti un elenco di parts all'interno dell'array contents. La risposta restituisce l'output in parts del primo candidato.
Python
from google import genai
from google.genai import types
client = genai.Client()
with open("sample.jpg", "rb") as f:
image_bytes = f.read()
response = client.models.generate_content(
model="gemini-2.5-flash",
contents=[
types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg"),
"Describe this image.",
],
)
print(response.text)
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{
"parts": [
{
"inlineData": {
"mimeType": "image/jpeg",
"data": "..."
}
},
{
"text": "Describe this image."
}
]
}]
}'
# Response
{
"candidates": [
{
"content": {
"parts": [
{
"text": "This is a picture of a beautiful sunset."
}
],
"role": "model"
}
}
]
}
After (API Interactions)
Nell'API Interactions, passi un array al campo input. Per recuperare i contenuti di output, trova il passaggio model_output nella cronologia.
Python
from google import genai
client = genai.Client()
with open("sample.jpg", "rb") as f:
image_bytes = f.read()
interaction = client.interactions.create(
model="gemini-3.5-flash",
input=[
{
"type": "image",
"mime_type": "image/jpeg",
"data": image_bytes,
},
{"type": "text", "text": "Describe this image."},
],
)
print(interaction.output_text)
JavaScript
import { GoogleGenAI } from '@google/genai';
import * as fs from 'fs';
const client = new GoogleGenAI({});
const imageBytes = fs.readFileSync('sample.jpg').toString('base64');
const interaction = await client.interactions.create({
model: 'gemini-3.5-flash',
input: [
{
type: 'image',
mime_type: 'image/jpeg',
data: imageBytes
},
{
type: 'text',
text: 'Describe this image.'
}
]
});
console.log(interaction.output_text);
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3.5-flash",
"input": [
{
"type": "image",
"mime_type": "image/jpeg",
"data": "..."
},
{
"type": "text",
"text": "Describe this image."
}
]
}'
# Response
{
"id": "int_multimodal",
"steps": [
{
"type": "user_input",
"status": "done",
"content": [
{
"type": "image",
"mime_type": "image/jpeg",
"data": "..."
},
{
"type": "text",
"text": "Describe this image."
}
]
},
{
"type": "model_output",
"status": "done",
"content": [
{
"type": "text",
"text": "This is a picture of a beautiful sunset over the mountains."
}
]
}
]
}
Output strutturato
Per fare in modo che il modello restituisca un JSON corrispondente a uno schema specifico, configura il formato della risposta.
Prima di (generateContent)
In generateContent, configura il formato di output utilizzando il campo response_format nidificato all'interno dell'oggetto generationConfig.
Python
from google import genai
from google.genai import types
from pydantic import BaseModel
client = genai.Client()
class Recipe(BaseModel):
recipe_name: str
ingredients: list[str]
response = client.models.generate_content(
model="gemini-2.5-flash",
contents="Give me a recipe for chocolate chip cookies.",
config=types.GenerateContentConfig(
response_format=[
{
"type": "text",
"mime_type": "application/json",
"schema": Recipe,
}
]
),
)
print(response.text)
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{
"parts": [{
"text": "Give me a recipe for chocolate chip cookies."
}]
}],
"generationConfig": {
"responseFormat": [
{
"type": "text",
"mimeType": "application/json",
"schema": {
"type": "OBJECT",
"properties": {
"recipe_name": { "type": "STRING" },
"ingredients": {
"type": "ARRAY",
"items": { "type": "STRING" }
}
},
"required": ["recipe_name", "ingredients"]
}
}
]
}
}'
# Response
{
"candidates": [
{
"content": {
"parts": [
{
"text": "{\n \"recipe_name\": \"Chocolate Chip Cookies\",\n \"ingredients\": [\n \"1 cup butter\",\n \"1 cup sugar\",\n \"2 cups flour\",\n \"1 cup chocolate chips\"\n ]\n}"
}
],
"role": "model"
}
}
]
}
After (API Interactions)
Nell'API Interactions, i controlli del formato di output vengono spostati in un array response_format di primo livello.
Python
from google import genai
from pydantic import BaseModel
client = genai.Client()
class Recipe(BaseModel):
recipe_name: str
ingredients: list[str]
interaction = client.interactions.create(
model="gemini-3.5-flash",
input="Give me a recipe for chocolate chip cookies.",
response_format=[
{
"type": "text",
"mime_type": "application/json",
"schema": Recipe,
}
],
)
print(interaction.output_text)
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const interaction = await client.interactions.create({
model: 'gemini-3.5-flash',
input: 'Give me a recipe for chocolate chip cookies.',
response_format: [
{
type: 'text',
mime_type: 'application/json',
schema: {
type: 'object',
properties: {
recipe_name: { type: 'string' },
ingredients: {
type: 'array',
items: { type: 'string' }
}
},
required: ['recipe_name', 'ingredients']
}
}
]
});
console.log(interaction.output_text);
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3.5-flash",
"input": "Give me a recipe for chocolate chip cookies.",
"response_format": [
{
"type": "text",
"mime_type": "application/json",
"schema": {
"type": "OBJECT",
"properties": {
"recipe_name": { "type": "STRING" },
"ingredients": {
"type": "ARRAY",
"items": { "type": "STRING" }
}
},
"required": ["recipe_name", "ingredients"]
}
}
]
}'
# Response
{
"id": "int_structured",
"steps": [
{
"type": "user_input",
"status": "done",
"content": [{ "type": "text", "text": "Give me a recipe for chocolate chip cookies." }]
},
{
"type": "model_output",
"status": "done",
"content": [
{
"type": "text",
"text": "{\n \"recipe_name\": \"Chocolate Chip Cookies\",\n \"ingredients\": [\n \"1 cup butter\",\n \"1 cup sugar\",\n \"2 cups flour\",\n \"1 cup chocolate chips\"\n ]\n}"
}
]
}
]
}
Generazione multimodale
Quando si generano contenuti in modalità diverse dal testo (come immagini o audio), la differenza principale è il modo in cui la risposta struttura i contenuti multimediali generati.
Prima di (generateContent)
In generateContent, la risposta restituisce i contenuti multimediali generati direttamente in parts del candidato, in genere come dati base64 in inlineData.
# Response structure concept
{
"candidates": [
{
"content": {
"parts": [
{
"text": "Here is your generated image:"
},
{
"inlineData": {
"mimeType": "image/jpeg",
"data": "...base64..."
}
}
]
}
}
]
}
After (API Interactions)
Nell'API Interactions, i contenuti multimediali generati vengono visualizzati come elementi distinti all'interno dell'array content di un passaggio model_output nella sequenza temporale, mantenendo il flusso cronologico dell'interazione.
# Response structure concept
{
"id": "int_123",
"steps": [
{
"type": "model_output",
"status": "done",
"content": [
{
"type": "text",
"text": "Here is your generated image:"
},
{
"type": "image",
"mime_type": "image/jpeg",
"data": "...base64..." // Or a reference URL in future
}
]
}
]
}
In questo modo, l'analisi della risposta è coerente con la gestione degli input e degli output di testo: ogni elemento è un passaggio della sequenza temporale.
Strumenti lato server
Gemini supporta strumenti lato server integrati come la fondatezza della Ricerca Google. La differenza principale è il modo in cui la risposta rappresenta l'esecuzione dello strumento.
Prima di (generateContent)
In generateContent, gli strumenti lato server sono in gran parte opachi. Attivi lo strumento e ricevi una risposta finale con un oggetto groundingMetadata separato. Fondamentalmente, le citazioni non sono in linea; groundingSupports utilizzano gli indici dei caratteri per mappare i segmenti di testo alle fonti web in groundingChunks.
Python
from google import genai
from google.genai import types
client = genai.Client()
response = client.models.generate_content(
model="gemini-2.5-flash",
contents="Who won Euro 2024?",
config=types.GenerateContentConfig(
tools=[{"google_search": {}}]
),
)
metadata = response.candidates[0].grounding_metadata
if metadata.search_entry_point:
print(f"Search Entry Point: {metadata.search_entry_point.rendered_content}")
for support in metadata.grounding_supports:
print(f"Citation: {support.segment.text}")
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const response = await client.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Who won Euro 2024?',
config: {
tools: [{ google_search: {} }]
}
});
const metadata = response.candidates[0].groundingMetadata;
if (metadata.searchEntryPoint) {
console.log(`Search Entry Point: ${metadata.searchEntryPoint.renderedContent}`);
}
for (const support of metadata.groundingSupports) {
console.log(`Citation: ${support.segment.text}`);
}
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{
"parts": [{
"text": "Who won Euro 2024?"
}]
}],
"tools": [{
"googleSearchRetrieval": {}
}]
}'
# Response
{
"candidates": [
{
"content": {
"parts": [
{
"text": "Spain won Euro 2024, defeating England 2-1 in the final. This victory marks Spain's record fourth European Championship title."
}
],
"role": "model"
},
"groundingMetadata": {
"webSearchQueries": [
"UEFA Euro 2024 winner",
"who won euro 2024"
],
"searchEntryPoint": {
"renderedContent": "<!-- HTML and CSS for the search widget -->"
},
"groundingChunks": [
{"web": {"uri": "https://vertexaisearch.cloud.google.com.....", "title": "aljazeera.com"}},
{"web": {"uri": "https://vertexaisearch.cloud.google.com.....", "title": "uefa.com"}}
],
"groundingSupports": [
{
"segment": {"startIndex": 0, "endIndex": 85, "text": "Spain won Euro 2024, defeatin..."},
"groundingChunkIndices": [0]
},
{
"segment": {"startIndex": 86, "endIndex": 210, "text": "This victory marks Spain's..."},
"groundingChunkIndices": [0, 1]
}
]
}
}
]
}
After (API Interactions)
Nell'API Interactions, gli strumenti lato server forniscono una trasparenza completa della cronologia. L'API registra la chiamata e il risultato come esecuzioni distinte steps (google_search_call e google_search_result), mostrando esattamente i dati recuperati dal modello.
Inoltre, l'API restituisce le citazioni in linea. Anziché mappare gli indici da un oggetto di metadati separato, l'elemento di testo all'interno del passaggio model_output contiene il proprio array annotations che rimanda direttamente all'origine.
Python
from google import genai
client = genai.Client()
interaction = client.interactions.create(
model="gemini-3.5-flash",
input="Who won Euro 2024?",
tools=[{"type": "google_search"}],
)
for step in interaction.steps:
if step.type == "google_search_result":
print(f"Search Suggestions: {step.search_suggestions}")
elif step.type == "model_output":
print(f"Answer: {step.content[0].text}")
if step.content[0].annotations:
for anno in step.content[0].annotations:
print(f"Citation: {anno.title} ({anno.uri})")
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const interaction = await client.interactions.create({
model: 'gemini-3.5-flash',
input: 'Who won Euro 2024?',
tools: [{ type: 'google_search' }]
});
for (const step of interaction.steps) {
if (step.type === 'google_search_result') {
console.log(`Search Suggestions: ${step.search_suggestions}`);
} else if (step.type === 'model_output') {
console.log(`Answer: ${step.content[0].text}`);
if (step.content[0].annotations) {
for (const anno of step.content[0].annotations) {
console.log(`Citation: ${anno.title} (${anno.uri})`);
}
}
}
}
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3.5-flash",
"input": "Who won Euro 2024?",
"tools": [{"type": "google_search"}]
}'
# Response (showing grounding)
{
"id": "int_grounded",
"steps": [
{
"type": "user_input",
"status": "done",
"content": [{ "type": "text", "text": "Who won Euro 2024?" }]
},
{
"type": "google_search_call",
"status": "done",
"content": [{ "type": "text", "text": "UEFA Euro 2024 winner" }]
},
{
"type": "google_search_result",
"status": "done",
"content": [
{
"type": "text",
"text": "Spain won Euro 2024..."
}
]
},
{
"type": "model_output",
"status": "done",
"content": [
{
"type": "text",
"text": "Spain won Euro 2024, defeating England 2-1.",
"annotations": [
{
"start_index": 0,
"end_index": 42,
"uri": "https://vertexaisearch...",
"title": "aljazeera.com"
}
]
}
]
}
]
}
Chiamata di funzione
Anche la struttura delle chiamate di funzioni e dei risultati è cambiata per adattarsi allo schema Passi.
Prima di (generateContent)
In generateContent, la risposta restituisce le chiamate di funzioni all'interno dei candidati.
Python
from google import genai
from google.genai import types
client = genai.Client()
response = client.models.generate_content(
model="gemini-2.5-flash",
contents="What's the weather in Boston?",
config=types.GenerateContentConfig(tools=[weather_tool]),
)
function_call = response.candidates[0].content.parts[0].function_call
print(f"Requested tool: {function_call.name}")
result = "52°F and rain"
response = client.models.generate_content(
model="gemini-2.5-flash",
contents=[
types.Content(
role="user",
parts=[
types.Part.from_text(text="What's the weather in Boston?")
],
),
response.candidates[0].content,
types.Content(
role="user",
parts=[
types.Part.from_function_response(
name=function_call.name,
response={"result": result},
)
],
),
],
config=types.GenerateContentConfig(tools=[weather_tool]),
)
print(response.text)
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
let response = await client.models.generateContent({
model: 'gemini-2.5-flash',
contents: "What's the weather in Boston?",
config: { tools: [weatherTool] }
});
const functionCall = response.candidates[0].content.parts[0].functionCall;
console.log(`Requested tool: ${functionCall.name}`);
const result = "52°F and rain";
response = await client.models.generateContent({
model: 'gemini-2.5-flash',
contents: [
{ role: 'user', parts: [{ text: "What's the weather in Boston?" }] },
response.candidates[0].content,
{
role: 'user',
parts: [{
functionResponse: {
name: functionCall.name,
response: { result: result }
}
}]
}
],
config: { tools: [weatherTool] }
});
console.log(response.text);
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{
"parts": [{
"text": "What is the weather like in Boston, MA?"
}]
}],
"tools": [{
"functionDeclarations": [{
"name": "get_weather",
"description": "Get the current weather",
"parameters": {
"type": "OBJECT",
"properties": {
"location": {"type": "STRING"}
},
"required": ["location"]
}
}]
}]
}'
# Response
{
"candidates": [
{
"content": {
"parts": [
{
"functionCall": {
"name": "get_weather",
"args": { "location": "Boston, MA" }
}
}
],
"role": "model"
},
"finishReason": "STOP",
"index": 0
}
]
}
After (API Interactions)
Le chiamate agli strumenti e i risultati sono ora passaggi distinti nella sequenza temporale.
Python
from google import genai
client = genai.Client()
weather_tool = {
"type": "function",
"name": "get_weather",
"description": "Gets weather",
"parameters": {
"type": "object",
"properties": {
"location": {"type": "string"}
},
},
}
interaction = client.interactions.create(
model="gemini-3.5-flash",
input="What's the weather in Boston?",
tools=[weather_tool],
)
for step in interaction.steps:
if step.type == "function_call":
print(f"Executing {step.name} for {step.arguments}")
result = "52°F and rain"
interaction = client.interactions.create(
model="gemini-3.5-flash",
previous_interaction_id=interaction.id,
input=[
{
"type": "function_result",
"call_id": step.id,
"name": step.name,
"result": [{"type": "text", "text": result}],
}
],
)
print(next_interaction.output_text)
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const weatherTool = {
type: "function",
name: "get_weather",
description: "Get weather for a location",
parameters: {
type: "object",
properties: {
location: { type: "string" }
},
required: ["location"]
}
};
const interaction = await client.interactions.create({
model: 'gemini-3.5-flash',
input: "What's the weather in Boston?",
tools: [weatherTool]
});
for (const step of interaction.steps) {
if (step.type === 'function_call') {
console.log(`Executing ${step.name} for ${JSON.stringify(step.arguments)}`);
const result = "52°F and rain";
const nextInteraction = await client.interactions.create({
model: 'gemini-3.5-flash',
previous_interaction_id: interaction.id,
input: [
{
type: 'function_result',
call_id: step.id,
name: step.name,
result: [{ type: 'text', text: result }]
}
]
});
console.log(nextInteraction.output_text);
}
}
REST
# Initial Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3.5-flash",
"input": "What's the weather in Boston?",
"tools": [{
"type": "function",
"name": "get_weather",
"description": "Get weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": { "type": "string" }
},
"required": ["location"]
}
}]
}'
# Response (requires action)
{
"id": "int_001",
"status": "requires_action",
"steps": [
{
"type": "user_input",
"status": "done",
"content": [
{ "type": "text", "text": "What's the weather in Boston?" }
]
},
{
"type": "function_call",
"status": "waiting",
"id": "fc_1",
"name": "get_weather",
"arguments": { "location": "Boston, MA" }
}
]
}
# Submit Tool Result Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3.5-flash",
"previous_interaction_id": "int_001",
"input": {
"type": "function_result",
"call_id": "fc_1",
"name": "get_weather",
"result": [
{ "type": "text", "text": "52°F with rain" }
]
}
}'
# Final Response
{
"id": "int_002",
"status": "completed",
"steps": [
{
"type": "function_result",
"call_id": "fc_1",
"name": "get_weather",
"result": [
{ "type": "text", "text": "52°F with rain" }
]
},
{
"type": "model_output",
"status": "done",
"content": [
{ "type": "text", "text": "It's 52°F with rain in Boston." }
]
}
]
}
Streaming
Una differenza fondamentale nello streaming è che l'API Interactions utilizza lo stesso endpoint con "stream": true nel corpo della richiesta, mentre l'API generateContent richiedeva la chiamata di un endpoint dedicato (:streamGenerateContent).
Inoltre, gli eventi di streaming ora utilizzano tipi specializzati per monitorare il ciclo di vita dell'interazione e tenere traccia dei passaggi di esecuzione lungo la cronologia.
Prima di (generateContentStream)
Con generateContent, consumi un flusso di blocchi di risposte.
Python
response = client.models.generate_content_stream(
model="gemini-2.5-flash", contents="Tell me a story"
)
for chunk in response:
print(chunk.text, end="")
JavaScript
const responseStream = await client.models.generateContentStream({
model: 'gemini-2.5-flash',
contents: 'Tell me a story',
});
for await (const chunk of responseStream) {
process.stdout.write(chunk.text);
}
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:streamGenerateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{
"parts": [{
"text": "Tell me a story"
}]
}]
}'
# Response stream
event: content.start
data: {"event_type": "content.start", "index": 0, "content": {"type": "thought"}}
event: content.delta
data: {"event_type": "content.delta", "index": 0, "delta": {"type": "thought_summary", "text": "User wants an explanation."}}
event: content.stop
data: {"event_type": "content.stop", "index": 0}
event: content.start
data: {"event_type": "content.start", "index": 1, "content": {"type": "text"}}
event: content.delta
data: {"event_type": "content.delta", "index": 1, "delta": {"type": "text", "text": "Hello"}}
event: content.stop
data: {"event_type": "content.stop", "index": 1}
After (API Interactions)
Nell'API Interactions, lo streaming utilizza Server-Sent Events (SSE) e tipi di delta specializzati per rappresentare i passaggi di esecuzione man mano che si verificano.
Python
from google import genai
client = genai.Client()
stream = client.interactions.create(
model="gemini-3.5-flash",
input="Tell me a story",
stream=True,
)
for event in stream:
if event.event_type == "step.delta":
if event.delta.type == "text":
print(event.delta.text, end="", flush=True)
elif event.event_type == "interaction.completed":
print(f"\n\n--- Stream Finished ---")
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const stream = await client.interactions.create({
model: 'gemini-3.5-flash',
input: 'Tell me a story',
stream: true,
});
for await (const event of stream) {
if (event.event_type === 'step.delta') {
if (event.delta.type === 'text' && 'text' in event.delta) {
process.stdout.write(event.delta.text);
}
} else if (event.event_type === 'interaction.completed') {
console.log('\n\n--- Stream Finished ---');
}
}
REST
# Example SSE stream output event: interaction.created data: {"type": "interaction.created", "interaction": {"id": "int_xyz", "status": "created"}} event: interaction.in_progress data: {"type": "interaction.in_progress", "interaction": {"id": "int_xyz", "status": "in_progress"}} event: step.start data: {"type": "step.start", "index": 0, "step": {"type": "thought"}} event: step.delta data: {"type": "step.delta", "index": 0, "delta": {"type": "thought", "text": "User wants an explanation."}} event: step.stop data: {"type": "step.stop", "index": 0, "status": "done"} event: step.start data: {"type": "step.start", "index": 1, "step": {"type": "model_output"}} event: step.delta data: {"type": "step.delta", "index": 1, "delta": {"type": "text", "text": "Hello"}} event: step.stop data: {"type": "step.stop", "index": 1, "status": "done"} event: interaction.completed data: {"type": "interaction.completed", "interaction": {"id": "int_xyz", "status": "completed", "usage": {"prompt_tokens": 10, "completion_tokens": 5, "total_tokens": 15}}} ```
Strumenti di streaming e chiamate di funzioni
Il comportamento degli strumenti nel flusso è cambiato in modo significativo rispetto a generateContent per fornire un controllo e una visibilità più granulari.
Prima di (generateContent)
Con generateContent, le chiamate di funzione di streaming arrivavano complete in un unico blocco. Non potevi vedere gli argomenti generati in tempo reale, quindi il gestore controllava semplicemente la presenza di un oggetto functionCall completo.
Python
from google import genai
from google.genai import types
client = genai.Client()
stream = client.models.generate_content_stream(
model="gemini-2.5-flash",
contents="What's the weather in Boston?",
config=types.GenerateContentConfig(tools=[weather_tool]),
)
for chunk in stream:
# Function calls arrived complete — no partial arguments
if chunk.candidates[0].content.parts[0].function_call:
fc = chunk.candidates[0].content.parts[0].function_call
print(f"Call: {fc.name}({fc.args})")
elif chunk.text:
print(chunk.text, end="")
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const stream = await client.models.generateContentStream({
model: 'gemini-2.5-flash',
contents: "What's the weather in Boston?",
config: { tools: [weatherTool] }
});
for await (const chunk of stream) {
const part = chunk.candidates[0].content.parts[0];
if (part.functionCall) {
console.log(`Call: ${part.functionCall.name}(${JSON.stringify(part.functionCall.args)})`);
} else if (part.text) {
process.stdout.write(part.text);
}
}
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:streamGenerateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{"parts": [{"text": "What'\''s the weather in Boston?"}]}],
"tools": [{"functionDeclarations": [{"name": "get_weather", "parameters": {"type": "OBJECT", "properties": {"location": {"type": "STRING"}}}}]}]
}'
# Response stream — function call arrives complete in one chunk
{"candidates": [{"content": {"parts": [{"functionCall": {"name": "get_weather", "args": {"location": "Boston, MA"}}}]}}]}
After (API Interactions)
L'API Interactions trasmette gli argomenti della chiamata di funzione carattere per carattere come eventi arguments. L'intero ciclo di vita dello strumento, ovvero pensiero, chiamata, risultato e output, si svolge come una serie di passaggi distinti.
Python
from google import genai
client = genai.Client()
stream = client.interactions.create(
model="gemini-3.5-flash",
input="What's the weather in Boston?",
tools=[get_weather_tool],
stream=True,
)
for event in stream:
if event.event_type == "step.start":
if event.step.type == "function_call":
print(f"Calling: {event.step.name}")
elif event.event_type == "step.delta":
if event.delta.type == "arguments":
print(f" args: {event.delta.partial_arguments}")
elif event.delta.type == "text":
print(event.delta.text, end="")
elif event.event_type == "interaction.completed":
print("\n--- Done ---")
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const stream = await client.interactions.create({
model: 'gemini-3.5-flash',
input: "What's the weather in Boston?",
tools: [getWeatherTool],
stream: true,
});
for await (const event of stream) {
if (event.event_type === 'step.start') {
if (event.step.type === 'function_call') {
console.log(`Calling: ${event.step.name}`);
}
} else if (event.event_type === 'step.delta') {
if (event.delta.type === 'arguments') {
console.log(` args: ${event.delta.partial_arguments}`);
} else if (event.delta.type === 'text') {
process.stdout.write(event.delta.text);
}
} else if (event.event_type === 'interaction.completed') {
console.log('\n--- Done ---');
}
}
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3.5-flash",
"input": "What'\''s the weather in Boston?",
"tools": [{"type": "function", "name": "get_weather", "parameters": {"type": "object", "properties": {"location": {"type": "string"}}}}],
"stream": true
}'
# Response stream
// Interaction created
event: interaction.created
data: {"type": "interaction.created", "interaction": {"id": "int_xyz", "status": "created"}}
event: interaction.in_progress
data: {"type": "interaction.in_progress", "interaction": {"id": "int_xyz", "status": "in_progress"}}
// ── Step 0: Thought ──────────────────────────────────
event: step.start
data: {"type": "step.start", "index": 0, "step": {"type": "thought"}}
event: step.delta
data: {"type": "step.delta", "index": 0, "delta": {"type": "thought", "text": "The user wants weather data for Boston. I'll call the get_weather tool."}}
event: step.stop
data: {"type": "step.stop", "index": 0, "status": "done"}
// ── Step 1: Function Call (arguments streamed) ───────
event: step.start
data: {"type": "step.start", "index": 1, "step": {"type": "function_call", "id": "fc_1", "name": "get_weather"}}
event: step.delta
data: {"type": "step.delta", "index": 1, "delta": {"type": "arguments", "partial_arguments": "{\"location\": \"Boston, MA\"}"}}
event: step.stop
data: {"type": "step.stop", "index": 1, "status": "waiting"}
// The interaction pauses — the model needs the tool result before continuing.
event: interaction.requires_action
data: {"type": "interaction.requires_action", "interaction": {"id": "int_xyz", "status": "requires_action"}}
// ── (Client submits the tool result) ──────────────────
// The client calls interactions.create with the function_result as input
// and the previous interaction's ID, then resumes consuming the stream.
event: interaction.in_progress
data: {"type": "interaction.in_progress", "interaction": {"id": "int_xyz", "status": "in_progress"}}
// ── Step 2: Function Result (echoed back, no deltas) ─
event: step.start
data: {"type": "step.start", "index": 2, "step": {"type": "function_result", "call_id": "fc_1", "name": "get_weather", "result": [{"type": "text", "text": "52°F, rain"}]}}
event: step.stop
data: {"type": "step.stop", "index": 2, "status": "done"}
// ── Step 3: Thought ──────────────────────────────────
event: step.start
data: {"type": "step.start", "index": 3, "step": {"type": "thought"}}
event: step.delta
data: {"type": "step.delta", "index": 3, "delta": {"type": "thought", "text": "Got weather data. Composing the final response."}}
event: step.stop
data: {"type": "step.stop", "index": 3, "status": "done"}
// ── Step 4: Model Output (text streamed) ─────────────
event: step.start
data: {"type": "step.start", "index": 4, "step": {"type": "model_output"}}
event: step.delta
data: {"type": "step.delta", "index": 4, "delta": {"type": "text", "text": "It's currently 52°F and rainy in Boston."}}
event: step.stop
data: {"type": "step.stop", "index": 4, "status": "done"}
// ── Interaction complete ─────────────────────────────
event: interaction.completed
data: {"type": "interaction.completed", "interaction": {"id": "int_xyz", "status": "completed", "usage": {"prompt_tokens": 256, "completion_tokens": 128, "total_tokens": 384}}}