Este guia ajuda você a migrar da API generateContent para a API Interactions.
A API Interactions é a interface padrão para criar com o Gemini. Ela é otimizada para fluxos de trabalho de agentes, gerenciamento de estado do lado do servidor e conversas complexas multimodais e multiturno, mas ainda oferece suporte total a solicitações simples de turno único sem estado. Embora a generateContent continue com suporte total, recomendamos a API Interactions para todos os novos desenvolvimentos.
Por que migrar?
A API Interactions oferece uma maneira mais estruturada e eficiente de criar com o Gemini:
- Gerenciamento de histórico do lado do servidor: fluxos multiturno simplificados usando
previous_interaction_id. O servidor ativa o estado por padrão (store=true), mas você pode ativar o comportamento sem estado definindostore=false. - Etapas de execução observáveis: as etapas digitadas facilitam a depuração de fluxos complexos e a renderização da interface para eventos intermediários (como pensamentos ou widgets de pesquisa).
- Criada para fluxos de trabalho de agentes: suporte nativo para uso de ferramentas de várias etapas, orquestração e fluxos de raciocínio complexos usando etapas de execução digitadas.
- Tarefas longas e em segundo plano: oferece suporte ao descarregamento de operações que exigem muito tempo, como Deep Think e Deep Research, para processos em segundo plano usando
background=true. - Acesso a novos modelos e recursos: no futuro, novos modelos além da família principal, além de novos recursos e ferramentas de agentes, serão lançados exclusivamente na API Interactions. A
generateContentcontinuará com suporte total para casos de uso atuais.
Entrada/Saída básica
Esta seção mostra como migrar uma solicitação simples de geração de texto.
Antes (generateContent)
A API generateContent não tem estado e retorna a resposta diretamente. A estrutura de resposta envolve a saída em uma lista de candidates, cada uma contendo content com uma lista de parts a serem analisadas.
Python
from google import genai
client = genai.Client()
response = client.models.generate_content(
model="gemini-2.5-flash", contents="Tell me a joke."
)
print(response.text)
JavaScript
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({});
const response = await ai.models.generateContent({
model: "gemini-2.5-flash",
contents: "Tell me a joke.",
});
console.log(response.text);
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{
"parts": [{
"text": "Tell me a joke."
}]
}]
}'
# Response
{
"candidates": [
{
"content": {
"parts": [
{
"text": "Why did the chicken cross the road? To get to the other side!"
}
],
"role": "model"
},
"finishReason": "STOP",
"index": 0
}
],
"usageMetadata": {
"promptTokenCount": 4,
"candidatesTokenCount": 12,
"totalTokenCount": 16
}
}
Depois (API Interactions)
A API Interactions retorna um recurso de interação armazenado com uma linha do tempo steps. Em vez de percorrer candidatos e partes, inspecione a matriz steps para encontrar o tipo de saída desejado.
Python
from google import genai
client = genai.Client()
# The input can be a simple string shorthand
interaction = client.interactions.create(
model="gemini-3-flash-preview", input="Tell me a joke."
)
# Inspect the steps manually
for step in interaction.steps:
if step.type == "model_output":
print(step.content[0].text)
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
let interaction = await client.interactions.create({
model: 'gemini-3-flash-preview',
input: 'Tell me a joke.'
});
// Manual inspection
const modelStep = interaction.steps.find(s => s.type === 'model_output');
console.log(modelStep.content[0].text);
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3-flash-preview",
"input": "Tell me a joke."
}'
# Response
{
"id": "int_123",
"status": "completed",
"steps": [
{
"type": "user_input",
"status": "done",
"content": [
{
"type": "text",
"text": "Tell me a joke."
}
]
},
{
"type": "model_output",
"status": "done",
"content": [
{
"type": "text",
"text": "Why did the chicken cross the road?"
}
]
}
]
}
Conversas com vários turnos
A API Interactions armazena interações por padrão, permitindo o gerenciamento de estado do lado do servidor para conversas multiturno.
Antes (generateContent)
Na generateContent, é necessário gerenciar manualmente o histórico de conversas usando a matriz contents ou um auxiliar de chat do lado do cliente.
Python
Usando o auxiliar de chat (recomendado)
from google import genai
client = genai.Client()
chat = client.chats.create(model="gemini-2.5-flash")
response1 = chat.send_message("Hi, my name is Phil.")
print(response1.text)
response2 = chat.send_message("What is my name?")
print(response2.text)
Gerenciamento manual do histórico
from google import genai
from google.genai import types
client = genai.Client()
# The second turn requires sending the entire history
response = client.models.generate_content(
model="gemini-2.5-flash",
contents=[
types.Content(
role="user", parts=[types.Part.from_text("Hi, my name is Phil.")]
),
types.Content(
role="model",
parts=[types.Part.from_text("Hi Phil, how can I help you?")],
),
types.Content(
role="user", parts=[types.Part.from_text("What is my name?")]
),
],
)
print(response.text)
JavaScript
Usando o auxiliar de chat (recomendado)
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const chat = client.chats.create({ model: 'gemini-2.5-flash' });
let response = await chat.sendMessage({ message: 'Hi, my name is Phil.' });
console.log(response.text);
response = await chat.sendMessage({ message: 'What is my name?' });
console.log(response.text);
Gerenciamento manual do histórico
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
// The second turn requires sending the entire history
const response = await client.models.generateContent({
model: 'gemini-2.5-flash',
contents: [
{ role: 'user', parts: [{ text: 'Hi, my name is Phil.' }] },
{ role: 'model', parts: [{ text: 'Hi Phil, how can I help you?' }] },
{ role: 'user', parts: [{ text: 'What is my name?' }] }
]
});
console.log(response.text);
REST
# Request (the second turn requires sending the entire history)
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [
{"role": "user", "parts": [{"text": "Hi, my name is Phil."}]},
{"role": "model", "parts": [{"text": "Hi Phil, how can I help you?"}]},
{"role": "user", "parts": [{"text": "What is my name?"}]}
]
}'
# Response
{
"candidates": [
{
"content": {
"parts": [
{
"text": "Your name is Phil."
}
],
"role": "model"
},
"finishReason": "STOP",
"index": 0
}
]
}
Depois (API Interactions)
A API Interactions gerencia o estado no servidor. Você continua uma conversa referenciando o previous_interaction_id.
Python
from google import genai
client = genai.Client()
# First turn
interaction1 = client.interactions.create(
model="gemini-3-flash-preview", input="Hi, my name is Phil."
)
print(interaction1.steps[-1].content[0].text)
# Second turn (passing previous_interaction_id)
interaction2 = client.interactions.create(
model="gemini-3-flash-preview",
previous_interaction_id=interaction1.id,
input="What is my name?",
)
print(interaction2.steps[-1].content[0].text)
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
// First turn
let interaction = await client.interactions.create({
model: 'gemini-3-flash-preview',
input: 'Hi, my name is Phil.'
});
console.log(interaction.steps.at(-1).content[0].text);
// Second turn
interaction = await client.interactions.create({
model: 'gemini-3-flash-preview',
previous_interaction_id: interaction.id,
input: 'What is my name?'
});
console.log(interaction.steps.at(-1).content[0].text);
REST
# First Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3-flash-preview",
"input": "Hi, my name is Phil."
}'
# Second Request (using ID from first response)
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3-flash-preview",
"previous_interaction_id": "int_123",
"input": "What is my name?"
}'
# Response to Second Request
{
"id": "int_123",
"steps": [
{
"type": "user_input",
"status": "done",
"content": [{ "type": "text", "text": "Hi, my name is Phil." }]
},
{
"type": "model_output",
"status": "done",
"content": [{ "type": "text", "text": "Hello Phil! How can I help you today?" }]
},
{
"type": "user_input",
"status": "done",
"content": [{ "type": "text", "text": "What is my name?" }]
},
{
"type": "model_output",
"status": "done",
"content": [{ "type": "text", "text": "Your name is Phil." }]
}
]
}
Entradas multimodais
As duas APIs oferecem suporte a entradas multimodais (texto, imagens, vídeo etc.).
Antes (generateContent)
Na generateContent, você transmite uma lista de parts na matriz contents. A resposta retorna a saída nas parts do primeiro candidato.
Python
from google import genai
from google.genai import types
client = genai.Client()
with open("sample.jpg", "rb") as f:
image_bytes = f.read()
response = client.models.generate_content(
model="gemini-2.5-flash",
contents=[
types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg"),
"Describe this image.",
],
)
print(response.text)
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{
"parts": [
{
"inlineData": {
"mimeType": "image/jpeg",
"data": "..."
}
},
{
"text": "Describe this image."
}
]
}]
}'
# Response
{
"candidates": [
{
"content": {
"parts": [
{
"text": "This is a picture of a beautiful sunset."
}
],
"role": "model"
}
}
]
}
Depois (API Interactions)
Na API Interactions, você transmite uma matriz para o campo input. Para recuperar o conteúdo de saída, encontre a etapa model_output na linha do tempo.
Python
from google import genai
client = genai.Client()
# Assuming you have an image file
with open("sample.jpg", "rb") as f:
image_bytes = f.read()
interaction = client.interactions.create(
model="gemini-3-flash-preview",
input=[
{
"type": "image",
"mime_type": "image/jpeg",
"data": image_bytes,
},
{"type": "text", "text": "Describe this image."},
],
)
for step in interaction.steps:
if step.type == "model_output":
print(step.content[0].text)
JavaScript
import { GoogleGenAI } from '@google/genai';
import * as fs from 'fs';
const client = new GoogleGenAI({});
const imageBytes = fs.readFileSync('sample.jpg').toString('base64');
const interaction = await client.interactions.create({
model: 'gemini-3-flash-preview',
input: [
{
type: 'image',
mime_type: 'image/jpeg',
data: imageBytes
},
{
type: 'text',
text: 'Describe this image.'
}
]
});
for (const step of interaction.steps) {
if (step.type === 'model_output') {
console.log(step.content[0].text);
}
}
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3-flash-preview",
"input": [
{
"type": "image",
"mime_type": "image/jpeg",
"data": "..."
},
{
"type": "text",
"text": "Describe this image."
}
]
}'
# Response
{
"id": "int_multimodal",
"steps": [
{
"type": "user_input",
"status": "done",
"content": [
{
"type": "image",
"mime_type": "image/jpeg",
"data": "..."
},
{
"type": "text",
"text": "Describe this image."
}
]
},
{
"type": "model_output",
"status": "done",
"content": [
{
"type": "text",
"text": "This is a picture of a beautiful sunset over the mountains."
}
]
}
]
}
Resposta estruturada
Para que o modelo retorne um JSON que corresponda a um esquema específico, configure o formato de resposta.
Antes (generateContent)
Na generateContent, você configura o formato de saída usando o campo response_format aninhado no objeto generationConfig.
Python
from google import genai
from google.genai import types
from pydantic import BaseModel
client = genai.Client()
class Recipe(BaseModel):
recipe_name: str
ingredients: list[str]
response = client.models.generate_content(
model="gemini-2.5-flash",
contents="Give me a recipe for chocolate chip cookies.",
config=types.GenerateContentConfig(
response_format=[
{
"type": "text",
"mime_type": "application/json",
"schema": Recipe,
}
]
),
)
print(response.text)
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{
"parts": [{
"text": "Give me a recipe for chocolate chip cookies."
}]
}],
"generationConfig": {
"responseFormat": [
{
"type": "text",
"mimeType": "application/json",
"schema": {
"type": "OBJECT",
"properties": {
"recipe_name": { "type": "STRING" },
"ingredients": {
"type": "ARRAY",
"items": { "type": "STRING" }
}
},
"required": ["recipe_name", "ingredients"]
}
}
]
}
}'
# Response
{
"candidates": [
{
"content": {
"parts": [
{
"text": "{\n \"recipe_name\": \"Chocolate Chip Cookies\",\n \"ingredients\": [\n \"1 cup butter\",\n \"1 cup sugar\",\n \"2 cups flour\",\n \"1 cup chocolate chips\"\n ]\n}"
}
],
"role": "model"
}
}
]
}
Depois (API Interactions)
Na API Interactions, os controles de formato de saída são movidos para uma matriz response_format de nível superior.
Python
from google import genai
from pydantic import BaseModel
client = genai.Client()
class Recipe(BaseModel):
recipe_name: str
ingredients: list[str]
interaction = client.interactions.create(
model="gemini-3-flash-preview",
input="Give me a recipe for chocolate chip cookies.",
response_format=[
{
"type": "text",
"mime_type": "application/json",
"schema": Recipe,
}
],
)
for step in interaction.steps:
if step.type == "model_output":
print(step.content[0].text)
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const interaction = await client.interactions.create({
model: 'gemini-3-flash-preview',
input: 'Give me a recipe for chocolate chip cookies.',
response_format: [
{
type: 'text',
mime_type: 'application/json',
schema: {
type: 'object',
properties: {
recipe_name: { type: 'string' },
ingredients: {
type: 'array',
items: { type: 'string' }
}
},
required: ['recipe_name', 'ingredients']
}
}
]
});
for (const step of interaction.steps) {
if (step.type === 'model_output') {
console.log(step.content[0].text);
}
}
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3-flash-preview",
"input": "Give me a recipe for chocolate chip cookies.",
"response_format": [
{
"type": "text",
"mime_type": "application/json",
"schema": {
"type": "OBJECT",
"properties": {
"recipe_name": { "type": "STRING" },
"ingredients": {
"type": "ARRAY",
"items": { "type": "STRING" }
}
},
"required": ["recipe_name", "ingredients"]
}
}
]
}'
# Response
{
"id": "int_structured",
"steps": [
{
"type": "user_input",
"status": "done",
"content": [{ "type": "text", "text": "Give me a recipe for chocolate chip cookies." }]
},
{
"type": "model_output",
"status": "done",
"content": [
{
"type": "text",
"text": "{\n \"recipe_name\": \"Chocolate Chip Cookies\",\n \"ingredients\": [\n \"1 cup butter\",\n \"1 cup sugar\",\n \"2 cups flour\",\n \"1 cup chocolate chips\"\n ]\n}"
}
]
}
]
}
Geração multimodal
Ao gerar conteúdo em modalidades além do texto (como imagens ou áudio), a principal diferença é como a resposta estrutura a mídia gerada.
Antes (generateContent)
Na generateContent, a resposta retorna a mídia gerada diretamente nas parts do candidato, normalmente como dados base64 em inlineData.
# Response structure concept
{
"candidates": [
{
"content": {
"parts": [
{
"text": "Here is your generated image:"
},
{
"inlineData": {
"mimeType": "image/jpeg",
"data": "...base64..."
}
}
]
}
}
]
}
Depois (API Interactions)
Na API Interactions, a mídia gerada aparece como itens distintos na matriz content de uma etapa model_output na linha do tempo, mantendo o fluxo cronológico da interação.
# Response structure concept
{
"id": "int_123",
"steps": [
{
"type": "model_output",
"status": "done",
"content": [
{
"type": "text",
"text": "Here is your generated image:"
},
{
"type": "image",
"mime_type": "image/jpeg",
"data": "...base64..." // Or a reference URL in future
}
]
}
]
}
Isso mantém a análise de resposta consistente com a forma como as entradas e saídas de texto são processadas. Tudo é uma etapa na linha do tempo.
Ferramentas do lado do servidor
O Gemini oferece suporte a ferramentas integradas do lado do servidor, como o embasamento da Pesquisa Google. A principal diferença é como a resposta representa a execução da ferramenta.
Antes (generateContent)
Na generateContent, as ferramentas do lado do servidor são em grande parte opacas. Você ativa a ferramenta e recebe uma resposta final com um objeto groundingMetadata separado. É importante ressaltar que as citações não são inline. O groundingSupports usa índices de caracteres para mapear segmentos de texto de volta às fontes da Web em groundingChunks.
Python
from google import genai
from google.genai import types
client = genai.Client()
response = client.models.generate_content(
model="gemini-2.5-flash",
contents="Who won Euro 2024?",
config=types.GenerateContentConfig(
tools=[{"google_search": {}}]
),
)
# Access search entry point (widget) and citations
metadata = response.candidates[0].grounding_metadata
if metadata.search_entry_point:
print(f"Search Entry Point: {metadata.search_entry_point.rendered_content}")
for support in metadata.grounding_supports:
print(f"Citation: {support.segment.text}")
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const response = await client.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Who won Euro 2024?',
config: {
tools: [{ google_search: {} }]
}
});
const metadata = response.candidates[0].groundingMetadata;
if (metadata.searchEntryPoint) {
console.log(`Search Entry Point: ${metadata.searchEntryPoint.renderedContent}`);
}
for (const support of metadata.groundingSupports) {
console.log(`Citation: ${support.segment.text}`);
}
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{
"parts": [{
"text": "Who won Euro 2024?"
}]
}],
"tools": [{
"googleSearchRetrieval": {}
}]
}'
# Response
{
"candidates": [
{
"content": {
"parts": [
{
"text": "Spain won Euro 2024, defeating England 2-1 in the final. This victory marks Spain's record fourth European Championship title."
}
],
"role": "model"
},
"groundingMetadata": {
"webSearchQueries": [
"UEFA Euro 2024 winner",
"who won euro 2024"
],
"searchEntryPoint": {
"renderedContent": "<!-- HTML and CSS for the search widget -->"
},
"groundingChunks": [
{"web": {"uri": "https://vertexaisearch.cloud.google.com.....", "title": "aljazeera.com"}},
{"web": {"uri": "https://vertexaisearch.cloud.google.com.....", "title": "uefa.com"}}
],
"groundingSupports": [
{
"segment": {"startIndex": 0, "endIndex": 85, "text": "Spain won Euro 2024, defeatin..."},
"groundingChunkIndices": [0]
},
{
"segment": {"startIndex": 86, "endIndex": 210, "text": "This victory marks Spain's..."},
"groundingChunkIndices": [0, 1]
}
]
}
}
]
}
Depois (API Interactions)
Na API Interactions, as ferramentas do lado do servidor oferecem transparência total da linha do tempo. A API registra a chamada e o resultado como steps de execução distintos (google_search_call e google_search_result), expondo exatamente quais dados o modelo recuperou.
Além disso, a API retorna citações inline. Em vez de mapear índices de um objeto de metadados separado, o item de texto na etapa model_output contém a própria matriz annotations que vincula diretamente à fonte.
Python
from google import genai
client = genai.Client()
interaction = client.interactions.create(
model="gemini-3-flash-preview",
input="Who won Euro 2024?",
tools=[{"type": "google_search"}],
)
for step in interaction.steps:
if step.type == "google_search_result":
print(f"Search Suggestions: {step.search_suggestions}")
elif step.type == "model_output":
print(f"Answer: {step.content[0].text}")
if step.content[0].annotations:
for anno in step.content[0].annotations:
print(f"Citation: {anno.title} ({anno.uri})")
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const interaction = await client.interactions.create({
model: 'gemini-3-flash-preview',
input: 'Who won Euro 2024?',
tools: [{ type: 'google_search' }]
});
for (const step of interaction.steps) {
if (step.type === 'google_search_result') {
console.log(`Search Suggestions: ${step.search_suggestions}`);
} else if (step.type === 'model_output') {
console.log(`Answer: ${step.content[0].text}`);
if (step.content[0].annotations) {
for (const anno of step.content[0].annotations) {
console.log(`Citation: ${anno.title} (${anno.uri})`);
}
}
}
}
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3-flash-preview",
"input": "Who won Euro 2024?",
"tools": [{"type": "google_search"}]
}'
# Response (showing grounding)
{
"id": "int_grounded",
"steps": [
{
"type": "user_input",
"status": "done",
"content": [{ "type": "text", "text": "Who won Euro 2024?" }]
},
{
"type": "google_search_call",
"status": "done",
"content": [{ "type": "text", "text": "UEFA Euro 2024 winner" }]
},
{
"type": "google_search_result",
"status": "done",
"content": [
{
"type": "text",
"text": "Spain won Euro 2024..."
}
]
},
{
"type": "model_output",
"status": "done",
"content": [
{
"type": "text",
"text": "Spain won Euro 2024, defeating England 2-1.",
"annotations": [
{
"start_index": 0,
"end_index": 42,
"uri": "https://vertexaisearch...",
"title": "aljazeera.com"
}
]
}
]
}
]
}
Chamadas de função
A estrutura das chamadas de função e dos resultados também mudou para se ajustar ao esquema de etapas.
Antes (generateContent)
Na generateContent, a resposta retorna chamadas de função nos candidatos.
Python
from google import genai
from google.genai import types
client = genai.Client()
# Step 1: Send prompt with tools
response = client.models.generate_content(
model="gemini-2.5-flash",
contents="What's the weather in Boston?",
config=types.GenerateContentConfig(tools=[weather_tool]),
)
# Assume model returned function_call
function_call = response.candidates[0].content.parts[0].function_call
print(f"Requested tool: {function_call.name}")
# Step 2: Execute local function and send result back
result = "52°F and rain"
response = client.models.generate_content(
model="gemini-2.5-flash",
contents=[
types.Content(
role="user",
parts=[
types.Part.from_text(text="What's the weather in Boston?")
],
),
response.candidates[0].content, # Model turn with function call
types.Content(
role="user",
parts=[
types.Part.from_function_response(
name=function_call.name,
response={"result": result},
)
],
),
],
config=types.GenerateContentConfig(tools=[weather_tool]),
)
print(response.text)
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
// Step 1: Send prompt with tools
let response = await client.models.generateContent({
model: 'gemini-2.5-flash',
contents: "What's the weather in Boston?",
config: { tools: [weatherTool] }
});
const functionCall = response.candidates[0].content.parts[0].functionCall;
console.log(`Requested tool: ${functionCall.name}`);
// Step 2: Execute local function and send result back
const result = "52°F and rain";
response = await client.models.generateContent({
model: 'gemini-2.5-flash',
contents: [
{ role: 'user', parts: [{ text: "What's the weather in Boston?" }] },
response.candidates[0].content, // Model turn
{
role: 'user',
parts: [{
functionResponse: {
name: functionCall.name,
response: { result: result }
}
}]
}
],
config: { tools: [weatherTool] }
});
console.log(response.text);
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{
"parts": [{
"text": "What is the weather like in Boston, MA?"
}]
}],
"tools": [{
"functionDeclarations": [{
"name": "get_weather",
"description": "Get the current weather",
"parameters": {
"type": "OBJECT",
"properties": {
"location": {"type": "STRING"}
},
"required": ["location"]
}
}]
}]
}'
# Response
{
"candidates": [
{
"content": {
"parts": [
{
"functionCall": {
"name": "get_weather",
"args": { "location": "Boston, MA" }
}
}
],
"role": "model"
},
"finishReason": "STOP",
"index": 0
}
]
}
Depois (API Interactions)
As chamadas e os resultados de ferramentas agora são etapas distintas na linha do tempo.
Python
from google import genai
client = genai.Client()
weather_tool = {
"type": "function",
"name": "get_weather",
"description": "Gets weather",
"parameters": {
"type": "object",
"properties": {
"location": {"type": "string"}
},
},
}
interaction = client.interactions.create(
model="gemini-3-flash-preview",
input="What's the weather in Boston?",
tools=[weather_tool],
)
# Check if the model requested a tool call
for step in interaction.steps:
if step.type == "function_call":
print(f"Executing {step.name} for {step.arguments}")
# Execute your local function here...
result = "52°F and rain"
# Submit the result back as a step
interaction = client.interactions.create(
model="gemini-3-flash-preview",
previous_interaction_id=interaction.id,
input=[
{
"type": "function_result",
"call_id": step.id,
"name": step.name,
"result": [{"type": "text", "text": result}],
}
],
)
# Inspect steps for final response
for s in interaction.steps:
if s.type == "model_output":
print(s.content[0].text)
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const weatherTool = {
type: "function",
name: "get_weather",
description: "Get weather for a location",
parameters: {
type: "object",
properties: {
location: { type: "string" }
},
required: ["location"]
}
};
const interaction = await client.interactions.create({
model: 'gemini-3-flash-preview',
input: "What's the weather in Boston?",
tools: [weatherTool]
});
// Check if the model requested a tool call
for (const step of interaction.steps) {
if (step.type === 'function_call') {
console.log(`Executing ${step.name} for ${JSON.stringify(step.arguments)}`);
const result = "52°F and rain";
// Submit the result back as a step
const nextInteraction = await client.interactions.create({
model: 'gemini-3-flash-preview',
previous_interaction_id: interaction.id,
input: [
{
type: 'function_result',
call_id: step.id,
name: step.name,
result: [{ type: 'text', text: result }]
}
]
});
// Inspect steps for final response
for (const s of nextInteraction.steps) {
if (s.type === 'model_output') {
console.log(s.content[0].text);
}
}
}
}
REST
# Initial Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3-flash-preview",
"input": "What's the weather in Boston?",
"tools": [{
"type": "function",
"name": "get_weather",
"description": "Get weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": { "type": "string" }
},
"required": ["location"]
}
}]
}'
# Response (requires action)
{
"id": "int_001",
"status": "requires_action",
"steps": [
{
"type": "user_input",
"status": "done",
"content": [
{ "type": "text", "text": "What's the weather in Boston?" }
]
},
{
"type": "function_call",
"status": "waiting",
"id": "fc_1",
"name": "get_weather",
"arguments": { "location": "Boston, MA" }
}
]
}
# Submit Tool Result Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3-flash-preview",
"previous_interaction_id": "int_001",
"input": {
"type": "function_result",
"call_id": "fc_1",
"name": "get_weather",
"result": [
{ "type": "text", "text": "52°F with rain" }
]
}
}'
# Final Response
{
"id": "int_002",
"status": "completed",
"steps": [
{
"type": "function_result",
"call_id": "fc_1",
"name": "get_weather",
"result": [
{ "type": "text", "text": "52°F with rain" }
]
},
{
"type": "model_output",
"status": "done",
"content": [
{ "type": "text", "text": "It's 52°F with rain in Boston." }
]
}
]
}
Streaming
Uma diferença fundamental no streaming é que a API Interactions usa o mesmo endpoint com "stream": true no corpo da solicitação, enquanto a API generateContent exigia a chamada de um endpoint dedicado (:streamGenerateContent).
Além disso, os eventos de streaming agora usam tipos especializados para monitorar o ciclo de vida da interação e acompanhar as etapas de execução ao longo da linha do tempo.
Antes (generateContentStream)
Com a generateContent, você consome um fluxo de blocos de resposta.
Python
response = client.models.generate_content_stream(
model="gemini-2.5-flash", contents="Tell me a story"
)
for chunk in response:
print(chunk.text, end="")
JavaScript
const responseStream = await client.models.generateContentStream({
model: 'gemini-2.5-flash',
contents: 'Tell me a story',
});
for await (const chunk of responseStream) {
process.stdout.write(chunk.text);
}
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:streamGenerateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{
"parts": [{
"text": "Tell me a story"
}]
}]
}'
# Response stream
event: content.start
data: {"event_type": "content.start", "index": 0, "content": {"type": "thought"}}
event: content.delta
data: {"event_type": "content.delta", "index": 0, "delta": {"type": "thought_summary", "text": "User wants an explanation."}}
event: content.stop
data: {"event_type": "content.stop", "index": 0}
event: content.start
data: {"event_type": "content.start", "index": 1, "content": {"type": "text"}}
event: content.delta
data: {"event_type": "content.delta", "index": 1, "delta": {"type": "text", "text": "Hello"}}
event: content.stop
data: {"event_type": "content.stop", "index": 1}
Depois (API Interactions)
Na API Interactions, o streaming usa eventos enviados pelo servidor (SSE) e tipos delta especializados para representar as etapas de execução à medida que acontecem.
Python
from google import genai
client = genai.Client()
stream = client.interactions.create(
model="gemini-3-flash-preview",
input="Tell me a story",
stream=True,
)
for event in stream:
if event.event_type == "step.delta":
if event.delta.type == "text":
print(event.delta.text, end="", flush=True)
elif event.event_type == "interaction.complete":
print(f"\n\n--- Stream Finished ---")
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const stream = await client.interactions.create({
model: 'gemini-3-flash-preview',
input: 'Tell me a story',
stream: true,
});
for await (const event of stream) {
if (event.event_type === 'step.delta') {
if (event.delta.type === 'text' && 'text' in event.delta) {
process.stdout.write(event.delta.text);
}
} else if (event.event_type === 'interaction.complete') {
console.log('\n\n--- Stream Finished ---');
}
}
REST
# Exemplo de saída de fluxo de SSE event: interaction.created data: {"type": "interaction.created", "interaction": {"id": "int_xyz", "status": "created"}} event: interaction.status_update data: {"type": "interaction.status_update", "status": "in_progress"} event: step.start data: {"type": "step.start", "index": 0, "step": {"type": "thought"}} event: step.delta data: {"type": "step.delta", "index": 0, "delta": {"type": "thought", "text": "User wants an explanation."}} event: step.stop data: {"type": "step.stop", "index": 0, "status": "done"} event: step.start data: {"type": "step.start", "index": 1, "step": {"type": "model_output"}} event: step.delta data: {"type": "step.delta", "index": 1, "delta": {"type": "text", "text": "Hello"}} event: step.stop data: {"type": "step.stop", "index": 1, "status": "done"} event: interaction.complete data: {"type": "interaction.complete", "interaction": {"id": "int_xyz", "status": "completed", "usage": {"prompt_tokens": 10, "completion_tokens": 5, "total_tokens": 15}}} ```
Ferramentas de streaming e chamadas de função
A maneira como as ferramentas se comportam no fluxo mudou significativamente da generateContent para oferecer mais controle e visibilidade.
Antes (generateContent)
Com a generateContent, as chamadas de função de streaming chegaram completas em um único bloco. Não era possível ver os argumentos sendo gerados em tempo real, então o gerenciador simplesmente verificou um objeto functionCall completo.
Python
from google import genai
from google.genai import types
client = genai.Client()
stream = client.models.generate_content_stream(
model="gemini-2.5-flash",
contents="What's the weather in Boston?",
config=types.GenerateContentConfig(tools=[weather_tool]),
)
for chunk in stream:
# Function calls arrived complete — no partial arguments
if chunk.candidates[0].content.parts[0].function_call:
fc = chunk.candidates[0].content.parts[0].function_call
print(f"Call: {fc.name}({fc.args})")
elif chunk.text:
print(chunk.text, end="")
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const stream = await client.models.generateContentStream({
model: 'gemini-2.5-flash',
contents: "What's the weather in Boston?",
config: { tools: [weatherTool] }
});
for await (const chunk of stream) {
// Function calls arrived complete — no partial arguments
const part = chunk.candidates[0].content.parts[0];
if (part.functionCall) {
console.log(`Call: ${part.functionCall.name}(${JSON.stringify(part.functionCall.args)})`);
} else if (part.text) {
process.stdout.write(part.text);
}
}
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:streamGenerateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"contents": [{"parts": [{"text": "What'\''s the weather in Boston?"}]}],
"tools": [{"functionDeclarations": [{"name": "get_weather", "parameters": {"type": "OBJECT", "properties": {"location": {"type": "STRING"}}}}]}]
}'
# Response stream — function call arrives complete in one chunk
{"candidates": [{"content": {"parts": [{"functionCall": {"name": "get_weather", "args": {"location": "Boston, MA"}}}]}}]}
Depois (API Interactions)
A API Interactions transmite argumentos de chamada de função caractere por caractere como eventos arguments. Todo o ciclo de vida da ferramenta (pensamento, chamada, resultado e saída) é reproduzido como uma série de etapas distintas.
Python
from google import genai
client = genai.Client()
stream = client.interactions.create(
model="gemini-3-flash-preview",
input="What's the weather in Boston?",
tools=[get_weather_tool],
stream=True,
)
for event in stream:
if event.event_type == "step.start":
if event.step.type == "function_call":
print(f"Calling: {event.step.name}")
elif event.event_type == "step.delta":
if event.delta.type == "arguments":
print(f" args: {event.delta.partial_arguments}")
elif event.delta.type == "text":
print(event.delta.text, end="")
elif event.event_type == "interaction.complete":
print("\n--- Done ---")
JavaScript
import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});
const stream = await client.interactions.create({
model: 'gemini-3-flash-preview',
input: "What's the weather in Boston?",
tools: [getWeatherTool],
stream: true,
});
for await (const event of stream) {
if (event.event_type === 'step.start') {
if (event.step.type === 'function_call') {
console.log(`Calling: ${event.step.name}`);
}
} else if (event.event_type === 'step.delta') {
if (event.delta.type === 'arguments') {
console.log(` args: ${event.delta.partial_arguments}`);
} else if (event.delta.type === 'text') {
process.stdout.write(event.delta.text);
}
} else if (event.event_type === 'interaction.complete') {
console.log('\n--- Done ---');
}
}
REST
# Request
curl -X POST "https://generativelanguage.googleapis.com/v1beta2/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
"model": "gemini-3-flash-preview",
"input": "What'\''s the weather in Boston?",
"tools": [{"type": "function", "name": "get_weather", "parameters": {"type": "object", "properties": {"location": {"type": "string"}}}}],
"stream": true
}'
# Response stream
// Interaction created
event: interaction.created
data: {"type": "interaction.created", "interaction": {"id": "int_xyz", "status": "created"}}
event: interaction.status_update
data: {"type": "interaction.status_update", "status": "in_progress"}
// ── Step 0: Thought ──────────────────────────────────
event: step.start
data: {"type": "step.start", "index": 0, "step": {"type": "thought"}}
event: step.delta
data: {"type": "step.delta", "index": 0, "delta": {"type": "thought", "text": "The user wants weather data for Boston. I'll call the get_weather tool."}}
event: step.stop
data: {"type": "step.stop", "index": 0, "status": "done"}
// ── Step 1: Function Call (arguments streamed) ───────
event: step.start
data: {"type": "step.start", "index": 1, "step": {"type": "function_call", "id": "fc_1", "name": "get_weather"}}
event: step.delta
data: {"type": "step.delta", "index": 1, "delta": {"type": "arguments", "partial_arguments": "{\"location\": \"Boston, MA\"}"}}
event: step.stop
data: {"type": "step.stop", "index": 1, "status": "waiting"}
// The interaction pauses — the model needs the tool result before continuing.
event: interaction.status_update
data: {"type": "interaction.status_update", "status": "requires_action"}
// ── (Client submits the tool result) ──────────────────
// The client calls interactions.create with the function_result as input
// and the previous interaction's ID, then resumes consuming the stream.
event: interaction.status_update
data: {"type": "interaction.status_update", "status": "in_progress"}
// ── Step 2: Function Result (echoed back, no deltas) ─
event: step.start
data: {"type": "step.start", "index": 2, "step": {"type": "function_result", "call_id": "fc_1", "name": "get_weather", "result": [{"type": "text", "text": "52°F, rain"}]}}
event: step.stop
data: {"type": "step.stop", "index": 2, "status": "done"}
// ── Step 3: Thought ──────────────────────────────────
event: step.start
data: {"type": "step.start", "index": 3, "step": {"type": "thought"}}
event: step.delta
data: {"type": "step.delta", "index": 3, "delta": {"type": "thought", "text": "Got weather data. Composing the final response."}}
event: step.stop
data: {"type": "step.stop", "index": 3, "status": "done"}
// ── Step 4: Model Output (text streamed) ─────────────
event: step.start
data: {"type": "step.start", "index": 4, "step": {"type": "model_output"}}
event: step.delta
data: {"type": "step.delta", "index": 4, "delta": {"type": "text", "text": "It's currently 52°F and rainy in Boston."}}
event: step.stop
data: {"type": "step.stop", "index": 4, "status": "done"}
// ── Interaction complete ─────────────────────────────
event: interaction.complete
data: {"type": "interaction.complete", "interaction": {"id": "int_xyz", "status": "completed", "usage": {"prompt_tokens": 256, "completion_tokens": 128, "total_tokens": 384}}}
Ferramentas do lado do servidor no fluxo
Ferramentas do lado do servidor, como a Pesquisa Google, se comportam de maneira diferente das chamadas de função no fluxo. A chamada e o resultado chegam completos no evento step.start sem deltas. Apenas step.start seguido imediatamente por step.stop:
// Server-side tool call — payload arrives complete in step.start
event: step.start
data: {"type": "step.start", "index": 4, "step": {"type": "google_search_call", "id": "gs_2", "query": "Alphabet Q4 2025 earnings"}}
event: step.stop
data: {"type": "step.stop", "index": 4, "status": "done"}
// Server-side tool result — also complete in step.start
event: step.start
data: {"type": "step.start", "index": 5, "step": {"type": "google_search_result", "call_id": "gs_2", "rendered_content": "<div>Alphabet Q4 2025 Revenue: $105.6B</div>", "signature": "abc123..."}}
event: step.stop
data: {"type": "step.stop", "index": 5, "status": "done"}