API Interactions

L'API Interactions è un'interfaccia unificata per interagire con i modelli Gemini e gli agenti. Semplifica la gestione dello stato, l'orchestrazione degli strumenti e le attività di lunga durata. Per una visione completa dello schema dell'API, consulta il Riferimento API.

Il seguente esempio mostra come chiamare l'API Interactions con un prompt di testo.

Python

from google import genai

client = genai.Client()

interaction =  client.interactions.create(
    model="gemini-3-pro-preview",
    input="Tell me a short joke about programming."
)

print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction =  await client.interactions.create({
    model: 'gemini-3-pro-preview',
    input: 'Tell me a short joke about programming.',
});

console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-pro-preview",
    "input": "Tell me a short joke about programming."
}'

Interazioni di base

L'API Interactions è disponibile tramite i nostri SDK esistenti. Il modo più semplice per interagire con il modello è fornire un prompt di testo. input può essere una stringa, un elenco contenente oggetti di contenuti o un elenco di turni con ruoli e oggetti di contenuti.

Python

from google import genai

client = genai.Client()

interaction =  client.interactions.create(
    model="gemini-2.5-flash",
    input="Tell me a short joke about programming."
)

print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction =  await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'Tell me a short joke about programming.',
});

console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "Tell me a short joke about programming."
}'

Conversazione

Puoi creare conversazioni a turni multipli in due modi:

  • Con stato, facendo riferimento a un'interazione precedente
  • Senza stato, fornendo l'intera cronologia della conversazione

Conversazione con stato

Passa il id dell'interazione precedente al parametro previous_interaction_id per continuare una conversazione.

Python

from google import genai

client = genai.Client()

# 1. First turn
interaction1 = client.interactions.create(
    model="gemini-2.5-flash",
    input="Hi, my name is Phil."
)
print(f"Model: {interaction1.outputs[-1].text}")

# 2. Second turn (passing previous_interaction_id)
interaction2 = client.interactions.create(
    model="gemini-2.5-flash",
    input="What is my name?",
    previous_interaction_id=interaction1.id
)
print(f"Model: {interaction2.outputs[-1].text}")

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

// 1. First turn
const interaction1 = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'Hi, my name is Phil.'
});
console.log(`Model: ${interaction1.outputs[interaction1.outputs.length - 1].text}`);

// 2. Second turn (passing previous_interaction_id)
const interaction2 = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'What is my name?',
    previous_interaction_id: interaction1.id
});
console.log(`Model: ${interaction2.outputs[interaction2.outputs.length - 1].text}`);

REST

# 1. First turn
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "Hi, my name is Phil."
}'

# 2. Second turn (Replace INTERACTION_ID with the ID from the previous interaction)
# curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
# -H "Content-Type: application/json" \
# -H "x-goog-api-key: $GEMINI_API_KEY" \
# -d '{
#     "model": "gemini-2.5-flash",
#     "input": "What is my name?",
#     "previous_interaction_id": "INTERACTION_ID"
# }'

Recuperare interazioni stateful precedenti

Utilizzando l'interazione id per recuperare i turni precedenti della conversazione.

Python

previous_interaction = client.interactions.get("<YOUR_INTERACTION_ID>")

print(previous_interaction)

JavaScript

const previous_interaction = await client.interactions.get("<YOUR_INTERACTION_ID>");
console.log(previous_interaction);

REST

curl -X GET "https://generativelanguage.googleapis.com/v1beta/interactions/<YOUR_INTERACTION_ID>" \
-H "x-goog-api-key: $GEMINI_API_KEY"

Conversazione stateless

Puoi gestire manualmente la cronologia delle conversazioni lato client.

Python

from google import genai

client = genai.Client()

conversation_history = [
    {
        "role": "user",
        "content": "What are the three largest cities in Spain?"
    }
]

interaction1 = client.interactions.create(
    model="gemini-2.5-flash",
    input=conversation_history
)

print(f"Model: {interaction1.outputs[-1].text}")

conversation_history.append({"role": "model", "content": interaction1.outputs})
conversation_history.append({
    "role": "user", 
    "content": "What is the most famous landmark in the second one?"
})

interaction2 = client.interactions.create(
    model="gemini-2.5-flash",
    input=conversation_history
)

print(f"Model: {interaction2.outputs[-1].text}")

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const conversationHistory = [
    {
        role: 'user',
        content: "What are the three largest cities in Spain?"
    }
];

const interaction1 = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: conversationHistory
});

console.log(`Model: ${interaction1.outputs[interaction1.outputs.length - 1].text}`);

conversationHistory.push({ role: 'model', content: interaction1.outputs });
conversationHistory.push({
    role: 'user',
    content: "What is the most famous landmark in the second one?"
});

const interaction2 = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: conversationHistory
});

console.log(`Model: ${interaction2.outputs[interaction2.outputs.length - 1].text}`);

REST

 curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
 -H "Content-Type: application/json" \
 -H "x-goog-api-key: $GEMINI_API_KEY" \
 -d '{
    "model": "gemini-2.5-flash",
    "input": [
        {
            "role": "user",
            "content": "What are the three largest cities in Spain?"
        },
        {
            "role": "model",
            "content": "The three largest cities in Spain are Madrid, Barcelona, and Valencia."
        },
        {
            "role": "user",
            "content": "What is the most famous landmark in the second one?"
        }
    ]
}'

Funzionalità multimodali

Puoi utilizzare l'API Interactions per casi d'uso multimodali come la comprensione delle immagini o la generazione di video.

Comprensione multimodale

Puoi fornire dati multimodali come dati codificati in base64 in linea o utilizzando l'API Files per file più grandi.

Comprensione delle immagini

Python

import base64
from pathlib import Path
from google import genai

client = genai.Client()

# Read and encode the image
with open(Path(__file__).parent / "car.png", "rb") as f:
    base64_image = base64.b64encode(f.read()).decode('utf-8')

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input=[
        {"type": "text", "text": "Describe the image."},
        {"type": "image", "data": base64_image, "mime_type": "image/png"}
    ]
)

print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';
import * as fs from 'fs';

const client = new GoogleGenAI({});

const base64Image = fs.readFileSync('car.png', { encoding: 'base64' });

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: [
        { type: 'text', text: 'Describe the image.' },
        { type: 'image', data: base64Image, mime_type: 'image/png' }
    ]
});

console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": [
        {"type": "text", "text": "Describe the image."},
        {"type": "image", "data": "'"$(base64 -w0 car.png)"'", "mime_type": "image/png"}
    ]
}'

Comprensione dell'audio

Python

import base64
from pathlib import Path
from google import genai

client = genai.Client()

# Read and encode the audio
with open(Path(__file__).parent / "speech.wav", "rb") as f:
    base64_audio = base64.b64encode(f.read()).decode('utf-8')

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input=[
        {"type": "text", "text": "What does this audio say?"},
        {"type": "audio", "data": base64_audio, "mime_type": "audio/wav"}
    ]
)

print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';
import * as fs from 'fs';

const client = new GoogleGenAI({});

const base64Audio = fs.readFileSync('speech.wav', { encoding: 'base64' });

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: [
        { type: 'text', text: 'What does this audio say?' },
        { type: 'audio', data: base64Audio, mime_type: 'audio/wav' }
    ]
});

console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": [
        {"type": "text", "text": "What does this audio say?"},
        {"type": "audio", "data": "'"$(base64 -w0 speech.wav)"'", "mime_type": "audio/wav"}
    ]
}'

Comprensione dei video

Python

import base64
from pathlib import Path
from google import genai

client = genai.Client()

# Read and encode the video
with open(Path(__file__).parent / "video.mp4", "rb") as f:
    base64_video = base64.b64encode(f.read()).decode('utf-8')

print("Analyzing video...")
interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input=[
        {"type": "text", "text": "What is happening in this video? Provide a timestamped summary."},
        {"type": "video", "data": base64_video, "mime_type": "video/mp4" }
    ]
)

print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';
import * as fs from 'fs';

const client = new GoogleGenAI({});

const base64Video = fs.readFileSync('video.mp4', { encoding: 'base64' });

console.log('Analyzing video...');
const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: [
        { type: 'text', text: 'What is happening in this video? Provide a timestamped summary.' },
        { type: 'video', data: base64Video, mime_type: 'video/mp4'}
    ]
});

console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": [
        {"type": "text", "text": "What is happening in this video?"},
        {"type": "video", "mime_type": "video/mp4", "data": "'"$(base64 -w0 video.mp4)"'"}
    ]
}'

Comprensione dei documenti (PDF)

Python

import base64
from google import genai

client = genai.Client()

with open("sample.pdf", "rb") as f:
    base64_pdf = base64.b64encode(f.read()).decode('utf-8')

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input=[
        {"type": "text", "text": "What is this document about?"},
        {"type": "document", "data": base64_pdf, "mime_type": "application/pdf"}
    ]
)
print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';
import * as fs from 'fs';
const client = new GoogleGenAI({});

const base64Pdf = fs.readFileSync('sample.pdf', { encoding: 'base64' });

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: [
        { type: 'text', text: 'What is this document about?' },
        { type: 'document', data: base64Pdf, mime_type: 'application/pdf' }
    ],
});
console.log(interaction.outputs[0].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": [
        {"type": "text", "text": "What is this document about?"},
        {"type": "document", "data": "'"$(base64 -w0 sample.pdf)"'", "mime_type": "application/pdf"}
    ]
}'

Generazione multimodale

Puoi utilizzare l'API Interactions per generare output multimodali.

Generazione di immagini

Python

import base64
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3-pro-image-preview",
    input="Generate an image of a futuristic city.",
    response_modalities=["IMAGE"]
)

for output in interaction.outputs:
    if output.type == "image":
        print(f"Generated image with mime_type: {output.mime_type}")
        # Save the image
        with open("generated_city.png", "wb") as f:
            f.write(base64.b64decode(output.data))

JavaScript

import { GoogleGenAI } from '@google/genai';
import * as fs from 'fs';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-3-pro-image-preview',
    input: 'Generate an image of a futuristic city.',
    response_modalities: ['IMAGE']
});

for (const output of interaction.outputs) {
    if (output.type === 'image') {
        console.log(`Generated image with mime_type: ${output.mime_type}`);
        // Save the image
        fs.writeFileSync('generated_city.png', Buffer.from(output.data, 'base64'));
    }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-pro-image-preview",
    "input": "Generate an image of a futuristic city.",
    "response_modalities": ["IMAGE"]
}'

Funzionalità agentiche

L'API Interactions è progettata per creare agenti e interagire con loro e include il supporto per la chiamata di funzioni, strumenti integrati, output strutturati e il Model Context Protocol (MCP).

Agenti

Puoi utilizzare agenti specializzati come deep-research-pro-preview-12-2025 per attività complesse. Per scoprire di più sull'agente Gemini Deep Research, consulta la guida Deep Research.

Python

import time
from google import genai

client = genai.Client()

# 1. Start the Deep Research Agent
initial_interaction = client.interactions.create(
    input="Research the history of the Google TPUs with a focus on 2025 and 2026.",
    agent="deep-research-pro-preview-12-2025",
    background=True
)

print(f"Research started. Interaction ID: {initial_interaction.id}")

# 2. Poll for results
while True:
    interaction = client.interactions.get(initial_interaction.id)
    print(f"Status: {interaction.status}")

    if interaction.status == "completed":
        print("\nFinal Report:\n", interaction.outputs[-1].text)
        break
    elif interaction.status in ["failed", "cancelled"]:
        print(f"Failed with status: {interaction.status}")
        break

    time.sleep(10)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

// 1. Start the Deep Research Agent
const initialInteraction = await client.interactions.create({
    input: 'Research the history of the Google TPUs with a focus on 2025 and 2026.',
    agent: 'deep-research-pro-preview-12-2025',
    background: true
});

console.log(`Research started. Interaction ID: ${initialInteraction.id}`);

// 2. Poll for results
while (true) {
    const interaction = await client.interactions.get(initialInteraction.id);
    console.log(`Status: ${interaction.status}`);

    if (interaction.status === 'completed') {
        console.log('\nFinal Report:\n', interaction.outputs[interaction.outputs.length - 1].text);
        break;
    } else if (['failed', 'cancelled'].includes(interaction.status)) {
        console.log(`Failed with status: ${interaction.status}`);
        break;
    }

    await new Promise(resolve => setTimeout(resolve, 10000));
}

REST

# 1. Start the Deep Research Agent
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Research the history of the Google TPUs with a focus on 2025 and 2026.",
    "agent": "deep-research-pro-preview-12-2025",
    "background": true
}'

# 2. Poll for results (Replace INTERACTION_ID with the ID from the previous interaction)
# curl -X GET "https://generativelanguage.googleapis.com/v1beta/interactions/INTERACTION_ID" \
# -H "x-goog-api-key: $GEMINI_API_KEY"

Strumenti e chiamata di funzione

Questa sezione spiega come utilizzare la chiamata di funzioni per definire strumenti personalizzati e come utilizzare gli strumenti integrati di Google all'interno dell'API Interactions.

Chiamata di funzione

Python

from google import genai

client = genai.Client()

# 1. Define the tool
def get_weather(location: str):
    """Gets the weather for a given location."""
    return f"The weather in {location} is sunny."

weather_tool = {
    "type": "function",
    "name": "get_weather",
    "description": "Gets the weather for a given location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}
        },
        "required": ["location"]
    }
}

# 2. Send the request with tools
interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="What is the weather in Paris?",
    tools=[weather_tool]
)

# 3. Handle the tool call
for output in interaction.outputs:
    if output.type == "function_call":
        print(f"Tool Call: {output.name}({output.arguments})")
        # Execute tool
        result = get_weather(**output.arguments)

        # Send result back
        interaction = client.interactions.create(
            model="gemini-2.5-flash",
            previous_interaction_id=interaction.id,
            input=[{
                "type": "function_result",
                "name": output.name,
                "call_id": output.id,
                "result": result
            }]
        )
        print(f"Response: {interaction.outputs[-1].text}")

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

// 1. Define the tool
const weatherTool = {
    type: 'function',
    name: 'get_weather',
    description: 'Gets the weather for a given location.',
    parameters: {
        type: 'object',
        properties: {
            location: { type: 'string', description: 'The city and state, e.g. San Francisco, CA' }
        },
        required: ['location']
    }
};

// 2. Send the request with tools
let interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'What is the weather in Paris?',
    tools: [weatherTool]
});

// 3. Handle the tool call
for (const output of interaction.outputs) {
    if (output.type === 'function_call') {
        console.log(`Tool Call: ${output.name}(${JSON.stringify(output.arguments)})`);

        // Execute tool (Mocked)
        const result = `The weather in ${output.arguments.location} is sunny.`;

        // Send result back
        interaction = await client.interactions.create({
            model: 'gemini-2.5-flash',
            previous_interaction_id: interaction.id,
            input: [{
                type: 'function_result',
                name: output.name,
                call_id: output.id,
                result: result
            }]
        });
        console.log(`Response: ${interaction.outputs[interaction.outputs.length - 1].text}`);
    }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "What is the weather in Paris?",
    "tools": [{
        "type": "function",
        "name": "get_weather",
        "description": "Gets the weather for a given location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}
            },
            "required": ["location"]
        }
    }]
}'

# Handle the tool call and send result back (Replace INTERACTION_ID and CALL_ID)
# curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
# -H "Content-Type: application/json" \
# -H "x-goog-api-key: $GEMINI_API_KEY" \
# -d '{
#     "model": "gemini-2.5-flash",
#     "previous_interaction_id": "INTERACTION_ID",
#     "input": [{
#         "type": "function_result",
#         "name": "get_weather",
#         "call_id": "FUNCTION_CALL_ID",
#         "result": "The weather in Paris is sunny."
#     }]
# }'
Chiamata di funzione con stato lato client

Se non vuoi utilizzare lo stato lato server, puoi gestire tutto lato client.

Python

from google import genai
client = genai.Client()

functions = [
    {
        "type": "function",
        "name": "schedule_meeting",
        "description": "Schedules a meeting with specified attendees at a given time and date.",
        "parameters": {
            "type": "object",
            "properties": {
                "attendees": {"type": "array", "items": {"type": "string"}},
                "date": {"type": "string", "description": "Date of the meeting (e.g., 2024-07-29)"},
                "time": {"type": "string", "description": "Time of the meeting (e.g., 15:00)"},
                "topic": {"type": "string", "description": "The subject of the meeting."},
            },
            "required": ["attendees", "date", "time", "topic"],
        },
    }
]

history = [{"role": "user","content": [{"type": "text", "text": "Schedule a meeting for 2025-11-01 at 10 am with Peter and Amir about the Next Gen API."}]}]

# 1. Model decides to call the function
interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input=history,
    tools=functions
)

# add model interaction back to history
history.append({"role": "model", "content": interaction.outputs})

for output in interaction.outputs:
    if output.type == "function_call":
        print(f"Function call: {output.name} with arguments {output.arguments}")

        # 2. Execute the function and get a result
        # In a real app, you would call your function here.
        # call_result = schedule_meeting(**json.loads(output.arguments))
        call_result = "Meeting scheduled successfully."

        # 3. Send the result back to the model
        history.append({"role": "user", "content": [{"type": "function_result", "name": output.name, "call_id": output.id, "result": call_result}]})

        interaction2 = client.interactions.create(
            model="gemini-2.5-flash",
            input=history,
        )
        print(f"Final response: {interaction2.outputs[-1].text}")
    else:
        print(f"Output: {output}")

JavaScript

// 1. Define the tool
const functions = [
    {
        type: 'function',
        name: 'schedule_meeting',
        description: 'Schedules a meeting with specified attendees at a given time and date.',
        parameters: {
            type: 'object',
            properties: {
                attendees: { type: 'array', items: { type: 'string' } },
                date: { type: 'string', description: 'Date of the meeting (e.g., 2024-07-29)' },
                time: { type: 'string', description: 'Time of the meeting (e.g., 15:00)' },
                topic: { type: 'string', description: 'The subject of the meeting.' },
            },
            required: ['attendees', 'date', 'time', 'topic'],
        },
    },
];

const history = [
    { role: 'user', content: [{ type: 'text', text: 'Schedule a meeting for 2025-11-01 at 10 am with Peter and Amir about the Next Gen API.' }] }
];

// 2. Model decides to call the function
let interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: history,
    tools: functions
});

// add model interaction back to history
history.push({ role: 'model', content: interaction.outputs });

for (const output of interaction.outputs) {
    if (output.type === 'function_call') {
        console.log(`Function call: ${output.name} with arguments ${JSON.stringify(output.arguments)}`);

        // 3. Send the result back to the model
        history.push({ role: 'user', content: [{ type: 'function_result', name: output.name, call_id: output.id, result: 'Meeting scheduled successfully.' }] });

        const interaction2 = await client.interactions.create({
            model: 'gemini-2.5-flash',
            input: history,
        });
        console.log(`Final response: ${interaction2.outputs[interaction2.outputs.length - 1].text}`);
    }
}

Strumenti integrati

Gemini include strumenti integrati come Grounding con la Ricerca Google, Esecuzione del codice e Contesto URL.

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="Who won the last Super Bowl?",
    tools=[{"type": "google_search"}]
)
# Find the text output (not the GoogleSearchResultContent)
text_output = next((o for o in interaction.outputs if o.type == "text"), None)
if text_output:
    print(text_output.text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'Who won the last Super Bowl?',
    tools: [{ type: 'google_search' }]
});
// Find the text output (not the GoogleSearchResultContent)
const textOutput = interaction.outputs.find(o => o.type === 'text');
if (textOutput) console.log(textOutput.text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "Who won the last Super Bowl?",
    "tools": [{"type": "google_search"}]
}'
Esecuzione del codice

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="Calculate the 50th Fibonacci number.",
    tools=[{"type": "code_execution"}]
)
print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'Calculate the 50th Fibonacci number.',
    tools: [{ type: 'code_execution' }]
});
console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "Calculate the 50th Fibonacci number.",
    "tools": [{"type": "code_execution"}]
}'
Contesto URL

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="Summarize the content of https://www.wikipedia.org/",
    tools=[{"type": "url_context"}]
)
# Find the text output (not the URLContextResultContent)
text_output = next((o for o in interaction.outputs if o.type == "text"), None)
if text_output:
    print(text_output.text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'Summarize the content of https://www.wikipedia.org/',
    tools: [{ type: 'url_context' }]
});
// Find the text output (not the URLContextResultContent)
const textOutput = interaction.outputs.find(o => o.type === 'text');
if (textOutput) console.log(textOutput.text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "Summarize the content of https://www.wikipedia.org/",
    "tools": [{"type": "url_context"}]
}'

Protocollo di contesto del modello (MCP) remoto

L'integrazione MCP� remota semplifica lo sviluppo di agenti consentendo all'API Gemini di chiamare direttamente strumenti esterni ospitati su server remoti.

Python

from google import genai

client = genai.Client()

mcp_server = {
    "type": "mcp_server",
    "name": "weather_service",
    "url": "https://gemini-api-demos.uc.r.appspot.com/mcp"
}

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="What is the weather like in New York today?",
    tools=[mcp_server]
)

print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const mcpServer = {
    type: 'mcp_server',
    name: 'weather_service',
    url: 'https://gemini-api-demos.uc.r.appspot.com/mcp'
};

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'What is the weather like in New York today?',
    tools: [mcpServer]
});

console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "What is the weather like in New York today?",
    "tools": [{
        "type": "mcp_server",
        "name": "weather_service",
        "url": "https://gemini-api-demos.uc.r.appspot.com/mcp"
    }]
}'

Output strutturato (schema JSON)

Forza un output JSON specifico fornendo uno schema JSON nel parametro response_format. È utile per attività come la moderazione, la classificazione o l'estrazione di dati.

Python

from google import genai
from pydantic import BaseModel, Field
from typing import Literal, Union
client = genai.Client()

class SpamDetails(BaseModel):
    reason: str = Field(description="The reason why the content is considered spam.")
    spam_type: Literal["phishing", "scam", "unsolicited promotion", "other"]

class NotSpamDetails(BaseModel):
    summary: str = Field(description="A brief summary of the content.")
    is_safe: bool = Field(description="Whether the content is safe for all audiences.")

class ModerationResult(BaseModel):
    decision: Union[SpamDetails, NotSpamDetails]

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="Moderate the following content: 'Congratulations! You've won a free cruise. Click here to claim your prize: www.definitely-not-a-scam.com'",
    response_format=ModerationResult.model_json_schema(),
)

parsed_output = ModerationResult.model_validate_json(interaction.outputs[-1].text)
print(parsed_output)

JavaScript

import { GoogleGenAI } from '@google/genai';
import { z } from 'zod';
const client = new GoogleGenAI({});

const moderationSchema = z.object({
    decision: z.union([
        z.object({
            reason: z.string().describe('The reason why the content is considered spam.'),
            spam_type: z.enum(['phishing', 'scam', 'unsolicited promotion', 'other']).describe('The type of spam.'),
        }).describe('Details for content classified as spam.'),
        z.object({
            summary: z.string().describe('A brief summary of the content.'),
            is_safe: z.boolean().describe('Whether the content is safe for all audiences.'),
        }).describe('Details for content classified as not spam.'),
    ]),
});

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: "Moderate the following content: 'Congratulations! You've won a free cruise. Click here to claim your prize: www.definitely-not-a-scam.com'",
    response_format: z.toJSONSchema(moderationSchema),
});
console.log(interaction.outputs[0].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "Moderate the following content: 'Congratulations! You've won a free cruise. Click here to claim your prize: www.definitely-not-a-scam.com'",
    "response_format": {
        "type": "object",
        "properties": {
            "decision": {
                "type": "object",
                "properties": {
                    "reason": {"type": "string", "description": "The reason why the content is considered spam."},
                    "spam_type": {"type": "string", "description": "The type of spam."}
                },
                "required": ["reason", "spam_type"]
            }
        },
        "required": ["decision"]
    }
}'

Combinare strumenti e output strutturato

Combina gli strumenti integrati con l'output strutturato per ottenere un oggetto JSON affidabile basato sulle informazioni recuperate da uno strumento.

Python

from google import genai
from pydantic import BaseModel, Field
from typing import Literal, Union

client = genai.Client()

class SpamDetails(BaseModel):
    reason: str = Field(description="The reason why the content is considered spam.")
    spam_type: Literal["phishing", "scam", "unsolicited promotion", "other"]

class NotSpamDetails(BaseModel):
    summary: str = Field(description="A brief summary of the content.")
    is_safe: bool = Field(description="Whether the content is safe for all audiences.")

class ModerationResult(BaseModel):
    decision: Union[SpamDetails, NotSpamDetails]

interaction = client.interactions.create(
    model="gemini-3-pro-preview",
    input="Moderate the following content: 'Congratulations! You've won a free cruise. Click here to claim your prize: www.definitely-not-a-scam.com'",
    response_format=ModerationResult.model_json_schema(),
    tools=[{"type": "url_context"}]
)

parsed_output = ModerationResult.model_validate_json(interaction.outputs[-1].text)
print(parsed_output)

JavaScript

import { GoogleGenAI } from '@google/genai';
import { z } from 'zod'; // Assuming zod is used for schema generation, or define manually
const client = new GoogleGenAI({});

const obj = z.object({
    winning_team: z.string(),
    score: z.string(),
});
const schema = z.toJSONSchema(obj);

const interaction = await client.interactions.create({
    model: 'gemini-3-pro-preview',
    input: 'Who won the last euro?',
    tools: [{ type: 'google_search' }],
    response_format: schema,
});
console.log(interaction.outputs[0].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-pro-preview",
    "input": "Who won the last euro?",
    "tools": [{"type": "google_search"}],
    "response_format": {
        "type": "object",
        "properties": {
            "winning_team": {"type": "string"},
            "score": {"type": "string"}
        }
    }
}'

Funzionalità avanzate

Sono disponibili anche funzionalità avanzate aggiuntive che offrono maggiore flessibilità nell'utilizzo dell'API Interactions.

Streaming

Ricevere le risposte in modo incrementale man mano che vengono generate.

Python

from google import genai

client = genai.Client()

stream = client.interactions.create(
    model="gemini-2.5-flash",
    input="Explain quantum entanglement in simple terms.",
    stream=True
)

for chunk in stream:
    if chunk.event_type == "content.delta":
        if chunk.delta.type == "text":
            print(chunk.delta.text, end="", flush=True)
        elif chunk.delta.type == "thought":
            print(chunk.delta.thought, end="", flush=True)
    elif chunk.event_type == "interaction.complete":
        print(f"\n\n--- Stream Finished ---")
        print(f"Total Tokens: {chunk.interaction.usage.total_tokens}")

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const stream = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'Explain quantum entanglement in simple terms.',
    stream: true,
});

for await (const chunk of stream) {
    if (chunk.event_type === 'content.delta') {
        if (chunk.delta.type === 'text' && 'text' in chunk.delta) {
            process.stdout.write(chunk.delta.text);
        } else if (chunk.delta.type === 'thought' && 'thought' in chunk.delta) {
            process.stdout.write(chunk.delta.thought);
        }
    } else if (chunk.event_type === 'interaction.complete') {
        console.log('\n\n--- Stream Finished ---');
        console.log(`Total Tokens: ${chunk.interaction.usage.total_tokens}`);
    }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions?alt=sse" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "Explain quantum entanglement in simple terms.",
    "stream": true
}'

Configurazione

Personalizza il comportamento del modello con generation_config.

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="Tell me a story about a brave knight.",
    generation_config={
        "temperature": 0.7,
        "max_output_tokens": 500,
        "thinking_level": "low",
    }
)

print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'Tell me a story about a brave knight.',
    generation_config: {
        temperature: 0.7,
        max_output_tokens: 500,
        thinking_level: 'low',
    }
});

console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "Tell me a story about a brave knight.",
    "generation_config": {
        "temperature": 0.7,
        "max_output_tokens": 500,
        "thinking_level": "low"
    }
}'

Utilizzo dei file

Utilizzare i file remoti

Accedi ai file utilizzando gli URL remoti direttamente nella chiamata API.

Python

from google import genai
client = genai.Client()

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input=[
        {
            "type": "image",
            "uri": "https://github.com/<github-path>/cats-and-dogs.jpg",
        },
        {"type": "text", "text": "Describe what you see."}
    ],
)
for output in interaction.outputs:
    if output.type == "text":
        print(output.text)

JavaScript

import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: [
        {
            type: 'image',
            uri: 'https://github.com/<github-path>/cats-and-dogs.jpg',
        },
        { type: 'text', text: 'Describe what you see.' }
    ],
});
for (const output of interaction.outputs) {
    if (output.type === 'text') {
        console.log(output.text);
    }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": [
        {
            "type": "image",
            "uri": "https://github.com/<github-path>/cats-and-dogs.jpg"
        },
        {"type": "text", "text": "Describe what you see."}
    ]
}'

Utilizzo dell'API Gemini Files

Carica i file nell'API Files di Gemini prima di utilizzarli.

Python

from google import genai
import time
import requests
client = genai.Client()

# 1. Download the file
url = "https://github.com/philschmid/gemini-samples/raw/refs/heads/main/assets/cats-and-dogs.jpg"
response = requests.get(url)
with open("cats-and-dogs.jpg", "wb") as f:
    f.write(response.content)

# 2. Upload to Gemini Files API
file = client.files.upload(file="cats-and-dogs.jpg")

# 3. Wait for processing
while client.files.get(name=file.name).state != "ACTIVE":
    time.sleep(2)

# 4. Use in Interaction
interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input=[
        {
            "type": "image",
            "uri": file.uri,
        },
        {"type": "text", "text": "Describe what you see."}
    ],
)
for output in interaction.outputs:
    if output.type == "text":
        print(output.text)

JavaScript

import { GoogleGenAI } from '@google/genai';
import * as fs from 'fs';
import fetch from 'node-fetch';
const client = new GoogleGenAI({});

// 1. Download the file
const url = 'https://github.com/philschmid/gemini-samples/raw/refs/heads/main/assets/cats-and-dogs.jpg';
const filename = 'cats-and-dogs.jpg';
const response = await fetch(url);
const buffer = await response.buffer();
fs.writeFileSync(filename, buffer);

// 2. Upload to Gemini Files API
const myfile = await client.files.upload({ file: filename, config: { mimeType: 'image/jpeg' } });

// 3. Wait for processing
while ((await client.files.get({ name: myfile.name })).state !== 'ACTIVE') {
    await new Promise(resolve => setTimeout(resolve, 2000));
}

// 4. Use in Interaction
const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: [
        { type: 'image', uri: myfile.uri, },
        { type: 'text', text: 'Describe what you see.' }
    ],
});
for (const output of interaction.outputs) {
    if (output.type === 'text') {
        console.log(output.text);
    }
}

REST

# 1. Upload the file (Requires File API setup)
# See https://ai.google.dev/gemini-api/docs/files for details.
# Assume FILE_URI is obtained from the upload step.

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": [
        {"type": "image", "uri": "FILE_URI"},
        {"type": "text", "text": "Describe what you see."}
    ]
}'

Modello dei dati

Puoi scoprire di più sul modello di dati nella documentazione di riferimento dell'API. Di seguito è riportata una panoramica generale dei componenti principali.

Interazione

Proprietà Tipo Descrizione
id string Identificatore univoco dell'interazione.
model/agent string Il modello o l'agente utilizzato. Puoi specificarne solo uno.
input Content[] Gli input forniti.
outputs Content[] Le risposte del modello.
tools Tool[] Gli strumenti utilizzati.
previous_interaction_id string ID dell'interazione precedente per il contesto.
stream boolean Indica se l'interazione è in streaming.
status string Stato: completed, in_progress, requires_action,failed e così via.
background boolean Indica se l'interazione è in modalità in background.
store boolean Se memorizzare l'interazione. Predefinito: true. Imposta su false per disattivare l'opzione.
usage Utilizzo Utilizzo dei token della richiesta di interazione.

Modelli e agenti supportati

Nome modello Tipo ID modello
Gemini 2.5 Pro Modello gemini-2.5-pro
Gemini 2.5 Flash Modello gemini-2.5-flash
Gemini 2.5 Flash-lite Modello gemini-2.5-flash-lite
Anteprima di Gemini 3 Pro Modello gemini-3-pro-preview
Anteprima di Deep Research Agente deep-research-pro-preview-12-2025

Come funziona l'API Interactions

L'API Interactions è progettata intorno a una risorsa centrale: l'Interaction. Un Interaction rappresenta un turno completo in una conversazione o un'attività. Funge da record di sessione e contiene l'intera cronologia di un'interazione, inclusi tutti gli input dell'utente, i pensieri del modello, le chiamate agli strumenti, i risultati degli strumenti e gli output finali del modello.

Quando effettui una chiamata a interactions.create, stai creando una nuova risorsa Interaction.

Se vuoi, puoi utilizzare l'id di questa risorsa in una chiamata successiva utilizzando il parametro previous_interaction_id per continuare la conversazione. Il server utilizza questo ID per recuperare il contesto completo, evitando di dover inviare nuovamente l'intera cronologia chat. La gestione dello stato lato server è facoltativa. Puoi anche operare in modalità stateless inviando la cronologia completa della conversazione in ogni richiesta.

Archiviazione e conservazione dei dati

Per impostazione predefinita, tutti gli oggetti Interaction vengono archiviati (store=true) per semplificare l'utilizzo delle funzionalità di gestione dello stato lato server (con previous_interaction_id), l'esecuzione in background (utilizzando background=true) e per scopi di osservabilità.

  • Livello a pagamento: le interazioni vengono conservate per 55 giorni.
  • Livello senza costi: le interazioni vengono conservate per 1 giorno.

Se non vuoi che questo accada, puoi impostare store=false nella tua richiesta. Questo controllo è separato dalla gestione dello stato; puoi disattivare l'archiviazione per qualsiasi interazione. Tuttavia, tieni presente che store=false non è compatibile con background=true e impedisce l'utilizzo di previous_interaction_id per i turni successivi.

Puoi eliminare le interazioni memorizzate in qualsiasi momento utilizzando il metodo di eliminazione disponibile nella guida di riferimento API. Puoi eliminare le interazioni solo se conosci l'ID interazione.

Al termine del periodo di conservazione, i dati verranno eliminati automaticamente.

Gli oggetti di interazione vengono elaborati in base ai termini.

Best practice

  • Percentuale di successi della cache: l'utilizzo di previous_interaction_id per continuare le conversazioni consente al sistema di utilizzare più facilmente la memorizzazione nella cache implicita per la cronologia delle conversazioni, il che migliora le prestazioni e riduce i costi.
  • Interazioni di mixaggio: hai la flessibilità di combinare le interazioni di Agente e Modello all'interno di una conversazione. Ad esempio, puoi utilizzare un agente specializzato, come l'agente Deep Research, per la raccolta iniziale dei dati, quindi utilizzare un modello Gemini standard per le attività di follow-up, come il riepilogo o la riformattazione, collegando questi passaggi con previous_interaction_id.

SDK

Puoi utilizzare l'ultima versione degli SDK Google GenAI per accedere all'API Interactions.

  • In Python, questo è il pacchetto google-genai dalla versione 1.55.0 in poi.
  • In JavaScript, questo è il pacchetto @google/genai dalla versione 1.33.0 in poi.

Puoi scoprire di più su come installare gli SDK nella pagina Librerie.

Limitazioni

  • Stato beta: l'API Interactions è in versione beta/anteprima. Funzionalità e schemi possono cambiare.
  • Funzionalità non supportate: Le seguenti funzionalità non sono ancora supportate, ma saranno disponibili a breve:

  • Ordine di output: l'ordine dei contenuti per gli strumenti integrati (google_search e url_context) a volte potrebbe essere errato, con il testo che appare prima dell'esecuzione e del risultato dello strumento. Si tratta di un problema noto e la correzione è in corso.

  • Combinazioni di strumenti: la combinazione di MCP, Chiamata di funzione e strumenti integrati non è ancora supportata, ma lo sarà a breve.

  • MCP remoto: Gemini 3 non supporta l'MCP remoto, ma questa funzionalità sarà disponibile a breve.

Modifiche che provocano un errore

L'API Interactions è attualmente in fase beta preliminare. Stiamo sviluppando e perfezionando attivamente le funzionalità delle API, gli schemi delle risorse e le interfacce SDK in base all'utilizzo reale e al feedback degli sviluppatori.

Di conseguenza, potrebbero verificarsi modifiche che causano interruzioni. Gli aggiornamenti possono includere modifiche a:

  • Schemi per input e output.
  • Firme dei metodi SDK e strutture degli oggetti.
  • Comportamenti specifici delle funzionalità.

Per i workload di produzione, devi continuare a utilizzare l'API standard generateContent. Rimane il percorso consigliato per i deployment stabili e continuerà a essere sviluppato e gestito attivamente.

Feedback

Il tuo feedback è fondamentale per lo sviluppo dell'API Interactions. Condividi le tue opinioni, segnala bug o richiedi funzionalità nel nostro forum della community di Google AI Developer.

Passaggi successivi