API Interactions

L'API Interactions (bêta) est une interface unifiée permettant d'interagir avec les modèles et les agents Gemini. Alternative améliorée à l'API generateContent, elle simplifie la gestion de l'état, l'orchestration des outils et les tâches de longue durée. Pour obtenir une vue complète du schéma de l'API, consultez la documentation de référence de l'API. Pendant la version bêta, les fonctionnalités et les schémas sont susceptibles de subir des modifications incompatibles. Pour commencer rapidement, essayez le notebook de démarrage rapide de l'API Interactions.

L'exemple suivant montre comment appeler l'API Interactions avec une requête textuelle.

Python

from google import genai

client = genai.Client()

interaction =  client.interactions.create(
    model="gemini-3-flash-preview",
    input="Tell me a short joke about programming."
)

print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction =  await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: 'Tell me a short joke about programming.',
});

console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-flash-preview",
    "input": "Tell me a short joke about programming."
}'

Interactions de base

L'API Interactions est disponible via nos SDK existants. Le moyen le plus simple d'interagir avec le modèle consiste à fournir un prompt textuel. input peut être une chaîne, une liste contenant des objets de contenu ou une liste de tours avec des rôles et des objets de contenu.

Python

from google import genai

client = genai.Client()

interaction =  client.interactions.create(
    model="gemini-3-flash-preview",
    input="Tell me a short joke about programming."
)

print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction =  await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: 'Tell me a short joke about programming.',
});

console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-flash-preview",
    "input": "Tell me a short joke about programming."
}'

Conversation

Vous pouvez créer des conversations multitours de deux manières :

Avec état, en faisant référence à une interaction précédente
Sans état, en fournissant l'intégralité de l'historique des conversations

Conversation avec état

Transmettez le id de l'interaction précédente au paramètre previous_interaction_id pour poursuivre une conversation.

Python

from google import genai

client = genai.Client()

# 1. First turn
interaction1 = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Hi, my name is Phil."
)
print(f"Model: {interaction1.outputs[-1].text}")

# 2. Second turn (passing previous_interaction_id)
interaction2 = client.interactions.create(
    model="gemini-3-flash-preview",
    input="What is my name?",
    previous_interaction_id=interaction1.id
)
print(f"Model: {interaction2.outputs[-1].text}")

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

// 1. First turn
const interaction1 = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: 'Hi, my name is Phil.'
});
console.log(`Model: ${interaction1.outputs[interaction1.outputs.length - 1].text}`);

// 2. Second turn (passing previous_interaction_id)
const interaction2 = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: 'What is my name?',
    previous_interaction_id: interaction1.id
});
console.log(`Model: ${interaction2.outputs[interaction2.outputs.length - 1].text}`);

REST

# 1. First turn
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-flash-preview",
    "input": "Hi, my name is Phil."
}'

# 2. Second turn (Replace INTERACTION_ID with the ID from the previous interaction)
# curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
# -H "Content-Type: application/json" \
# -H "x-goog-api-key: $GEMINI_API_KEY" \
# -d '{
#     "model": "gemini-3-flash-preview",
#     "input": "What is my name?",
#     "previous_interaction_id": "INTERACTION_ID"
# }'

Récupérer les interactions avec état précédentes

Utiliser l'interaction id pour récupérer les tours de conversation précédents.

Python

previous_interaction = client.interactions.get("<YOUR_INTERACTION_ID>")

print(previous_interaction)

JavaScript

const previous_interaction = await client.interactions.get("<YOUR_INTERACTION_ID>");
console.log(previous_interaction);

REST

curl -X GET "https://generativelanguage.googleapis.com/v1beta/interactions/<YOUR_INTERACTION_ID>" \
-H "x-goog-api-key: $GEMINI_API_KEY"

Inclure l'entrée d'origine

Par défaut, interactions.get() ne renvoie que les sorties du modèle. Pour inclure l'entrée normalisée d'origine dans la réponse, définissez include_input sur true.

Python

interaction = client.interactions.get(
    "<YOUR_INTERACTION_ID>",
    include_input=True
)

print(f"Input: {interaction.input}")
print(f"Output: {interaction.outputs}")

JavaScript

const interaction = await client.interactions.get(
    "<YOUR_INTERACTION_ID>",
    { include_input: true }
);

console.log(`Input: ${JSON.stringify(interaction.input)}`);
console.log(`Output: ${JSON.stringify(interaction.outputs)}`);

REST

curl -X GET "https://generativelanguage.googleapis.com/v1beta/interactions/<YOUR_INTERACTION_ID>?include_input=true" \
-H "x-goog-api-key: $GEMINI_API_KEY"

Conversation sans état

Vous pouvez gérer manuellement l'historique des conversations côté client.

Python

from google import genai

client = genai.Client()

conversation_history = [
    {
        "role": "user",
        "content": "What are the three largest cities in Spain?"
    }
]

interaction1 = client.interactions.create(
    model="gemini-3-flash-preview",
    input=conversation_history
)

print(f"Model: {interaction1.outputs[-1].text}")

conversation_history.append({"role": "model", "content": interaction1.outputs})
conversation_history.append({
    "role": "user",
    "content": "What is the most famous landmark in the second one?"
})

interaction2 = client.interactions.create(
    model="gemini-3-flash-preview",
    input=conversation_history
)

print(f"Model: {interaction2.outputs[-1].text}")

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const conversationHistory = [
    {
        role: 'user',
        content: "What are the three largest cities in Spain?"
    }
];

const interaction1 = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: conversationHistory
});

console.log(`Model: ${interaction1.outputs[interaction1.outputs.length - 1].text}`);

conversationHistory.push({ role: 'model', content: interaction1.outputs });
conversationHistory.push({
    role: 'user',
    content: "What is the most famous landmark in the second one?"
});

const interaction2 = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: conversationHistory
});

console.log(`Model: ${interaction2.outputs[interaction2.outputs.length - 1].text}`);

REST

 curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
 -H "Content-Type: application/json" \
 -H "x-goog-api-key: $GEMINI_API_KEY" \
 -d '{
    "model": "gemini-3-flash-preview",
    "input": [
        {
            "role": "user",
            "content": "What are the three largest cities in Spain?"
        },
        {
            "role": "model",
            "content": "The three largest cities in Spain are Madrid, Barcelona, and Valencia."
        },
        {
            "role": "user",
            "content": "What is the most famous landmark in the second one?"
        }
    ]
}'

Fonctionnalités multimodales

Vous pouvez utiliser l'API Interactions pour des cas d'utilisation multimodaux tels que la compréhension d'images ou la génération de vidéos.

Compréhension multimodale

Vous pouvez fournir des entrées multimodales sous forme de données encodées en base64 en ligne, en utilisant l'API Files pour les fichiers plus volumineux ou en transmettant un lien accessible au public dans le champ "uri". Les exemples de code suivants illustrent la méthode d'URL publique.

Compréhension des images

Python

from google import genai
client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input=[
        {"type": "text", "text": "Describe the image."},
        {
            "type": "image",
            "uri": "YOUR_URL",
            "mime_type": "image/png"
        }
    ]
)
print(interaction.outputs[-1].text)

JavaScript

import {GoogleGenAI} from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: [
        {type: 'text', text: 'Describe the image.'},
        {
            type: 'image',
            uri: 'YOUR_URL',
            mime_type: 'image/png'
        }
    ]
});
console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "gemini-3-flash-preview",
    "input": [
    {
        "type": "text",
        "text": "Describe the image."
    },
    {
        "type": "image",
        "uri": "YOUR_URL",
        "mime_type": "image/png"
    }
    ]
}'

Compréhension audio

Python

from google import genai
client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input=[
        {"type": "text", "text": "What does this audio say?"},
        {
            "type": "audio",
            "uri": "YOUR_URL",
            "mime_type": "audio/wav"
        }
    ]
)
print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: [
        { type: 'text', text: 'What does this audio say?' },
        {
            type: 'audio',
            uri: 'YOUR_URL',
            mime_type: 'audio/wav'
        }
    ]
});

console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-flash-preview",
    "input": [
        {"type": "text", "text": "What does this audio say?"},
        {
            "type": "audio",
            "uri": "YOUR_URL",
            "mime_type": "audio/wav"
        }
    ]
}'

Compréhension des vidéos

Python

from google import genai
client = genai.Client()

print("Analyzing video...")
interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input=[
        {"type": "text", "text": "What is happening in this video? Provide a timestamped summary."},
        {
            "type": "video",
            "uri": "YOUR_URL",
            "mime_type": "video/mp4"
        }
    ]
)

print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

console.log('Analyzing video...');
const interaction = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: [
        { type: 'text', text: 'What is happening in this video? Provide a timestamped summary.' },
        {
            type: 'video',
            uri: 'YOUR_URL',
            mime_type: 'video/mp4'
        }
    ]
});

console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-flash-preview",
    "input": [
        {"type": "text", "text": "What is happening in this video?"},
        {
            "type": "video",
            "uri": "YOUR_URL",
            "mime_type": "video/mp4"
        }
    ]
}'

Compréhension des documents (PDF)

Python

from google import genai
client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input=[
        {"type": "text", "text": "What is this document about?"},
        {
            "type": "document",
            "uri": "YOUR_URL",
            "mime_type": "application/pdf"
        }
    ]
)
print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: [
        { type: 'text', text: 'What is this document about?' },
        {
            type: 'document',
            uri: 'YOUR_URL',
            mime_type: 'application/pdf'
        }
    ],
});
console.log(interaction.outputs[0].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-flash-preview",
    "input": [
        {"type": "text", "text": "What is this document about?"},
        {
            "type": "document",
            "uri": "YOUR_URL",
            "mime_type": "application/pdf"
        }
    ]
}'

Génération multimodale

Vous pouvez utiliser l'API Interactions pour générer des sorties multimodales.

Génération d'images

Python

import base64
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3-pro-image-preview",
    input="Generate an image of a futuristic city.",
    response_modalities=["IMAGE"]
)

for output in interaction.outputs:
    if output.type == "image":
        print(f"Generated image with mime_type: {output.mime_type}")
        # Save the image
        with open("generated_city.png", "wb") as f:
            f.write(base64.b64decode(output.data))

JavaScript

import { GoogleGenAI } from '@google/genai';
import * as fs from 'fs';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-3-pro-image-preview',
    input: 'Generate an image of a futuristic city.',
    response_modalities: ['IMAGE']
});

for (const output of interaction.outputs) {
    if (output.type === 'image') {
        console.log(`Generated image with mime_type: ${output.mime_type}`);
        // Save the image
        fs.writeFileSync('generated_city.png', Buffer.from(output.data, 'base64'));
    }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-pro-image-preview",
    "input": "Generate an image of a futuristic city.",
    "response_modalities": ["IMAGE"]
}'

Configurer la sortie d'image

Vous pouvez personnaliser les images générées à l'aide de image_config dans generation_config pour contrôler les proportions et la résolution.

Paramètre	Options	Description
`aspect_ratio`	`1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9`	Contrôle le rapport largeur/hauteur de l'image de sortie.
`image_size`	`1k`, `2k`, `4k`	Définit la résolution de l'image de sortie.

Python

import base64
from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3-pro-image-preview",
    input="Generate an image of a futuristic city.",
    generation_config={
        "image_config": {
            "aspect_ratio": "9:16",
            "image_size": "2k"
        }
    }
)

for output in interaction.outputs:
    if output.type == "image":
        print(f"Generated image with mime_type: {output.mime_type}")
        # Save the image
        with open("generated_city.png", "wb") as f:
            f.write(base64.b64decode(output.data))

JavaScript

import { GoogleGenAI } from '@google/genai';
import * as fs from 'fs';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-3-pro-image-preview',
    input: 'Generate an image of a futuristic city.',
    generation_config: {
        image_config: {
            aspect_ratio: '9:16',
            image_size: '2k'
        }
    }
});

for (const output of interaction.outputs) {
    if (output.type === 'image') {
        console.log(`Generated image with mime_type: ${output.mime_type}`);
        // Save the image
        fs.writeFileSync('generated_city.png', Buffer.from(output.data, 'base64'));
    }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-pro-image-preview",
    "input": "Generate an image of a futuristic city.",
    "generation_config": {
        "image_config": {
            "aspect_ratio": "9:16",
            "image_size": "2k"
        }
    }
}'

Génération de voix

Générez un discours naturel à partir de texte à l'aide du modèle de synthèse vocale. Configurez les paramètres de voix, de langue et de locuteur avec le paramètre speech_config
.

Python

import base64
from google import genai
import wave

# Set up the wave file to save the output:
def wave_file(filename, pcm, channels=1, rate=24000, sample_width=2):
    with wave.open(filename, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(rate)
        wf.writeframes(pcm)

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-2.5-flash-preview-tts",
    input="Say the following: WOOHOO This is so much fun!.",
    response_modalities=["AUDIO"],
    generation_config={
        "speech_config": {
            "language": "en-us",
            "voice": "kore"
        }
    }
)

for output in interaction.outputs:
    if output.type == "audio":
        print(f"Generated audio with mime_type: {output.mime_type}")
        # Save the audio as wave file to the current directory.
        wave_file("generated_audio.wav", base64.b64decode(output.data))

JavaScript

import { GoogleGenAI } from '@google/genai';
import * as fs from 'fs';
import wav from 'wav';

async function saveWaveFile(
    filename,
    pcmData,
    channels = 1,
    rate = 24000,
    sampleWidth = 2,
) {
    return new Promise((resolve, reject) => {
        const writer = new wav.FileWriter(filename, {
                channels,
                sampleRate: rate,
                bitDepth: sampleWidth * 8,
        });

        writer.on('finish', resolve);
        writer.on('error', reject);

        writer.write(pcmData);
        writer.end();
    });
}

async function main() {
    const GEMINI_API_KEY = process.env.GEMINI_API_KEY;
    const client = new GoogleGenAI({apiKey: GEMINI_API_KEY});

    const interaction = await client.interactions.create({
        model: 'gemini-2.5-flash-preview-tts',
        input: 'Say the following: WOOHOO This is so much fun!.',
        response_modalities: ['AUDIO'],
        generation_config: {
            speech_config: {
                language: "en-us",
                voice: "kore"
            }
        }
    });

    for (const output of interaction.outputs) {
        if (output.type === 'audio') {
            console.log(`Generated audio with mime_type: ${output.mime_type}`);
            const audioBuffer = Buffer.from(output.data, 'base64');
            // Save the audio as wave file to the current directory
            await saveWaveFile("generated_audio.wav", audioBuffer);
        }
    }
}
await main();

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash-preview-tts",
    "input": "Say the following: WOOHOO This is so much fun!.",
    "response_modalities": ["AUDIO"],
    "generation_config": {
        "speech_config": {
            "language": "en-us",
            "voice": "kore"
        }
    }
}' | jq -r '.outputs[] | select(.type == "audio") | .data' | base64 -d > generated_audio.pcm
# You may need to install ffmpeg.
ffmpeg -f s16le -ar 24000 -ac 1 -i generated_audio.pcm generated_audio.wav

Génération de la parole pour plusieurs locuteurs

Générez de la parole avec plusieurs locuteurs en spécifiant les noms des locuteurs dans l'invite et en les faisant correspondre dans speech_config.

Le nom des locuteurs doit figurer dans la requête :

TTS the following conversation between Alice and Bob:
Alice: Hi Bob, how are you doing today?
Bob: I'm doing great, thanks for asking! How about you?
Alice: Fantastic! I just learned about the Gemini API.

Configurez ensuite speech_config avec les enceintes correspondantes :

"generation_config": {
    "speech_config": [
        {"voice": "Zephyr", "speaker": "Alice", "language": "en-US"},
        {"voice": "Puck", "speaker": "Bob", "language": "en-US"}
    ]
}

Capacités agentives

L'API Interactions est conçue pour créer des agents et interagir avec eux. Elle est compatible avec l'appel de fonctions, les outils intégrés, les sorties structurées et le protocole MCP (Model Context Protocol).

Agents

Vous pouvez utiliser des agents spécialisés tels que deep-research-pro-preview-12-2025 pour les tâches complexes. Pour en savoir plus sur l'agent Deep Research de Gemini, consultez le guide Deep Research.

Python

import time
from google import genai

client = genai.Client()

# 1. Start the Deep Research Agent
initial_interaction = client.interactions.create(
    input="Research the history of the Google TPUs with a focus on 2025 and 2026.",
    agent="deep-research-pro-preview-12-2025",
    background=True
)

print(f"Research started. Interaction ID: {initial_interaction.id}")

# 2. Poll for results
while True:
    interaction = client.interactions.get(initial_interaction.id)
    print(f"Status: {interaction.status}")

    if interaction.status == "completed":
        print("\nFinal Report:\n", interaction.outputs[-1].text)
        break
    elif interaction.status in ["failed", "cancelled"]:
        print(f"Failed with status: {interaction.status}")
        break

    time.sleep(10)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

// 1. Start the Deep Research Agent
const initialInteraction = await client.interactions.create({
    input: 'Research the history of the Google TPUs with a focus on 2025 and 2026.',
    agent: 'deep-research-pro-preview-12-2025',
    background: true
});

console.log(`Research started. Interaction ID: ${initialInteraction.id}`);

// 2. Poll for results
while (true) {
    const interaction = await client.interactions.get(initialInteraction.id);
    console.log(`Status: ${interaction.status}`);

    if (interaction.status === 'completed') {
        console.log('\nFinal Report:\n', interaction.outputs[interaction.outputs.length - 1].text);
        break;
    } else if (['failed', 'cancelled'].includes(interaction.status)) {
        console.log(`Failed with status: ${interaction.status}`);
        break;
    }

    await new Promise(resolve => setTimeout(resolve, 10000));
}

REST

# 1. Start the Deep Research Agent
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "input": "Research the history of the Google TPUs with a focus on 2025 and 2026.",
    "agent": "deep-research-pro-preview-12-2025",
    "background": true
}'

# 2. Poll for results (Replace INTERACTION_ID with the ID from the previous interaction)
# curl -X GET "https://generativelanguage.googleapis.com/v1beta/interactions/INTERACTION_ID" \
# -H "x-goog-api-key: $GEMINI_API_KEY"

Outils et appels de fonction

Cette section explique comment utiliser l'appel de fonction pour définir des outils personnalisés et comment utiliser les outils intégrés de Google dans l'API Interactions.

Appel de fonction

Python

from google import genai

client = genai.Client()

# 1. Define the tool
def get_weather(location: str):
    """Gets the weather for a given location."""
    return f"The weather in {location} is sunny."

weather_tool = {
    "type": "function",
    "name": "get_weather",
    "description": "Gets the weather for a given location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}
        },
        "required": ["location"]
    }
}

# 2. Send the request with tools
interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input="What is the weather in Paris?",
    tools=[weather_tool]
)

# 3. Handle the tool call
for output in interaction.outputs:
    if output.type == "function_call":
        print(f"Tool Call: {output.name}({output.arguments})")
        # Execute tool
        result = get_weather(**output.arguments)

        # Send result back
        interaction = client.interactions.create(
            model="gemini-3-flash-preview",
            previous_interaction_id=interaction.id,
            input=[{
                "type": "function_result",
                "name": output.name,
                "call_id": output.id,
                "result": result
            }]
        )
        print(f"Response: {interaction.outputs[-1].text}")

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

// 1. Define the tool
const weatherTool = {
    type: 'function',
    name: 'get_weather',
    description: 'Gets the weather for a given location.',
    parameters: {
        type: 'object',
        properties: {
            location: { type: 'string', description: 'The city and state, e.g. San Francisco, CA' }
        },
        required: ['location']
    }
};

// 2. Send the request with tools
let interaction = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: 'What is the weather in Paris?',
    tools: [weatherTool]
});

// 3. Handle the tool call
for (const output of interaction.outputs) {
    if (output.type === 'function_call') {
        console.log(`Tool Call: ${output.name}(${JSON.stringify(output.arguments)})`);

        // Execute tool (Mocked)
        const result = `The weather in ${output.arguments.location} is sunny.`;

        // Send result back
        interaction = await client.interactions.create({
            model: 'gemini-3-flash-preview',
            previous_interaction_id:interaction.id,
            input: [{
                type: 'function_result',
                name: output.name,
                call_id: output.id,
                result: result
            }]
        });
        console.log(`Response: ${interaction.outputs[interaction.outputs.length - 1].text}`);
    }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-flash-preview",
    "input": "What is the weather in Paris?",
    "tools": [{
        "type": "function",
        "name": "get_weather",
        "description": "Gets the weather for a given location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}
            },
            "required": ["location"]
        }
    }]
}'

# Handle the tool call and send result back (Replace INTERACTION_ID and CALL_ID)
# curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
# -H "Content-Type: application/json" \
# -H "x-goog-api-key: $GEMINI_API_KEY" \
# -d '{
#     "model": "gemini-3-flash-preview",
#     "previous_interaction_id": "INTERACTION_ID",
#     "input": [{
#         "type": "function_result",
#         "name": "get_weather",
#         "call_id": "FUNCTION_CALL_ID",
#         "result": "The weather in Paris is sunny."
#     }]
# }'

Appel de fonction avec état côté client

Si vous ne souhaitez pas utiliser l'état côté serveur, vous pouvez tout gérer côté client.

Python

from google import genai
client = genai.Client()

functions = [
    {
        "type": "function",
        "name": "schedule_meeting",
        "description": "Schedules a meeting with specified attendees at a given time and date.",
        "parameters": {
            "type": "object",
            "properties": {
                "attendees": {"type": "array", "items": {"type": "string"}},
                "date": {"type": "string", "description": "Date of the meeting (e.g., 2024-07-29)"},
                "time": {"type": "string", "description": "Time of the meeting (e.g., 15:00)"},
                "topic": {"type": "string", "description": "The subject of the meeting."},
            },
            "required": ["attendees", "date", "time", "topic"],
        },
    }
]

history = [{"role": "user","content": [{"type": "text", "text": "Schedule a meeting for 2025-11-01 at 10 am with Peter and Amir about the Next Gen API."}]}]

# 1. Model decides to call the function
interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input=history,
    tools=functions
)

# add model interaction back to history
history.append({"role": "model", "content": interaction.outputs})

for output in interaction.outputs:
    if output.type == "function_call":
        print(f"Function call: {output.name} with arguments {output.arguments}")

        # 2. Execute the function and get a result
        # In a real app, you would call your function here.
        # call_result = schedule_meeting(**json.loads(output.arguments))
        call_result = "Meeting scheduled successfully."

        # 3. Send the result back to the model
        history.append({"role": "user", "content": [{"type": "function_result", "name": output.name, "call_id": output.id, "result": call_result}]})

        interaction2 = client.interactions.create(
            model="gemini-3-flash-preview",
            input=history,
        )
        print(f"Final response: {interaction2.outputs[-1].text}")
    else:
        print(f"Output: {output}")

JavaScript

// 1. Define the tool
const functions = [
    {
        type: 'function',
        name: 'schedule_meeting',
        description: 'Schedules a meeting with specified attendees at a given time and date.',
        parameters: {
            type: 'object',
            properties: {
                attendees: { type: 'array', items: { type: 'string' } },
                date: { type: 'string', description: 'Date of the meeting (e.g., 2024-07-29)' },
                time: { type: 'string', description: 'Time of the meeting (e.g., 15:00)' },
                topic: { type: 'string', description: 'The subject of the meeting.' },
            },
            required: ['attendees', 'date', 'time', 'topic'],
        },
    },
];

const history = [
    { role: 'user', content: [{ type: 'text', text: 'Schedule a meeting for 2025-11-01 at 10 am with Peter and Amir about the Next Gen API.' }] }
];

// 2. Model decides to call the function
let interaction = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: history,
    tools: functions
});

// add model interaction back to history
history.push({ role: 'model', content: interaction.outputs });

for (const output of interaction.outputs) {
    if (output.type === 'function_call') {
        console.log(`Function call: ${output.name} with arguments ${JSON.stringify(output.arguments)}`);

        // 3. Send the result back to the model
        history.push({ role: 'user', content: [{ type: 'function_result', name: output.name, call_id: output.id, result: 'Meeting scheduled successfully.' }] });

        const interaction2 = await client.interactions.create({
            model: 'gemini-3-flash-preview',
            input: history,
        });
        console.log(`Final response: ${interaction2.outputs[interaction2.outputs.length - 1].text}`);
    }
}

Résultats de fonction multimodaux

Le champ result d'un function_result accepte une chaîne de caractères simple ou un tableau d'objets TextContent et ImageContent. Cela vous permet de renvoyer des images, telles que des captures d'écran ou des graphiques, en plus du texte de vos appels de fonction. Le modèle peut ainsi raisonner sur la sortie visuelle.

Python

import base64
from google import genai

client = genai.Client()

functions = [
    {
        "type": "function",
        "name": "take_screenshot",
        "description": "Takes a screenshot of a specified website.",
        "parameters": {
            "type": "object",
            "properties": {
                "url": {"type": "string", "description": "The URL to take a screenshot of."},
            },
            "required": ["url"],
        },
    }
]

# 1. Model decides to call the function
interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Can you take a screenshot of https://google.com and tell me what you see?",
    tools=functions
)

for output in interaction.outputs:
    if output.type == "function_call":
        print(f"Function call: {output.name}({output.arguments})")

        # 2. Execute the function and load the image
        # Replace with actual function call, pseudo code for reading image from disk
        with open("screenshot.png", "rb") as f:
            base64_image = base64.b64encode(f.read()).decode("utf-8")

        # 3. Return a multimodal result (text + image)
        call_result = [
            {"type": "text", "text": "Screenshot captured successfully."},
            {"type": "image", "mime_type": "image/png", "data": base64_image}
        ]

        response = client.interactions.create(
            model="gemini-3-flash-preview",
            tools=functions,
            previous_interaction_id=interaction.id,
            input=[{
                "type": "function_result",
                "name": output.name,
                "call_id": output.id,
                "result": call_result
            }]
        )
        print(f"Response: {response.outputs[-1].text}")

JavaScript

import { GoogleGenAI } from '@google/genai';
import * as fs from 'fs';

const client = new GoogleGenAI({});

const functions = [
    {
        type: 'function',
        name: 'take_screenshot',
        description: 'Takes a screenshot of a specified website.',
        parameters: {
            type: 'object',
            properties: {
                url: { type: 'string', description: 'The URL to take a screenshot of.' },
            },
            required: ['url'],
        },
    }
];

// 1. Model decides to call the function
let interaction = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: 'Can you take a screenshot of https://google.com and tell me what you see?',
    tools: functions
});

for (const output of interaction.outputs) {
    if (output.type === 'function_call') {
        console.log(`Function call: ${output.name}(${JSON.stringify(output.arguments)})`);

        // 2. Execute the function and load the image
        // Replace with actual function call, pseudo code for reading image from disk
        const base64Image = fs.readFileSync('screenshot.png').toString('base64');

        // 3. Return a multimodal result (text + image)
        const callResult = [
            { type: 'text', text: 'Screenshot captured successfully.' },
            { type: 'image', mime_type: 'image/png', data: base64Image }
        ];

        const response = await client.interactions.create({
            model: 'gemini-3-flash-preview',
            tools: functions,
            previous_interaction_id: interaction.id,
            input: [{
                type: 'function_result',
                name: output.name,
                call_id: output.id,
                result: callResult
            }]
        });
        console.log(`Response: ${response.outputs[response.outputs.length - 1].text}`);
    }
}

REST

# 1. Send request with tools (will return a function_call)
curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-flash-preview",
    "input": "Can you take a screenshot of https://google.com and tell me what you see?",
    "tools": [{
        "type": "function",
        "name": "take_screenshot",
        "description": "Takes a screenshot of a specified website.",
        "parameters": {
            "type": "object",
            "properties": {
                "url": {"type": "string", "description": "The URL to take a screenshot of."}
            },
            "required": ["url"]
        }
    }]
}'

# 2. Send multimodal result back (Replace INTERACTION_ID, CALL_ID, and BASE64_IMAGE)
# curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
# -H "Content-Type: application/json" \
# -H "x-goog-api-key: $GEMINI_API_KEY" \
# -d '{
#     "model": "gemini-3-flash-preview",
#     "tools": [{"type": "function", "name": "take_screenshot", "description": "Takes a screenshot of a specified website.", "parameters": {"type": "object", "properties": {"url": {"type": "string"}}, "required": ["url"]}}],
#     "previous_interaction_id": "INTERACTION_ID",
#     "input": [{
#         "type": "function_result",
#         "name": "take_screenshot",
#         "call_id": "CALL_ID",
#         "result": [
#             {"type": "text", "text": "Screenshot captured successfully."},
#             {"type": "image", "mime_type": "image/png", "data": "BASE64_IMAGE"}
#         ]
#     }]
# }'

Outils intégrés

Gemini est fourni avec des outils intégrés tels que l'ancrage avec la recherche Google, l'exécution de code, le contexte d'URL et l'utilisation de l'ordinateur.

Ancrage avec la recherche Google

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Who won the last Super Bowl?",
    tools=[{"type": "google_search"}]
)
# Find the text output (not the GoogleSearchResultContent)
text_output = next((o for o in interaction.outputs if o.type == "text"), None)
if text_output:
    print(text_output.text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: 'Who won the last Super Bowl?',
    tools: [{ type: 'google_search' }]
});
// Find the text output (not the GoogleSearchResultContent)
const textOutput = interaction.outputs.find(o => o.type === 'text');
if (textOutput) console.log(textOutput.text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-flash-preview",
    "input": "Who won the last Super Bowl?",
    "tools": [{"type": "google_search"}]
}'

Exécution du code

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Calculate the 50th Fibonacci number.",
    tools=[{"type": "code_execution"}]
)
print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: 'Calculate the 50th Fibonacci number.',
    tools: [{ type: 'code_execution' }]
});
console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-flash-preview",
    "input": "Calculate the 50th Fibonacci number.",
    "tools": [{"type": "code_execution"}]
}'

Contexte de l'URL

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Summarize the content of https://www.wikipedia.org/",
    tools=[{"type": "url_context"}]
)
# Find the text output (not the URLContextResultContent)
text_output = next((o for o in interaction.outputs if o.type == "text"), None)
if text_output:
    print(text_output.text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: 'Summarize the content of https://www.wikipedia.org/',
    tools: [{ type: 'url_context' }]
});
// Find the text output (not the URLContextResultContent)
const textOutput = interaction.outputs.find(o => o.type === 'text');
if (textOutput) console.log(textOutput.text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-flash-preview",
    "input": "Summarize the content of https://www.wikipedia.org/",
    "tools": [{"type": "url_context"}]
}'

Utilisation d'un ordinateur

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-2.5-computer-use-preview-10-2025",
    input="Search for highly rated smart fridges with touchscreen, 2 doors, around 25 cu ft, priced below 4000 dollars on Google Shopping. Create a bulleted list of the 3 cheapest options in the format of name, description, price in an easy-to-read layout.",
    tools=[{
        "type": "computer_use",
        "environment": "browser",
        "excludedPredefinedFunctions": ["drag_and_drop"]
    }]
)

# The response will contain tool calls (actions) for the computer interface
# or text explaining the action
for output in interaction.outputs:
    print(output)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-2.5-computer-use-preview-10-2025',
    input: 'Search for highly rated smart fridges with touchscreen, 2 doors, around 25 cu ft, priced below 4000 dollars on Google Shopping. Create a bulleted list of the 3 cheapest options in the format of name, description, price in an easy-to-read layout.',
    tools: [{
        type: 'computer_use',
        environment: 'browser',
        excludedPredefinedFunctions: ['drag_and_drop']
    }]
});

// The response will contain tool calls (actions) for the computer interface
// or text explaining the action
interaction.outputs.forEach(output => console.log(output));

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-computer-use-preview-10-2025",
    "input": "Search for highly rated smart fridges with touchscreen, 2 doors, around 25 cu ft, priced below 4000 dollars on Google Shopping. Create a bulleted list of the 3 cheapest options in the format of name, description, price in an easy-to-read layout.",
    "tools": [{
        "type": "computer_use",
        "environment": "browser",
        "excludedPredefinedFunctions": ["drag_and_drop"]
    }]
}'

Protocole MCP (Model Context Protocol) à distance

L'intégration MCP à distance simplifie le développement d'agents en permettant à l'API Gemini d'appeler directement des outils externes hébergés sur des serveurs distants.

Python

import datetime
from google import genai

client = genai.Client()

mcp_server = {
    "type": "mcp_server",
    "name": "weather_service",
    "url": "https://gemini-api-demos.uc.r.appspot.com/mcp"
}

today = datetime.date.today().strftime("%d %B %Y")

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="What is the weather like in New York today?",
    tools=[mcp_server],
    system_instruction=f"Today is {today}."
)

print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const mcpServer = {
    type: 'mcp_server',
    name: 'weather_service',
    url: 'https://gemini-api-demos.uc.r.appspot.com/mcp'
};

const today = new Date().toDateString();

const interaction = await client.interactions.create({
    model: 'gemini-2.5-flash',
    input: 'What is the weather like in New York today?',
    tools: [mcpServer],
    system_instruction: `Today is ${today}.`
});

console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-2.5-flash",
    "input": "What is the weather like in New York today?",
    "tools": [{
        "type": "mcp_server",
        "name": "weather_service",
        "url": "https://gemini-api-demos.uc.r.appspot.com/mcp"
    }],
    "system_instruction": "Today is '"$(date +"%du%Bt%Y")"' YYYY-MM-DD>."
}'

Remarques importantes :

Le MCP à distance ne fonctionne qu'avec les serveurs HTTP Streamable (les serveurs SSE ne sont pas compatibles).
Le MCP à distance ne fonctionne pas avec les modèles Gemini 3 (cette fonctionnalité sera bientôt disponible).
Les noms de serveur MCP ne doivent pas inclure le caractère "-" (utilisez plutôt des noms de serveur snake_case).

Sortie structurée (schéma JSON)

Appliquez un format de sortie JSON spécifique en fournissant un schéma JSON dans le paramètre response_format. Cela est utile pour des tâches telles que la modération, la classification ou l'extraction de données.

Python

from google import genai
from pydantic import BaseModel, Field
from typing import Literal, Union
client = genai.Client()

class SpamDetails(BaseModel):
    reason: str = Field(description="The reason why the content is considered spam.")
    spam_type: Literal["phishing", "scam", "unsolicited promotion", "other"]

class NotSpamDetails(BaseModel):
    summary: str = Field(description="A brief summary of the content.")
    is_safe: bool = Field(description="Whether the content is safe for all audiences.")

class ModerationResult(BaseModel):
    decision: Union[SpamDetails, NotSpamDetails]

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Moderate the following content: 'Congratulations! You've won a free cruise. Click here to claim your prize: www.definitely-not-a-scam.com'",
    response_format=ModerationResult.model_json_schema(),
)

parsed_output = ModerationResult.model_validate_json(interaction.outputs[-1].text)
print(parsed_output)

JavaScript

import { GoogleGenAI } from '@google/genai';
import { z } from 'zod';
const client = new GoogleGenAI({});

const moderationSchema = z.object({
    decision: z.union([
        z.object({
            reason: z.string().describe('The reason why the content is considered spam.'),
            spam_type: z.enum(['phishing', 'scam', 'unsolicited promotion', 'other']).describe('The type of spam.'),
        }).describe('Details for content classified as spam.'),
        z.object({
            summary: z.string().describe('A brief summary of the content.'),
            is_safe: z.boolean().describe('Whether the content is safe for all audiences.'),
        }).describe('Details for content classified as not spam.'),
    ]),
});

const interaction = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: "Moderate the following content: 'Congratulations! You've won a free cruise. Click here to claim your prize: www.definitely-not-a-scam.com'",
    response_format: z.toJSONSchema(moderationSchema),
});
console.log(interaction.outputs[0].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-flash-preview",
    "input": "Moderate the following content: 'Congratulations! You've won a free cruise. Click here to claim your prize: www.definitely-not-a-scam.com'",
    "response_format": {
        "type": "object",
        "properties": {
            "decision": {
                "type": "object",
                "properties": {
                    "reason": {"type": "string", "description": "The reason why the content is considered spam."},
                    "spam_type": {"type": "string", "description": "The type of spam."}
                },
                "required": ["reason", "spam_type"]
            }
        },
        "required": ["decision"]
    }
}'

Combiner des outils et une sortie structurée

Combinez des outils intégrés à une sortie structurée pour obtenir un objet JSON fiable basé sur les informations récupérées par un outil.

Python

from google import genai
from pydantic import BaseModel, Field
from typing import Literal, Union

client = genai.Client()

class SpamDetails(BaseModel):
    reason: str = Field(description="The reason why the content is considered spam.")
    spam_type: Literal["phishing", "scam", "unsolicited promotion", "other"]

class NotSpamDetails(BaseModel):
    summary: str = Field(description="A brief summary of the content.")
    is_safe: bool = Field(description="Whether the content is safe for all audiences.")

class ModerationResult(BaseModel):
    decision: Union[SpamDetails, NotSpamDetails]

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Moderate the following content: 'Congratulations! You've won a free cruise. Click here to claim your prize: www.definitely-not-a-scam.com'",
    response_format=ModerationResult.model_json_schema(),
    tools=[{"type": "url_context"}]
)

parsed_output = ModerationResult.model_validate_json(interaction.outputs[-1].text)
print(parsed_output)

JavaScript

import { GoogleGenAI } from '@google/genai';
import { z } from 'zod'; // Assuming zod is used for schema generation, or define manually
const client = new GoogleGenAI({});

const obj = z.object({
    winning_team: z.string(),
    score: z.string(),
});
const schema = z.toJSONSchema(obj);

const interaction = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: 'Who won the last euro?',
    tools: [{ type: 'google_search' }],
    response_format: schema,
});
console.log(interaction.outputs[0].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-flash-preview",
    "input": "Who won the last euro?",
    "tools": [{"type": "google_search"}],
    "response_format": {
        "type": "object",
        "properties": {
            "winning_team": {"type": "string"},
            "score": {"type": "string"}
        }
    }
}'

Fonctionnalités avancées

Des fonctionnalités avancées supplémentaires vous offrent également plus de flexibilité pour travailler avec l'API Interactions.

Streaming

Recevez les réponses de manière incrémentielle à mesure qu'elles sont générées.

Python

from google import genai

client = genai.Client()

stream = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Explain quantum entanglement in simple terms.",
    stream=True
)

for chunk in stream:
    if chunk.event_type == "content.delta":
        if chunk.delta.type == "text":
            print(chunk.delta.text, end="", flush=True)
        elif chunk.delta.type == "thought":
            print(chunk.delta.thought, end="", flush=True)
    elif chunk.event_type == "interaction.complete":
        print(f"\n\n--- Stream Finished ---")
        print(f"Total Tokens: {chunk.interaction.usage.total_tokens}")

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const stream = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: 'Explain quantum entanglement in simple terms.',
    stream: true,
});

for await (const chunk of stream) {
    if (chunk.event_type === 'content.delta') {
        if (chunk.delta.type === 'text' && 'text' in chunk.delta) {
            process.stdout.write(chunk.delta.text);
        } else if (chunk.delta.type === 'thought' && 'thought' in chunk.delta) {
            process.stdout.write(chunk.delta.thought);
        }
    } else if (chunk.event_type === 'interaction.complete') {
        console.log('\n\n--- Stream Finished ---');
        console.log(`Total Tokens: ${chunk.interaction.usage.total_tokens}`);
    }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions?alt=sse" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-flash-preview",
    "input": "Explain quantum entanglement in simple terms.",
    "stream": true
}'

Types d'événements de streaming

Lorsque le streaming est activé, l'API renvoie des événements envoyés par le serveur (SSE). Chaque événement comporte un champ event_type indiquant son objectif. La liste complète des types d'événements est disponible dans la documentation de référence de l'API.

Type d'événement	Description
`interaction.start`	Premier événement. Contient l'interaction `id` et l'initiale `status` (`in_progress`).
`interaction.status_update`	Indique les changements d'état (par exemple, `in_progress`).
`content.start`	Marque le début d'un nouveau bloc de sortie. Contient `index` et du contenu `type` (par exemple, `text`, `thought`).
`content.delta`	Mises à jour incrémentielles du contenu. Contient les données partielles associées à la clé `delta.type`.
`content.stop`	Marque la fin d'un bloc de sortie à `index`.
`interaction.complete`	Événement final. Contient `id`, `status`, `usage` et des métadonnées. Remarque : `outputs` est `None`. Vous devez reconstruire les sorties à partir des événements `content.*`.
`error`	Indique qu'une erreur s'est produite. Contient `error.code` et `error.message`.

Reconstruire l'objet Interaction à partir d'événements de streaming

Contrairement aux réponses sans streaming, les réponses avec streaming ne contiennent pas de tableau outputs. Vous devez reconstruire les sorties en accumulant le contenu des événements content.delta.

Python

from google import genai

client = genai.Client()

stream = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Write a haiku about Python programming.",
    stream=True
)

# Accumulate outputs by index
outputs = {}
usage = None

for chunk in stream:
    if chunk.event_type == "content.start":
        outputs[chunk.index] = {"type": chunk.content.type}

    elif chunk.event_type == "content.delta":
        output = outputs[chunk.index]
        if chunk.delta.type == "text":
            output["text"] = output.get("text", "") + chunk.delta.text
        elif chunk.delta.type == "thought_signature":
            output["signature"] = chunk.delta.signature
        elif chunk.delta.type == "thought_summary":
            output["summary"] = output.get("summary", "") + getattr(chunk.delta.content, "text", "")

    elif chunk.event_type == "interaction.complete":
        usage = chunk.interaction.usage

# Final outputs list (sorted by index)
final_outputs = [outputs[i] for i in sorted(outputs.keys())]
print(f"\n\nOutputs: {final_outputs}")

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const stream = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: 'Write a haiku about Python programming.',
    stream: true,
});

// Accumulate outputs by index
const outputs = new Map();
let usage = null;

for await (const chunk of stream) {
    if (chunk.event_type === 'content.start') {
        outputs.set(chunk.index, { type: chunk.content.type });

    } else if (chunk.event_type === 'content.delta') {
        const output = outputs.get(chunk.index);
        if (chunk.delta.type === 'text') {
            output.text = (output.text || '') + chunk.delta.text;
            process.stdout.write(chunk.delta.text);
        } else if (chunk.delta.type === 'thought_signature') {
            output.signature = chunk.delta.signature;
        } else if (chunk.delta.type === 'thought_summary') {
            output.summary = (output.summary || '') + (chunk.delta.content?.text || '');
        }

    } else if (chunk.event_type === 'interaction.complete') {
        usage = chunk.interaction.usage;
    }
}

// Final outputs list (sorted by index)
const finalOutputs = [...outputs.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([_, output]) => output);
console.log(`\n\nOutputs:`, finalOutputs);

Configuration

Personnalisez le comportement du modèle avec generation_config.

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Tell me a story about a brave knight.",
    generation_config={
        "temperature": 0.7,
        "max_output_tokens": 500,
        "thinking_level": "low",
    }
)

print(interaction.outputs[-1].text)

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: 'Tell me a story about a brave knight.',
    generation_config: {
        temperature: 0.7,
        max_output_tokens: 500,
        thinking_level: 'low',
    }
});

console.log(interaction.outputs[interaction.outputs.length - 1].text);

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-flash-preview",
    "input": "Tell me a story about a brave knight.",
    "generation_config": {
        "temperature": 0.7,
        "max_output_tokens": 500,
        "thinking_level": "low"
    }
}'

Raisonnement

Les modèles Gemini 2.5 et versions ultérieures utilisent un processus de raisonnement interne appelé "réflexion" avant de générer une réponse. Cela aide le modèle à produire de meilleures réponses pour les tâches complexes telles que les mathématiques, le codage et le raisonnement en plusieurs étapes.

Niveau de réflexion

Le paramètre thinking_level vous permet de contrôler la profondeur du raisonnement du modèle :

Niveau	Description	Modèles compatibles
`minimal`	Correspond au paramètre "sans réflexion" pour la plupart des requêtes. Dans certains cas, les modèles peuvent penser de manière très minimale. Minimise la latence et les coûts.	Modèles Flash uniquement (par exemple, Gemini 3 Flash)
`low`	Raisonnement léger qui privilégie la latence et les économies de coûts pour le suivi d'instructions simples et le chat.	Tous les modèles de réflexion
`medium`	Une réflexion équilibrée pour la plupart des tâches.	Modèles Flash uniquement (par exemple, Gemini 3 Flash)
`high`	(Par défaut) : maximise la profondeur du raisonnement. Le modèle peut mettre beaucoup plus de temps à générer le premier jeton, mais le résultat sera plus réfléchi.	Tous les modèles de réflexion

Résumés de la réflexion

La réflexion du modèle est représentée sous la forme de blocs de réflexion (type: "thought") dans les résultats de réponse. Vous pouvez contrôler si vous souhaitez recevoir des résumés lisibles du processus de réflexion à l'aide du paramètre thinking_summaries :

Valeur	Description
`auto`	(Par défaut) : renvoie des résumés de réflexion lorsqu'ils sont disponibles.
`none`	Désactive les résumés de réflexion.

Python

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Solve this step by step: What is 15% of 240?",
    generation_config={
        "thinking_level": "high",
        "thinking_summaries": "auto"
    }
)

for output in interaction.outputs:
    if output.type == "thought":
        print(f"Thinking: {output.summary}")
    elif output.type == "text":
        print(f"Answer: {output.text}")

JavaScript

import { GoogleGenAI } from '@google/genai';

const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: 'Solve this step by step: What is 15% of 240?',
    generation_config: {
        thinking_level: 'high',
        thinking_summaries: 'auto'
    }
});

for (const output of interaction.outputs) {
    if (output.type === 'thought') {
        console.log(`Thinking: ${output.summary}`);
    } else if (output.type === 'text') {
        console.log(`Answer: ${output.text}`);
    }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-flash-preview",
    "input": "Solve this step by step: What is 15% of 240?",
    "generation_config": {
        "thinking_level": "high",
        "thinking_summaries": "auto"
    }
}'

Chaque bloc de réflexion contient un champ signature (un hachage cryptographique de l'état de raisonnement interne) et un champ summary facultatif (un résumé lisible par l'humain du raisonnement du modèle). Le signature est toujours présent, mais un bloc de réflexion peut ne contenir qu'une signature sans résumé dans les cas suivants :

Requêtes simples : le modèle n'a pas suffisamment raisonné pour générer un résumé.
thinking_summaries: "none" : les résumés sont explicitement désactivés

Votre code doit toujours gérer les blocs de réflexion où summary est vide ou absent. Lorsque vous gérez manuellement l'historique des conversations (mode sans état), vous devez inclure des blocs de réflexion avec leurs signatures dans les requêtes suivantes pour valider l'authenticité.

Utiliser des fichiers

Utiliser des fichiers distants

Accédez aux fichiers à l'aide d'URL distantes directement dans l'appel d'API.

Python

from google import genai
client = genai.Client()

interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input=[
        {
            "type": "image",
            "uri": "https://github.com/<github-path>/cats-and-dogs.jpg",
        },
        {"type": "text", "text": "Describe what you see."}
    ],
)
for output in interaction.outputs:
    if output.type == "text":
        print(output.text)

JavaScript

import { GoogleGenAI } from '@google/genai';
const client = new GoogleGenAI({});

const interaction = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: [
        {
            type: 'image',
            uri: 'https://github.com/<github-path>/cats-and-dogs.jpg',
        },
        { type: 'text', text: 'Describe what you see.' }
    ],
});
for (const output of interaction.outputs) {
    if (output.type === 'text') {
        console.log(output.text);
    }
}

REST

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-flash-preview",
    "input": [
        {
            "type": "image",
            "uri": "https://github.com/<github-path>/cats-and-dogs.jpg"
        },
        {"type": "text", "text": "Describe what you see."}
    ]
}'

Utiliser l'API Gemini Files

Importez des fichiers dans l'API Files de Gemini avant de les utiliser.

Python

from google import genai
import time
import requests
client = genai.Client()

# 1. Download the file
url = "https://github.com/philschmid/gemini-samples/raw/refs/heads/main/assets/cats-and-dogs.jpg"
response = requests.get(url)
with open("cats-and-dogs.jpg", "wb") as f:
    f.write(response.content)

# 2. Upload to Gemini Files API
file = client.files.upload(file="cats-and-dogs.jpg")

# 3. Wait for processing
while client.files.get(name=file.name).state != "ACTIVE":
    time.sleep(2)

# 4. Use in Interaction
interaction = client.interactions.create(
    model="gemini-3-flash-preview",
    input=[
        {
            "type": "image",
            "uri": file.uri,
        },
        {"type": "text", "text": "Describe what you see."}
    ],
)
for output in interaction.outputs:
    if output.type == "text":
        print(output.text)

JavaScript

import { GoogleGenAI } from '@google/genai';
import * as fs from 'fs';
import fetch from 'node-fetch';
const client = new GoogleGenAI({});

// 1. Download the file
const url = 'https://github.com/philschmid/gemini-samples/raw/refs/heads/main/assets/cats-and-dogs.jpg';
const filename = 'cats-and-dogs.jpg';
const response = await fetch(url);
const buffer = await response.buffer();
fs.writeFileSync(filename, buffer);

// 2. Upload to Gemini Files API
const myfile = await client.files.upload({ file: filename, config: { mimeType: 'image/jpeg' } });

// 3. Wait for processing
while ((await client.files.get({ name: myfile.name })).state !== 'ACTIVE') {
    await new Promise(resolve => setTimeout(resolve, 2000));
}

// 4. Use in Interaction
const interaction = await client.interactions.create({
    model: 'gemini-3-flash-preview',
    input: [
        { type: 'image', uri: myfile.uri, },
        { type: 'text', text: 'Describe what you see.' }
    ],
});
for (const output of interaction.outputs) {
    if (output.type === 'text') {
        console.log(output.text);
    }
}

REST

# 1. Upload the file (Requires File API setup)
# See https://ai.google.dev/gemini-api/docs/files for details.
# Assume FILE_URI is obtained from the upload step.

curl -X POST "https://generativelanguage.googleapis.com/v1beta/interactions" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: $GEMINI_API_KEY" \
-d '{
    "model": "gemini-3-flash-preview",
    "input": [
        {"type": "image", "uri": "FILE_URI"},
        {"type": "text", "text": "Describe what you see."}
    ]
}'

Modèle de données

Pour en savoir plus sur le modèle de données, consultez la documentation de référence de l'API. Vous trouverez ci-dessous une vue d'ensemble des principaux composants.

Interaction

Propriété	Type	Description
`id`	`string`	Identifiant unique de l'interaction.
`model`/`agent`	`string`	Modèle ou agent utilisé. Vous ne pouvez en fournir qu'un seul.
`input`	`Content[]`	Les entrées fournies.
`outputs`	`Content[]`	Les réponses du modèle.
`tools`	`Tool[]`	les outils utilisés.
`previous_interaction_id`	`string`	ID de l'interaction précédente pour le contexte.
`stream`	`boolean`	Indique si l'interaction est en streaming.
`status`	`string`	État : `completed`, `in_progress`, `requires_action`,`failed`, etc.
`background`	`boolean`	Indique si l'interaction est en mode arrière-plan.
`store`	`boolean`	Indique si l'interaction doit être stockée. Valeur par défaut : `true` Définissez sur `false` pour désactiver.
`usage`	Utilisation	Utilisation de jetons pour la demande d'interaction.

Modèles et agents compatibles

Nom du modèle	Type	ID du modèle
Gemini 2.5 Pro	Modèle	`gemini-2.5-pro`
Gemini 2.0 Flash	Modèle	`gemini-2.5-flash`
Gemini 2.5 Flash-Lite	Modèle	`gemini-2.5-flash-lite`
Aperçu de Gemini 3 Pro	Modèle	`gemini-3-pro-preview`
Aperçu de Gemini 3 Flash	Modèle	`gemini-3-flash-preview`
Aperçu de Deep Research	Agent	`deep-research-pro-preview-12-2025`

Fonctionnement de l'API Interactions

L'API Interactions est conçue autour d'une ressource centrale : Interaction. Un Interaction représente un tour complet dans une conversation ou une tâche. Il sert d'enregistrement de session et contient l'historique complet d'une interaction, y compris toutes les entrées utilisateur, les réflexions du modèle, les appels d'outils, les résultats d'outils et les sorties finales du modèle.

Lorsque vous appelez interactions.create, vous créez une ressource Interaction.

Gestion de l'état côté serveur

Vous pouvez utiliser le id d'une interaction terminée dans un appel ultérieur à l'aide du paramètre previous_interaction_id pour poursuivre la conversation. Le serveur utilise cet ID pour récupérer l'historique des conversations, ce qui vous évite d'avoir à renvoyer l'intégralité de l'historique des discussions.

Seul l'historique des conversations (entrées et sorties) est conservé à l'aide de previous_interaction_id. Les autres paramètres sont de portée interaction et ne s'appliquent qu'à l'interaction spécifique que vous générez actuellement :

tools
system_instruction
generation_config (y compris thinking_level, temperature, etc.)

Cela signifie que vous devez spécifier à nouveau ces paramètres dans chaque nouvelle interaction si vous souhaitez qu'ils s'appliquent. Cette gestion de l'état côté serveur est facultative. Vous pouvez également fonctionner en mode sans état en envoyant l'intégralité de l'historique des conversations dans chaque requête.

Stockage et conservation des données

Par défaut, tous les objets Interaction sont stockés (store=true) afin de simplifier l'utilisation des fonctionnalités de gestion de l'état côté serveur (avec previous_interaction_id), l'exécution en arrière-plan (à l'aide de background=true) et à des fins d'observabilité.

Niveau payant : les interactions sont conservées pendant 55 jours.
Niveau sans frais : les interactions sont conservées pendant un jour.

Si vous ne le souhaitez pas, vous pouvez définir store=false dans votre demande. Cette commande est distincte de la gestion de l'état. Vous pouvez désactiver le stockage pour n'importe quelle interaction. Notez toutefois que store=false n'est pas compatible avec background=true et empêche l'utilisation de previous_interaction_id pour les tours suivants.

Vous pouvez supprimer les interactions stockées à tout moment à l'aide de la méthode de suppression disponible dans la documentation de référence de l'API. Vous ne pouvez supprimer des interactions que si vous connaissez leur ID.

Une fois la période de conservation expirée, vos données seront automatiquement supprimées.

Les objets d'interaction sont traités conformément aux Conditions d'utilisation.

Bonnes pratiques

Taux de succès du cache : l'utilisation de previous_interaction_id pour poursuivre les conversations permet au système d'utiliser plus facilement la mise en cache implicite pour l'historique des conversations, ce qui améliore les performances et réduit les coûts.
Combiner les interactions : vous pouvez combiner les interactions avec l'agent et le modèle dans une même conversation. Par exemple, vous pouvez utiliser un agent spécialisé, comme l'agent Deep Research, pour la collecte initiale de données, puis utiliser un modèle Gemini standard pour les tâches de suivi telles que la synthèse ou le reformatage, en associant ces étapes à l'previous_interaction_id.

SDK

Vous pouvez utiliser la dernière version des SDK Google GenAI pour accéder à l'API Interactions.

Sur Python, il s'agit du package google-genai à partir de la version 1.55.0.
Sur JavaScript, il s'agit du package @google/genai à partir de la version 1.33.0.

Pour savoir comment installer les SDK, consultez la page Bibliothèques.

Limites

État bêta : l'API Interactions est en version bêta/preview. Les fonctionnalités et les schémas peuvent changer.
Fonctionnalités non compatibles : Les fonctionnalités suivantes ne sont pas encore disponibles, mais le seront bientôt :
- Ancrage avec Google Maps
Ordre de sortie : l'ordre du contenu des outils intégrés (google_search et url_context) peut parfois être incorrect, avec du texte qui apparaît avant l'exécution et le résultat de l'outil. Il s'agit d'un problème connu et un correctif est en cours.
Combinaisons d'outils : la combinaison de MCP, de l'appel de fonction et des outils intégrés n'est pas encore disponible, mais le sera bientôt.
MCP distant : Gemini 3 n'est pas compatible avec le MCP distant. Cette fonctionnalité sera bientôt disponible.

Modifications importantes

L'API Interactions est actuellement en version bêta préliminaire. Nous développons et affinons activement les fonctionnalités de l'API, les schémas de ressources et les interfaces du SDK en fonction de l'utilisation réelle et des commentaires des développeurs.

Par conséquent, des modifications destructives peuvent se produire. Les mises à jour peuvent inclure des modifications concernant les éléments suivants :

Schémas pour les entrées et les sorties.
Signatures des méthodes du SDK et structures d'objet.
Comportements spécifiques des fonctionnalités.

Pour les charges de travail de production, vous devez continuer à utiliser l'API generateContent standard. Il reste la méthode recommandée pour les déploiements stables et continuera d'être activement développé et géré.

Commentaires

Vos commentaires sont essentiels au développement de l'API Interactions. N'hésitez pas à partager vos commentaires, à signaler des bugs ou à demander des fonctionnalités sur notre forum de la communauté des développeurs Google IA.

Étape suivante

Essayez le notebook de démarrage rapide de l'API Interactions.
En savoir plus sur l'agent Gemini Deep Research