Generating content

The Gemini API supports content generation with images, audio, code, tools, and more. For details on each of these features, read on and check out the task-focused sample code, or read the comprehensive guides.

Method: models.generateContent

Generates a model response given an input GenerateContentRequest. Refer to the text generation guide for detailed usage information. Input capabilities differ between models, including tuned models. Refer to the model guide and tuning guide for details.

Endpoint

POST https://generativelanguage.googleapis.com/v1beta/{model=models/*}:generateContent

Path parameters

model string

Required. The name of the Model to use for generating the completion.

Format: models/{model}.
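As a rough illustration of the path parameter, the sketch below builds the endpoint URL from a model name. The helper name generate_content_url is hypothetical and not part of any SDK; it only assumes the endpoint pattern and the models/{model} format described above.

```python
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def generate_content_url(model: str) -> str:
    """Build the generateContent endpoint for a model (hypothetical helper)."""
    # The path parameter must take the form models/{model};
    # add the prefix when only a bare model ID was given.
    name = model if model.startswith("models/") else f"models/{model}"
    return f"{BASE_URL}/{name}:generateContent"


print(generate_content_url("gemini-2.0-flash"))
print(generate_content_url("models/gemini-2.0-flash"))
```

Both calls print the same URL, so callers can pass either form of the model name.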

Request body

The request body contains data with the following structure:

Fields

contents[] object (Content)

Required. The content of the current conversation with the model.

For single-turn queries, this is a single instance. For multi-turn queries such as chat, this is a repeated field that contains the conversation history and the latest request.
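To make the single-turn and multi-turn shapes concrete, here is a minimal sketch that builds both forms of contents as plain dictionaries. The helpers user_turn and model_turn are hypothetical names used only for illustration; the role and parts keys follow the REST examples on this page.

```python
import json


def user_turn(text: str) -> dict:
    """One Content entry authored by the user (hypothetical helper)."""
    return {"role": "user", "parts": [{"text": text}]}


def model_turn(text: str) -> dict:
    """One Content entry authored by the model (hypothetical helper)."""
    return {"role": "model", "parts": [{"text": text}]}


# Single-turn query: contents holds exactly one Content instance.
single_turn = {"contents": [user_turn("Write a story about a magic backpack.")]}

# Multi-turn query: the earlier turns come first, the latest request comes last.
multi_turn = {
    "contents": [
        user_turn("Hello"),
        model_turn("Great to meet you. What would you like to know?"),
        user_turn("I have two dogs in my house. How many paws are in my house?"),
    ]
}

print(json.dumps(multi_turn, indent=2))
```

Either dictionary can be serialized with json.dumps and sent as the request body.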

tools[] object (Tool)

Optional. A list of Tools the Model may use to generate the next response.

A Tool is a piece of code that enables the system to interact with external systems to perform an action, or set of actions, outside of the knowledge and scope of the Model. Supported Tools are Function and codeExecution. Refer to the Function calling and Code execution guides to learn more.

toolConfig object (ToolConfig)

Optional. Tool configuration for any Tool specified in the request. Refer to the Function calling guide for a usage example.
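The following sketch shows one way a request body with tools and toolConfig might be laid out as a plain dictionary. The function name addNumbers and the ANY mode mirror the Go function calling sample on this page; the parameter names a and b are assumptions made for illustration.

```python
import json

# A function declaration the model may choose to call.
# The parameter names "a" and "b" are illustrative assumptions.
add_numbers = {
    "name": "addNumbers",
    "description": "Return the result of adding two numbers.",
    "parameters": {
        "type": "OBJECT",
        "properties": {
            "a": {"type": "NUMBER"},
            "b": {"type": "NUMBER"},
        },
        "required": ["a", "b"],
    },
}

request_body = {
    "contents": [
        {"role": "user", "parts": [{"text": "What is 57 plus 44?"}]}
    ],
    # tools: the list of Tools the model may use for the next response.
    "tools": [{"functionDeclarations": [add_numbers]}],
    # toolConfig: settings that apply to the tools declared above.
    "toolConfig": {"functionCallingConfig": {"mode": "ANY"}},
}

print(json.dumps(request_body, indent=2))
```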

safetySettings[] object (SafetySetting)

Optional. A list of unique SafetySetting instances for blocking unsafe content.

This will be enforced on GenerateContentRequest.contents and GenerateContentResponse.candidates. There should not be more than one setting for each SafetyCategory type. The API will block any contents and responses that fail to meet the thresholds set by these settings. This list overrides the default settings for each SafetyCategory specified in safetySettings. If there is no SafetySetting for a given SafetyCategory provided in the list, the API will use the default safety setting for that category. Harm categories HARM_CATEGORY_HATE_SPEECH, HARM_CATEGORY_SEXUALLY_EXPLICIT, HARM_CATEGORY_DANGEROUS_CONTENT, HARM_CATEGORY_HARASSMENT, and HARM_CATEGORY_CIVIC_INTEGRITY are supported. Refer to the guide for detailed information on available safety settings. Also refer to the Safety guidance to learn how to incorporate safety considerations in your AI applications.
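A minimal sketch of the one-setting-per-category rule, checked on the client before a request is sent. The helper validate_safety_settings is hypothetical; the threshold values BLOCK_ONLY_HIGH and BLOCK_MEDIUM_AND_ABOVE come from the safety settings guide rather than from this page.

```python
SUPPORTED_CATEGORIES = {
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_CIVIC_INTEGRITY",
}


def validate_safety_settings(settings: list) -> list:
    """Reject unsupported or duplicated categories (hypothetical helper)."""
    seen = set()
    for setting in settings:
        category = setting["category"]
        if category not in SUPPORTED_CATEGORIES:
            raise ValueError(f"Unsupported harm category: {category}")
        if category in seen:
            raise ValueError(f"More than one setting for {category}")
        seen.add(category)
    return settings


# Categories that are not listed keep the API's default setting.
safety_settings = validate_safety_settings([
    {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_ONLY_HIGH"},
    {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
])
print(safety_settings)
```

The validated list can then be placed under the safetySettings key of the request body.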

systemInstruction object (Content)

Optional. Developer-set system instruction(s). Currently, text only.

generationConfig object (GenerationConfig)

Optional. Configuration options for model generation and outputs.

cachedContent string

Optional. The name of the cached content to use as context to serve the prediction. Format: cachedContents/{cachedContent}
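As a sketch of how the optional fields fit together, the helper below assembles a request body and leaves out any optional field that was not supplied. The name build_request is hypothetical; temperature and maxOutputTokens are GenerationConfig fields, and the values shown are arbitrary.

```python
import json
from typing import Any, Optional


def build_request(
    prompt: str,
    *,
    system_instruction: Optional[str] = None,
    generation_config: Optional[dict] = None,
    cached_content: Optional[str] = None,
) -> dict:
    """Assemble a generateContent request body (hypothetical helper)."""
    body: dict = {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}
    if system_instruction is not None:
        # systemInstruction is a Content object; currently text only.
        body["systemInstruction"] = {"parts": [{"text": system_instruction}]}
    if generation_config is not None:
        body["generationConfig"] = generation_config
    if cached_content is not None:
        # Expected format: cachedContents/{cachedContent}
        body["cachedContent"] = cached_content
    return body


request_body = build_request(
    "Write a story about a magic backpack.",
    system_instruction="You are a concise storyteller.",
    generation_config={"temperature": 0.7, "maxOutputTokens": 256},
)
print(json.dumps(request_body, indent=2))
```

Passing cached_content="cachedContents/example-id" (a made-up name) would add the cachedContent field in the same way.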

Example request

Text

Python

from google import genai

client = genai.Client()
response = client.models.generate_content(
    model="gemini-2.0-flash", contents="Write a story about a magic backpack."
)
print(response.text)

Node.js

// Make sure to include the following import:
// import {GoogleGenAI} from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: "gemini-2.0-flash",
  contents: "Write a story about a magic backpack.",
});
console.log(response.text);

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
	APIKey:  os.Getenv("GEMINI_API_KEY"),
	Backend: genai.BackendGeminiAPI,
})
if err != nil {
	log.Fatal(err)
}
contents := []*genai.Content{
	genai.NewContentFromText("Write a story about a magic backpack.", genai.RoleUser),
}
response, err := client.Models.GenerateContent(ctx, "gemini-2.0-flash", contents, nil)
if err != nil {
	log.Fatal(err)
}
printResponse(response)

Shell

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GEMINI_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [{
        "parts":[{"text": "Write a story about a magic backpack."}]
        }]
       }' 2> /dev/null

Java

Client client = new Client();

GenerateContentResponse response =
        client.models.generateContent(
                "gemini-2.0-flash",
                "Write a story about a magic backpack.",
                null);

System.out.println(response.text());

Image

Python

from google import genai
import PIL.Image

client = genai.Client()
organ = PIL.Image.open(media / "organ.jpg")
response = client.models.generate_content(
    model="gemini-2.0-flash", contents=["Tell me about this instrument", organ]
)
print(response.text)

Node.js

// Make sure to include the following import:
// import {GoogleGenAI} from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const organ = await ai.files.upload({
  file: path.join(media, "organ.jpg"),
});

const response = await ai.models.generateContent({
  model: "gemini-2.0-flash",
  contents: [
    createUserContent([
      "Tell me about this instrument", 
      createPartFromUri(organ.uri, organ.mimeType)
    ]),
  ],
});
console.log(response.text);

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
	APIKey:  os.Getenv("GEMINI_API_KEY"),
	Backend: genai.BackendGeminiAPI,
})
if err != nil {
	log.Fatal(err)
}

file, err := client.Files.UploadFromPath(
	ctx, 
	filepath.Join(getMedia(), "organ.jpg"), 
	&genai.UploadFileConfig{
		MIMEType : "image/jpeg",
	},
)
if err != nil {
	log.Fatal(err)
}
parts := []*genai.Part{
	genai.NewPartFromText("Tell me about this instrument"),
	genai.NewPartFromURI(file.URI, file.MIMEType),
}
contents := []*genai.Content{
	genai.NewContentFromParts(parts, genai.RoleUser),
}

response, err := client.Models.GenerateContent(ctx, "gemini-2.0-flash", contents, nil)
if err != nil {
	log.Fatal(err)
}
printResponse(response)

Shell

# Use a temporary file to hold the base64 encoded image data
TEMP_B64=$(mktemp)
# Use a temporary file to hold the JSON payload
TEMP_JSON=$(mktemp)
# A second `trap ... EXIT` would replace the first, so clean up both files in one trap.
trap 'rm -f "$TEMP_B64" "$TEMP_JSON"' EXIT

base64 $B64FLAGS "$IMG_PATH" > "$TEMP_B64"

cat > "$TEMP_JSON" << EOF
{
  "contents": [{
    "parts":[
      {"text": "Tell me about this instrument"},
      {
        "inline_data": {
          "mime_type":"image/jpeg",
          "data": "$(cat "$TEMP_B64")"
        }
      }
    ]
  }]
}
EOF

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GEMINI_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d "@$TEMP_JSON" 2> /dev/null

Java

Client client = new Client();

String path = media_path + "organ.jpg";
byte[] imageData = Files.readAllBytes(Paths.get(path));

Content content =
        Content.fromParts(
                Part.fromText("Tell me about this instrument."),
                Part.fromBytes(imageData, "image/jpeg"));

GenerateContentResponse response = client.models.generateContent("gemini-2.0-flash", content, null);

System.out.println(response.text());

Audio

Python

from google import genai

client = genai.Client()
sample_audio = client.files.upload(file=media / "sample.mp3")
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=["Give me a summary of this audio file.", sample_audio],
)
print(response.text)

Node.js

// Make sure to include the following import:
// import {GoogleGenAI} from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const audio = await ai.files.upload({
  file: path.join(media, "sample.mp3"),
});

const response = await ai.models.generateContent({
  model: "gemini-2.0-flash",
  contents: [
    createUserContent([
      "Give me a summary of this audio file.",
      createPartFromUri(audio.uri, audio.mimeType),
    ]),
  ],
});
console.log(response.text);

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
	APIKey:  os.Getenv("GEMINI_API_KEY"),
	Backend: genai.BackendGeminiAPI,
})
if err != nil {
	log.Fatal(err)
}

file, err := client.Files.UploadFromPath(
	ctx, 
	filepath.Join(getMedia(), "sample.mp3"), 
	&genai.UploadFileConfig{
		MIMEType : "audio/mpeg",
	},
)
if err != nil {
	log.Fatal(err)
}

parts := []*genai.Part{
	genai.NewPartFromText("Give me a summary of this audio file."),
	genai.NewPartFromURI(file.URI, file.MIMEType),
}

contents := []*genai.Content{
	genai.NewContentFromParts(parts, genai.RoleUser),
}

response, err := client.Models.GenerateContent(ctx, "gemini-2.0-flash", contents, nil)
if err != nil {
	log.Fatal(err)
}
printResponse(response)

Shell

# Use File API to upload audio data to API request.
MIME_TYPE=$(file -b --mime-type "${AUDIO_PATH}")
NUM_BYTES=$(wc -c < "${AUDIO_PATH}")
DISPLAY_NAME=AUDIO

tmp_header_file=upload-header.tmp

# Initial resumable request defining metadata.
# The upload url is in the response headers dump them to a file.
curl "${BASE_URL}/upload/v1beta/files?key=${GEMINI_API_KEY}" \
  -D upload-header.tmp \
  -H "X-Goog-Upload-Protocol: resumable" \
  -H "X-Goog-Upload-Command: start" \
  -H "X-Goog-Upload-Header-Content-Length: ${NUM_BYTES}" \
  -H "X-Goog-Upload-Header-Content-Type: ${MIME_TYPE}" \
  -H "Content-Type: application/json" \
  -d "{'file': {'display_name': '${DISPLAY_NAME}'}}" 2> /dev/null

upload_url=$(grep -i "x-goog-upload-url: " "${tmp_header_file}" | cut -d" " -f2 | tr -d "\r")
rm "${tmp_header_file}"

# Upload the actual bytes.
curl "${upload_url}" \
  -H "Content-Length: ${NUM_BYTES}" \
  -H "X-Goog-Upload-Offset: 0" \
  -H "X-Goog-Upload-Command: upload, finalize" \
  --data-binary "@${AUDIO_PATH}" 2> /dev/null > file_info.json

file_uri=$(jq ".file.uri" file_info.json)
echo file_uri=$file_uri

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GEMINI_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [{
        "parts":[
          {"text": "Please describe this file."},
          {"file_data":{"mime_type": "audio/mpeg", "file_uri": '$file_uri'}}]
        }]
       }' 2> /dev/null > response.json

cat response.json
echo

jq ".candidates[].content.parts[].text" response.json

Video

Python

from google import genai
import time

client = genai.Client()
# Video clip (CC BY 3.0) from https://peach.blender.org/download/
myfile = client.files.upload(file=media / "Big_Buck_Bunny.mp4")
print(f"{myfile=}")

# Poll until the video file is completely processed (state becomes ACTIVE).
while not myfile.state or myfile.state.name != "ACTIVE":
    print("Processing video...")
    print("File state:", myfile.state)
    time.sleep(5)
    myfile = client.files.get(name=myfile.name)

response = client.models.generate_content(
    model="gemini-2.0-flash", contents=[myfile, "Describe this video clip"]
)
print(f"{response.text=}")

Node.js

// Make sure to include the following import:
// import {GoogleGenAI} from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

let video = await ai.files.upload({
  file: path.join(media, 'Big_Buck_Bunny.mp4'),
});

// Poll until the video file is completely processed (state becomes ACTIVE).
while (!video.state || video.state.toString() !== 'ACTIVE') {
  console.log('Processing video...');
  console.log('File state: ', video.state);
  await sleep(5000);
  video = await ai.files.get({name: video.name});
}

const response = await ai.models.generateContent({
  model: "gemini-2.0-flash",
  contents: [
    createUserContent([
      "Describe this video clip",
      createPartFromUri(video.uri, video.mimeType),
    ]),
  ],
});
console.log(response.text);

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
	APIKey:  os.Getenv("GEMINI_API_KEY"),
	Backend: genai.BackendGeminiAPI,
})
if err != nil {
	log.Fatal(err)
}

file, err := client.Files.UploadFromPath(
	ctx, 
	filepath.Join(getMedia(), "Big_Buck_Bunny.mp4"), 
	&genai.UploadFileConfig{
		MIMEType : "video/mp4",
	},
)
if err != nil {
	log.Fatal(err)
}

// Poll until the video file is completely processed (state becomes ACTIVE).
for file.State == genai.FileStateUnspecified || file.State != genai.FileStateActive {
	fmt.Println("Processing video...")
	fmt.Println("File state:", file.State)
	time.Sleep(5 * time.Second)

	file, err = client.Files.Get(ctx, file.Name, nil)
	if err != nil {
		log.Fatal(err)
	}
}

parts := []*genai.Part{
	genai.NewPartFromText("Describe this video clip"),
	genai.NewPartFromURI(file.URI, file.MIMEType),
}

contents := []*genai.Content{
	genai.NewContentFromParts(parts, genai.RoleUser),
}

response, err := client.Models.GenerateContent(ctx, "gemini-2.0-flash", contents, nil)
if err != nil {
	log.Fatal(err)
}
printResponse(response)

Shell

# Use File API to upload video data to API request.
MIME_TYPE=$(file -b --mime-type "${VIDEO_PATH}")
NUM_BYTES=$(wc -c < "${VIDEO_PATH}")
DISPLAY_NAME=VIDEO

tmp_header_file=upload-header.tmp

# Initial resumable request defining metadata.
# The upload url is in the response headers dump them to a file.
curl "${BASE_URL}/upload/v1beta/files?key=${GEMINI_API_KEY}" \
  -D "${tmp_header_file}" \
  -H "X-Goog-Upload-Protocol: resumable" \
  -H "X-Goog-Upload-Command: start" \
  -H "X-Goog-Upload-Header-Content-Length: ${NUM_BYTES}" \
  -H "X-Goog-Upload-Header-Content-Type: ${MIME_TYPE}" \
  -H "Content-Type: application/json" \
  -d "{'file': {'display_name': '${DISPLAY_NAME}'}}" 2> /dev/null

upload_url=$(grep -i "x-goog-upload-url: " "${tmp_header_file}" | cut -d" " -f2 | tr -d "\r")
rm "${tmp_header_file}"

# Upload the actual bytes.
curl "${upload_url}" \
  -H "Content-Length: ${NUM_BYTES}" \
  -H "X-Goog-Upload-Offset: 0" \
  -H "X-Goog-Upload-Command: upload, finalize" \
  --data-binary "@${VIDEO_PATH}" 2> /dev/null > file_info.json

file_uri=$(jq ".file.uri" file_info.json)
echo file_uri=$file_uri

state=$(jq ".file.state" file_info.json)
echo state=$state

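# -r strips the JSON quotes so the name can be used directly in a URL.
name=$(jq -r ".file.name" file_info.json)
echo name=$name

while [[ "($state)" = *"PROCESSING"* ]];
do
  echo "Processing video..."
  sleep 5
  # Get the file of interest to check state. $name already has the form files/<id>,
  # and files.get returns the File resource at the top level of the response.
  curl "https://generativelanguage.googleapis.com/v1beta/${name}?key=${GEMINI_API_KEY}" 2> /dev/null > file_info.json
  state=$(jq ".state" file_info.json)
done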

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GEMINI_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [{
        "parts":[
          {"text": "Transcribe the audio from this video, giving timestamps for salient events in the video. Also provide visual descriptions."},
          {"file_data":{"mime_type": "video/mp4", "file_uri": '$file_uri'}}]
        }]
       }' 2> /dev/null > response.json

cat response.json
echo

jq ".candidates[].content.parts[].text" response.json

PDF

Python

from google import genai

client = genai.Client()
sample_pdf = client.files.upload(file=media / "test.pdf")
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=["Give me a summary of this document:", sample_pdf],
)
print(f"{response.text=}")

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
	APIKey:  os.Getenv("GEMINI_API_KEY"),
	Backend: genai.BackendGeminiAPI,
})
if err != nil {
	log.Fatal(err)
}

file, err := client.Files.UploadFromPath(
	ctx, 
	filepath.Join(getMedia(), "test.pdf"), 
	&genai.UploadFileConfig{
		MIMEType : "application/pdf",
	},
)
if err != nil {
	log.Fatal(err)
}

parts := []*genai.Part{
	genai.NewPartFromText("Give me a summary of this document:"),
	genai.NewPartFromURI(file.URI, file.MIMEType),
}

contents := []*genai.Content{
	genai.NewContentFromParts(parts, genai.RoleUser),
}

response, err := client.Models.GenerateContent(ctx, "gemini-2.0-flash", contents, nil)
if err != nil {
	log.Fatal(err)
}
printResponse(response)

Shell

MIME_TYPE=$(file -b --mime-type "${PDF_PATH}")
NUM_BYTES=$(wc -c < "${PDF_PATH}")
DISPLAY_NAME=TEXT


echo $MIME_TYPE
tmp_header_file=upload-header.tmp

# Initial resumable request defining metadata.
# The upload url is in the response headers dump them to a file.
curl "${BASE_URL}/upload/v1beta/files?key=${GEMINI_API_KEY}" \
  -D upload-header.tmp \
  -H "X-Goog-Upload-Protocol: resumable" \
  -H "X-Goog-Upload-Command: start" \
  -H "X-Goog-Upload-Header-Content-Length: ${NUM_BYTES}" \
  -H "X-Goog-Upload-Header-Content-Type: ${MIME_TYPE}" \
  -H "Content-Type: application/json" \
  -d "{'file': {'display_name': '${DISPLAY_NAME}'}}" 2> /dev/null

upload_url=$(grep -i "x-goog-upload-url: " "${tmp_header_file}" | cut -d" " -f2 | tr -d "\r")
rm "${tmp_header_file}"

# Upload the actual bytes.
curl "${upload_url}" \
  -H "Content-Length: ${NUM_BYTES}" \
  -H "X-Goog-Upload-Offset: 0" \
  -H "X-Goog-Upload-Command: upload, finalize" \
  --data-binary "@${PDF_PATH}" 2> /dev/null > file_info.json

file_uri=$(jq ".file.uri" file_info.json)
echo file_uri=$file_uri

# Now generate content using that file
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GEMINI_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [{
        "parts":[
          {"text": "Can you add a few more lines to this poem?"},
          {"file_data":{"mime_type": "application/pdf", "file_uri": '$file_uri'}}]
        }]
       }' 2> /dev/null > response.json

cat response.json
echo

jq ".candidates[].content.parts[].text" response.json

Chat

Python

from google import genai
from google.genai import types

client = genai.Client()
# Pass initial history using the "history" argument
chat = client.chats.create(
    model="gemini-2.0-flash",
    history=[
        types.Content(role="user", parts=[types.Part(text="Hello")]),
        types.Content(
            role="model",
            parts=[
                types.Part(
                    text="Great to meet you. What would you like to know?"
                )
            ],
        ),
    ],
)
response = chat.send_message(message="I have 2 dogs in my house.")
print(response.text)
response = chat.send_message(message="How many paws are in my house?")
print(response.text)

Node.js

// Make sure to include the following import:
// import {GoogleGenAI} from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const chat = ai.chats.create({
  model: "gemini-2.0-flash",
  history: [
    {
      role: "user",
      parts: [{ text: "Hello" }],
    },
    {
      role: "model",
      parts: [{ text: "Great to meet you. What would you like to know?" }],
    },
  ],
});

const response1 = await chat.sendMessage({
  message: "I have 2 dogs in my house.",
});
console.log("Chat response 1:", response1.text);

const response2 = await chat.sendMessage({
  message: "How many paws are in my house?",
});
console.log("Chat response 2:", response2.text);

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
	APIKey:  os.Getenv("GEMINI_API_KEY"),
	Backend: genai.BackendGeminiAPI,
})
if err != nil {
	log.Fatal(err)
}

// Pass initial history using the History field.
history := []*genai.Content{
	genai.NewContentFromText("Hello", genai.RoleUser),
	genai.NewContentFromText("Great to meet you. What would you like to know?", genai.RoleModel),
}

chat, err := client.Chats.Create(ctx, "gemini-2.0-flash", nil, history)
if err != nil {
	log.Fatal(err)
}

firstResp, err := chat.SendMessage(ctx, genai.Part{Text: "I have 2 dogs in my house."})
if err != nil {
	log.Fatal(err)
}
fmt.Println(firstResp.Text())

secondResp, err := chat.SendMessage(ctx, genai.Part{Text: "How many paws are in my house?"})
if err != nil {
	log.Fatal(err)
}
fmt.Println(secondResp.Text())

Shell

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GEMINI_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [
        {"role":"user",
         "parts":[{
           "text": "Hello"}]},
        {"role": "model",
         "parts":[{
           "text": "Great to meet you. What would you like to know?"}]},
        {"role":"user",
         "parts":[{
           "text": "I have two dogs in my house. How many paws are in my house?"}]}
      ]
    }' 2> /dev/null | grep "text"

Java

Client client = new Client();

Content userContent = Content.fromParts(Part.fromText("Hello"));
Content modelContent =
        Content.builder()
                .role("model")
                .parts(
                        Collections.singletonList(
                                Part.fromText("Great to meet you. What would you like to know?")
                        )
                ).build();

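// Note: the second systemInstruction(...) call replaces the first, so only
// modelContent is kept, and it is applied as a system instruction rather than
// as conversation history.
Chat chat = client.chats.create(
        "gemini-2.0-flash",
        GenerateContentConfig.builder()
                .systemInstruction(userContent)
                .systemInstruction(modelContent)
                .build()
);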

GenerateContentResponse response1 = chat.sendMessage("I have 2 dogs in my house.");
System.out.println(response1.text());

GenerateContentResponse response2 = chat.sendMessage("How many paws are in my house?");
System.out.println(response2.text());

Cache

Python

from google import genai
from google.genai import types

client = genai.Client()
document = client.files.upload(file=media / "a11.txt")
model_name = "gemini-1.5-flash-001"

cache = client.caches.create(
    model=model_name,
    config=types.CreateCachedContentConfig(
        contents=[document],
        system_instruction="You are an expert analyzing transcripts.",
    ),
)
print(cache)

response = client.models.generate_content(
    model=model_name,
    contents="Please summarize this transcript",
    config=types.GenerateContentConfig(cached_content=cache.name),
)
print(response.text)

Node.js

// Make sure to include the following import:
// import {GoogleGenAI} from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const filePath = path.join(media, "a11.txt");
const document = await ai.files.upload({
  file: filePath,
  config: { mimeType: "text/plain" },
});
console.log("Uploaded file name:", document.name);
const modelName = "gemini-1.5-flash-001";

const contents = [
  createUserContent(createPartFromUri(document.uri, document.mimeType)),
];

const cache = await ai.caches.create({
  model: modelName,
  config: {
    contents: contents,
    systemInstruction: "You are an expert analyzing transcripts.",
  },
});
console.log("Cache created:", cache);

const response = await ai.models.generateContent({
  model: modelName,
  contents: "Please summarize this transcript",
  config: { cachedContent: cache.name },
});
console.log("Response text:", response.text);

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
	APIKey:  os.Getenv("GEMINI_API_KEY"), 
	Backend: genai.BackendGeminiAPI,
})
if err != nil {
	log.Fatal(err)
}

modelName := "gemini-1.5-flash-001"
document, err := client.Files.UploadFromPath(
	ctx, 
	filepath.Join(getMedia(), "a11.txt"), 
	&genai.UploadFileConfig{
		MIMEType : "text/plain",
	},
)
if err != nil {
	log.Fatal(err)
}
parts := []*genai.Part{
	genai.NewPartFromURI(document.URI, document.MIMEType),
}
contents := []*genai.Content{
	genai.NewContentFromParts(parts, genai.RoleUser),
}
cache, err := client.Caches.Create(ctx, modelName, &genai.CreateCachedContentConfig{
	Contents: contents,
	SystemInstruction: genai.NewContentFromText(
		"You are an expert analyzing transcripts.", genai.RoleUser,
	),
})
if err != nil {
	log.Fatal(err)
}
fmt.Println("Cache created:")
fmt.Println(cache)

// Use the cache for generating content.
response, err := client.Models.GenerateContent(
	ctx,
	modelName,
	genai.Text("Please summarize this transcript"),
	&genai.GenerateContentConfig{
		CachedContent: cache.Name,
	},
)
if err != nil {
	log.Fatal(err)
}
printResponse(response)

Tuned Model

Python

# With Gemini 2 we're launching a new SDK. See the following doc for details.
# https://ai.google.dev/gemini-api/docs/migrate

JSON Mode

Python

from google import genai
from google.genai import types
from typing_extensions import TypedDict

class Recipe(TypedDict):
    recipe_name: str
    ingredients: list[str]

client = genai.Client()
result = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="List a few popular cookie recipes.",
    config=types.GenerateContentConfig(
        response_mime_type="application/json", response_schema=list[Recipe]
    ),
)
print(result)

Node.js

// Make sure to include the following import:
// import {GoogleGenAI} from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const response = await ai.models.generateContent({
  model: "gemini-2.0-flash",
  contents: "List a few popular cookie recipes.",
  config: {
    responseMimeType: "application/json",
    responseSchema: {
      type: "array",
      items: {
        type: "object",
        properties: {
          recipeName: { type: "string" },
          ingredients: { type: "array", items: { type: "string" } },
        },
        required: ["recipeName", "ingredients"],
      },
    },
  },
});
console.log(response.text);

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
	APIKey:  os.Getenv("GEMINI_API_KEY"), 
	Backend: genai.BackendGeminiAPI,
})
if err != nil {
	log.Fatal(err)
}

schema := &genai.Schema{
	Type: genai.TypeArray,
	Items: &genai.Schema{
		Type: genai.TypeObject,
		Properties: map[string]*genai.Schema{
			"recipe_name": {Type: genai.TypeString},
			"ingredients": {
				Type:  genai.TypeArray,
				Items: &genai.Schema{Type: genai.TypeString},
			},
		},
		Required: []string{"recipe_name"},
	},
}

config := &genai.GenerateContentConfig{
	ResponseMIMEType: "application/json",
	ResponseSchema:   schema,
}

response, err := client.Models.GenerateContent(
	ctx,
	"gemini-2.0-flash",
	genai.Text("List a few popular cookie recipes."),
	config,
)
if err != nil {
	log.Fatal(err)
}
printResponse(response)

Shell

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GEMINI_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
    "contents": [{
      "parts":[
        {"text": "List 5 popular cookie recipes"}
        ]
    }],
    "generationConfig": {
        "response_mime_type": "application/json",
        "response_schema": {
          "type": "ARRAY",
          "items": {
            "type": "OBJECT",
            "properties": {
              "recipe_name": {"type":"STRING"}
            }
          }
        }
    }
}' 2> /dev/null | head

Java

Client client = new Client();

Schema recipeSchema = Schema.builder()
        .type(Array.class.getSimpleName())
        .items(Schema.builder()
                .type(Object.class.getSimpleName())
                .properties(
                        Map.of("recipe_name", Schema.builder()
                                        .type(String.class.getSimpleName())
                                        .build(),
                                "ingredients", Schema.builder()
                                        .type(Array.class.getSimpleName())
                                        .items(Schema.builder()
                                                .type(String.class.getSimpleName())
                                                .build())
                                        .build())
                )
                .required(List.of("recipe_name", "ingredients"))
                .build())
        .build();

GenerateContentConfig config =
        GenerateContentConfig.builder()
                .responseMimeType("application/json")
                .responseSchema(recipeSchema)
                .build();

GenerateContentResponse response =
        client.models.generateContent(
                "gemini-2.0-flash",
                "List a few popular cookie recipes.",
                config);

System.out.println(response.text());

Code execution

Python

from google import genai
from google.genai import types

client = genai.Client()
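response = client.models.generate_content(
    model="gemini-2.0-pro-exp-02-05",
    contents=(
        "Write and execute code that calculates the sum of the first 50 prime numbers. "
        "Ensure that only the executable code and its resulting output are generated."
    ),
    # Enable the code execution tool so the model can run the code it writes.
    config=types.GenerateContentConfig(
        tools=[types.Tool(code_execution=types.ToolCodeExecution())]
    ),
)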
# Each part may contain text, executable code, or an execution result.
for part in response.candidates[0].content.parts:
    print(part, "\n")

print("-" * 80)
# The .text accessor concatenates the parts into a markdown-formatted text.
print("\n", response.text)

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
	APIKey:  os.Getenv("GEMINI_API_KEY"),
	Backend: genai.BackendGeminiAPI,
})
if err != nil {
	log.Fatal(err)
}

response, err := client.Models.GenerateContent(
	ctx,
	"gemini-2.0-pro-exp-02-05",
	genai.Text(
		`Write and execute code that calculates the sum of the first 50 prime numbers.
		 Ensure that only the executable code and its resulting output are generated.`,
	),
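	&genai.GenerateContentConfig{
		// Enable the code execution tool so the model can run the code it writes.
		Tools: []*genai.Tool{
			{CodeExecution: &genai.ToolCodeExecution{}},
		},
	},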
)
if err != nil {
	log.Fatal(err)
}

// Print the response.
printResponse(response)

fmt.Println("--------------------------------------------------------------------------------")
fmt.Println(response.Text())

Java

Client client = new Client();

String prompt = """
        Write and execute code that calculates the sum of the first 50 prime numbers.
        Ensure that only the executable code and its resulting output are generated.
        """;

GenerateContentResponse response =
        client.models.generateContent(
                "gemini-2.0-pro-exp-02-05",
                prompt,
                null);

for (Part part : response.candidates().get().getFirst().content().get().parts().get()) {
    System.out.println(part + "\n");
}

System.out.println("-".repeat(80));
System.out.println(response.text());

Function Calling

Python

from google import genai
from google.genai import types

client = genai.Client()

def add(a: float, b: float) -> float:
    """returns a + b."""
    return a + b

def subtract(a: float, b: float) -> float:
    """returns a - b."""
    return a - b

def multiply(a: float, b: float) -> float:
    """returns a * b."""
    return a * b

def divide(a: float, b: float) -> float:
    """returns a / b."""
    return a / b

# Create a chat session; function calling (via tools) is enabled in the config.
chat = client.chats.create(
    model="gemini-2.0-flash",
    config=types.GenerateContentConfig(tools=[add, subtract, multiply, divide]),
)
response = chat.send_message(
    message="I have 57 cats, each owns 44 mittens, how many mittens is that in total?"
)
print(response.text)

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
	APIKey:  os.Getenv("GEMINI_API_KEY"),
	Backend: genai.BackendGeminiAPI,
})
if err != nil {
	log.Fatal(err)
}
modelName := "gemini-2.0-flash"

// Create the function declarations for arithmetic operations.
addDeclaration := createArithmeticToolDeclaration("addNumbers", "Return the result of adding two numbers.")
subtractDeclaration := createArithmeticToolDeclaration("subtractNumbers", "Return the result of subtracting the second number from the first.")
multiplyDeclaration := createArithmeticToolDeclaration("multiplyNumbers", "Return the product of two numbers.")
divideDeclaration := createArithmeticToolDeclaration("divideNumbers", "Return the quotient of dividing the first number by the second.")

// Group the function declarations as a tool.
tools := []*genai.Tool{
	{
		FunctionDeclarations: []*genai.FunctionDeclaration{
			addDeclaration,
			subtractDeclaration,
			multiplyDeclaration,
			divideDeclaration,
		},
	},
}

// Create the content prompt.
contents := []*genai.Content{
	genai.NewContentFromText(
		"I have 57 cats, each owns 44 mittens, how many mittens is that in total?", genai.RoleUser,
	),
}

// Set up the generate content configuration with function calling enabled.
config := &genai.GenerateContentConfig{
	Tools: tools,
	ToolConfig: &genai.ToolConfig{
		FunctionCallingConfig: &genai.FunctionCallingConfig{
			// The mode equivalent to FunctionCallingConfigMode.ANY in JS.
			Mode: genai.FunctionCallingConfigModeAny,
		},
	},
}

genContentResp, err := client.Models.GenerateContent(ctx, modelName, contents, config)
if err != nil {
	log.Fatal(err)
}

// Assume the response includes a list of function calls.
if len(genContentResp.FunctionCalls()) == 0 {
	log.Println("No function call returned from the AI.")
	return nil
}
functionCall := genContentResp.FunctionCalls()[0]
log.Printf("Function call: %+v\n", functionCall)

// Marshal the Args map into JSON bytes.
argsMap, err := json.Marshal(functionCall.Args)
if err != nil {
	log.Fatal(err)
}

// Unmarshal the JSON bytes into the ArithmeticArgs struct.
var args ArithmeticArgs
if err := json.Unmarshal(argsMap, &args); err != nil {
	log.Fatal(err)
}

// Map the function name to the actual arithmetic function.
var result float64
switch functionCall.Name {
	case "addNumbers":
		result = add(args.FirstParam, args.SecondParam)
	case "subtractNumbers":
		result = subtract(args.FirstParam, args.SecondParam)
	case "multiplyNumbers":
		result = multiply(args.FirstParam, args.SecondParam)
	case "divideNumbers":
		result = divide(args.FirstParam, args.SecondParam)
	default:
		return fmt.Errorf("unimplemented function: %s", functionCall.Name)
}
log.Printf("Function result: %v\n", result)

// Prepare the final result message as content.
resultContents := []*genai.Content{
	genai.NewContentFromText("The final result is " + fmt.Sprintf("%v", result), genai.RoleUser),
}

// Use GenerateContent to send the final result.
finalResponse, err := client.Models.GenerateContent(ctx, modelName, resultContents, &genai.GenerateContentConfig{})
if err != nil {
	log.Fatal(err)
}

printResponse(finalResponse)

Node.js

  // Make sure to include the following import:
  // import {GoogleGenAI, FunctionCallingConfigMode} from '@google/genai';
  const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

  /**
   * The add function returns the sum of two numbers.
   * @param {number} a
   * @param {number} b
   * @returns {number}
   */
  function add(a, b) {
    return a + b;
  }

  /**
   * The subtract function returns the difference (a - b).
   * @param {number} a
   * @param {number} b
   * @returns {number}
   */
  function subtract(a, b) {
    return a - b;
  }

  /**
   * The multiply function returns the product of two numbers.
   * @param {number} a
   * @param {number} b
   * @returns {number}
   */
  function multiply(a, b) {
    return a * b;
  }

  /**
   * The divide function returns the quotient of a divided by b.
   * @param {number} a
   * @param {number} b
   * @returns {number}
   */
  function divide(a, b) {
    return a / b;
  }

  const addDeclaration = {
    name: "addNumbers",
    parameters: {
      type: "object",
      description: "Return the result of adding two numbers.",
      properties: {
        firstParam: {
          type: "number",
          description:
            "The first parameter which can be an integer or a floating point number.",
        },
        secondParam: {
          type: "number",
          description:
            "The second parameter which can be an integer or a floating point number.",
        },
      },
      required: ["firstParam", "secondParam"],
    },
  };

  const subtractDeclaration = {
    name: "subtractNumbers",
    parameters: {
      type: "object",
      description:
        "Return the result of subtracting the second number from the first.",
      properties: {
        firstParam: {
          type: "number",
          description: "The first parameter.",
        },
        secondParam: {
          type: "number",
          description: "The second parameter.",
        },
      },
      required: ["firstParam", "secondParam"],
    },
  };

  const multiplyDeclaration = {
    name: "multiplyNumbers",
    parameters: {
      type: "object",
      description: "Return the product of two numbers.",
      properties: {
        firstParam: {
          type: "number",
          description: "The first parameter.",
        },
        secondParam: {
          type: "number",
          description: "The second parameter.",
        },
      },
      required: ["firstParam", "secondParam"],
    },
  };

  const divideDeclaration = {
    name: "divideNumbers",
    parameters: {
      type: "object",
      description:
        "Return the quotient of dividing the first number by the second.",
      properties: {
        firstParam: {
          type: "number",
          description: "The first parameter.",
        },
        secondParam: {
          type: "number",
          description: "The second parameter.",
        },
      },
      required: ["firstParam", "secondParam"],
    },
  };

  // Step 1: Call generateContent with function calling enabled.
  const generateContentResponse = await ai.models.generateContent({
    model: "gemini-2.0-flash",
    contents:
      "I have 57 cats, each owns 44 mittens, how many mittens is that in total?",
    config: {
      toolConfig: {
        functionCallingConfig: {
          mode: FunctionCallingConfigMode.ANY,
        },
      },
      tools: [
        {
          functionDeclarations: [
            addDeclaration,
            subtractDeclaration,
            multiplyDeclaration,
            divideDeclaration,
          ],
        },
      ],
    },
  });

  // Step 2: Extract the function call.
  // Assuming the response contains a 'functionCalls' array.
  const functionCall =
    generateContentResponse.functionCalls &&
    generateContentResponse.functionCalls[0];
  console.log(functionCall);

  // Parse the arguments.
  const args = functionCall.args;
  // Expected args format: { firstParam: number, secondParam: number }

  // Step 3: Invoke the actual function based on the function name.
  const functionMapping = {
    addNumbers: add,
    subtractNumbers: subtract,
    multiplyNumbers: multiply,
    divideNumbers: divide,
  };
  const func = functionMapping[functionCall.name];
  if (!func) {
    console.error("Unimplemented error:", functionCall.name);
    return generateContentResponse;
  }
  const resultValue = func(args.firstParam, args.secondParam);
  console.log("Function result:", resultValue);

  // Step 4: Use the chat API to send the result as the final answer.
  const chat = ai.chats.create({ model: "gemini-2.0-flash" });
  const chatResponse = await chat.sendMessage({
    message: "The final result is " + resultValue,
  });
  console.log(chatResponse.text);
  return chatResponse;

Shell


cat > tools.json << EOF
{
  "function_declarations": [
    {
      "name": "enable_lights",
      "description": "Turn on the lighting system."
    },
    {
      "name": "set_light_color",
      "description": "Set the light color. Lights must be enabled for this to work.",
      "parameters": {
        "type": "object",
        "properties": {
          "rgb_hex": {
            "type": "string",
            "description": "The light color as a 6-digit hex string, e.g. ff0000 for red."
          }
        },
        "required": [
          "rgb_hex"
        ]
      }
    },
    {
      "name": "stop_lights",
      "description": "Turn off the lighting system."
    }
  ]
} 
EOF

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -d @<(echo '
  {
    "system_instruction": {
      "parts": {
        "text": "You are a helpful lighting system bot. You can turn lights on and off, and you can set the color. Do not perform any other tasks."
      }
    },
    "tools": ['$(cat tools.json)'],

    "tool_config": {
      "function_calling_config": {"mode": "auto"}
    },

    "contents": {
      "role": "user",
      "parts": {
        "text": "Turn on the lights please."
      }
    }
  }
') 2>/dev/null | sed -n '/"content"/,/"finishReason"/p'

Java

Client client = new Client();

FunctionDeclaration addFunction =
        FunctionDeclaration.builder()
                .name("addNumbers")
                .parameters(
                        Schema.builder()
                                .type("object")
                                .properties(Map.of(
                                        "firstParam", Schema.builder().type("number").description("First number").build(),
                                        "secondParam", Schema.builder().type("number").description("Second number").build()))
                                .required(Arrays.asList("firstParam", "secondParam"))
                                .build())
                .build();

FunctionDeclaration subtractFunction =
        FunctionDeclaration.builder()
                .name("subtractNumbers")
                .parameters(
                        Schema.builder()
                                .type("object")
                                .properties(Map.of(
                                        "firstParam", Schema.builder().type("number").description("First number").build(),
                                        "secondParam", Schema.builder().type("number").description("Second number").build()))
                                .required(Arrays.asList("firstParam", "secondParam"))
                                .build())
                .build();

FunctionDeclaration multiplyFunction =
        FunctionDeclaration.builder()
                .name("multiplyNumbers")
                .parameters(
                        Schema.builder()
                                .type("object")
                                .properties(Map.of(
                                        "firstParam", Schema.builder().type("number").description("First number").build(),
                                        "secondParam", Schema.builder().type("number").description("Second number").build()))
                                .required(Arrays.asList("firstParam", "secondParam"))
                                .build())
                .build();

FunctionDeclaration divideFunction =
        FunctionDeclaration.builder()
                .name("divideNumbers")
                .parameters(
                        Schema.builder()
                                .type("object")
                                .properties(Map.of(
                                        "firstParam", Schema.builder().type("number").description("First number").build(),
                                        "secondParam", Schema.builder().type("number").description("Second number").build()))
                                .required(Arrays.asList("firstParam", "secondParam"))
                                .build())
                .build();

GenerateContentConfig config = GenerateContentConfig.builder()
        .toolConfig(ToolConfig.builder().functionCallingConfig(
                FunctionCallingConfig.builder().mode("ANY").build()
        ).build())
        .tools(
                Collections.singletonList(
                        Tool.builder().functionDeclarations(
                                Arrays.asList(
                                        addFunction,
                                        subtractFunction,
                                        divideFunction,
                                        multiplyFunction
                                )
                        ).build()

                )
        )
        .build();

GenerateContentResponse response =
        client.models.generateContent(
                "gemini-2.0-flash",
                "I have 57 cats, each owns 44 mittens, how many mittens is that in total?",
                config);


if (response.functionCalls() == null || response.functionCalls().isEmpty()) {
    System.err.println("No function call received");
    return null;
}

var functionCall = response.functionCalls().getFirst();
String functionName = functionCall.name().get();
var arguments = functionCall.args();

Map<String, BiFunction<Double, Double, Double>> functionMapping = new HashMap<>();
functionMapping.put("addNumbers", (a, b) -> a + b);
functionMapping.put("subtractNumbers", (a, b) -> a - b);
functionMapping.put("multiplyNumbers", (a, b) -> a * b);
functionMapping.put("divideNumbers", (a, b) -> b != 0 ? a / b : Double.NaN);

BiFunction<Double, Double, Double> function = functionMapping.get(functionName);
if (function == null) {
    // The model asked for a function that has no local implementation.
    System.err.println("Unimplemented function: " + functionName);
    return null;
}

Number firstParam = (Number) arguments.get().get("firstParam");
Number secondParam = (Number) arguments.get().get("secondParam");
Double result = function.apply(firstParam.doubleValue(), secondParam.doubleValue());

System.out.println(result);

Generation config

Python

from google import genai
from google.genai import types

client = genai.Client()
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Tell me a story about a magic backpack.",
    config=types.GenerateContentConfig(
        candidate_count=1,
        stop_sequences=["x"],
        max_output_tokens=20,
        temperature=1.0,
    ),
)
print(response.text)

Node.js

// Make sure to include the following import:
// import {GoogleGenAI} from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: "gemini-2.0-flash",
  contents: "Tell me a story about a magic backpack.",
  config: {
    candidateCount: 1,
    stopSequences: ["x"],
    maxOutputTokens: 20,
    temperature: 1.0,
  },
});

console.log(response.text);

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
	APIKey:  os.Getenv("GEMINI_API_KEY"),
	Backend: genai.BackendGeminiAPI,
})
if err != nil {
	log.Fatal(err)
}

// Create local variables for parameters.
candidateCount := int32(1)
maxOutputTokens := int32(20)
temperature := float32(1.0)

response, err := client.Models.GenerateContent(
	ctx,
	"gemini-2.0-flash",
	genai.Text("Tell me a story about a magic backpack."),
	&genai.GenerateContentConfig{
		CandidateCount:  candidateCount,
		StopSequences:   []string{"x"},
		MaxOutputTokens: maxOutputTokens,
		Temperature:     &temperature,
	},
)
if err != nil {
	log.Fatal(err)
}

printResponse(response)

Shell

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GEMINI_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
        "contents": [{
            "parts":[
                {"text": "Explain how AI works"}
            ]
        }],
        "generationConfig": {
            "stopSequences": [
                "Title"
            ],
            "temperature": 1.0,
            "maxOutputTokens": 800,
            "topP": 0.8,
            "topK": 10
        }
    }' 2> /dev/null | grep "text"

Java

Client client = new Client();

GenerateContentConfig config =
        GenerateContentConfig.builder()
                .candidateCount(1)
                .stopSequences(List.of("x"))
                .maxOutputTokens(20)
                .temperature(1.0F)
                .build();

GenerateContentResponse response =
        client.models.generateContent(
                "gemini-2.0-flash",
                "Tell me a story about a magic backpack.",
                config);

System.out.println(response.text());

Safety settings

Python

from google import genai
from google.genai import types

client = genai.Client()
unsafe_prompt = (
    "I support Martians Soccer Club and I think Jupiterians Football Club sucks! "
    "Write an ironic phrase about them including expletives."
)
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=unsafe_prompt,
    config=types.GenerateContentConfig(
        safety_settings=[
            types.SafetySetting(
                category="HARM_CATEGORY_HATE_SPEECH",
                threshold="BLOCK_MEDIUM_AND_ABOVE",
            ),
            types.SafetySetting(
                category="HARM_CATEGORY_HARASSMENT", threshold="BLOCK_ONLY_HIGH"
            ),
        ]
    ),
)
try:
    print(response.text)
except Exception:
    print("No information generated by the model.")

print(response.candidates[0].safety_ratings)

Node.js

  // Make sure to include the following import:
  // import {GoogleGenAI} from '@google/genai';
  const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
  const unsafePrompt =
    "I support Martians Soccer Club and I think Jupiterians Football Club sucks! Write an ironic phrase about them including expletives.";

  const response = await ai.models.generateContent({
    model: "gemini-2.0-flash",
    contents: unsafePrompt,
    config: {
      safetySettings: [
        {
          category: "HARM_CATEGORY_HATE_SPEECH",
          threshold: "BLOCK_MEDIUM_AND_ABOVE",
        },
        {
          category: "HARM_CATEGORY_HARASSMENT",
          threshold: "BLOCK_ONLY_HIGH",
        },
      ],
    },
  });

  try {
    console.log("Generated text:", response.text);
  } catch (error) {
    console.log("No information generated by the model.");
  }
  console.log("Safety ratings:", response.candidates[0].safetyRatings);
  return response;

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
	APIKey:  os.Getenv("GEMINI_API_KEY"),
	Backend: genai.BackendGeminiAPI,
})
if err != nil {
	log.Fatal(err)
}

unsafePrompt := "I support Martians Soccer Club and I think Jupiterians Football Club sucks! " +
	"Write an ironic phrase about them including expletives."

config := &genai.GenerateContentConfig{
	SafetySettings: []*genai.SafetySetting{
		{
			Category:  "HARM_CATEGORY_HATE_SPEECH",
			Threshold: "BLOCK_MEDIUM_AND_ABOVE",
		},
		{
			Category:  "HARM_CATEGORY_HARASSMENT",
			Threshold: "BLOCK_ONLY_HIGH",
		},
	},
}
contents := []*genai.Content{
	genai.NewContentFromText(unsafePrompt, genai.RoleUser),
}
response, err := client.Models.GenerateContent(ctx, "gemini-2.0-flash", contents, config)
if err != nil {
	log.Fatal(err)
}

// Print the generated text.
text := response.Text()
fmt.Println("Generated text:", text)

// Print the finish reason and safety ratings from the first candidate.
if len(response.Candidates) > 0 {
	fmt.Println("Finish reason:", response.Candidates[0].FinishReason)
	safetyRatings, err := json.MarshalIndent(response.Candidates[0].SafetyRatings, "", "  ")
	if err != nil {
		return err
	}
	fmt.Println("Safety ratings:", string(safetyRatings))
} else {
	fmt.Println("No candidate returned.")
}

Shell

echo '{
    "safetySettings": [
        {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_ONLY_HIGH"},
        {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_MEDIUM_AND_ABOVE"}
    ],
    "contents": [{
        "parts":[{
            "text": "I support Martians Soccer Club and I think Jupiterians Football Club sucks! Write an ironic phrase about them."}]}]}' > request.json

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GEMINI_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d @request.json 2> /dev/null

Java

Client client = new Client();

String unsafePrompt = """
         I support Martians Soccer Club and I think Jupiterians Football Club sucks!
         Write an ironic phrase about them including expletives.
        """;

GenerateContentConfig config =
        GenerateContentConfig.builder()
                .safetySettings(Arrays.asList(
                        SafetySetting.builder()
                                .category("HARM_CATEGORY_HATE_SPEECH")
                                .threshold("BLOCK_MEDIUM_AND_ABOVE")
                                .build(),
                        SafetySetting.builder()
                                .category("HARM_CATEGORY_HARASSMENT")
                                .threshold("BLOCK_ONLY_HIGH")
                                .build()
                )).build();

GenerateContentResponse response =
        client.models.generateContent(
                "gemini-2.0-flash",
                unsafePrompt,
                config);

try {
    System.out.println(response.text());
} catch (Exception e) {
    System.out.println("No information generated by the model");
}

System.out.println(response.candidates().get().getFirst().safetyRatings());

System instruction

Python

from google import genai
from google.genai import types

client = genai.Client()
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Good morning! How are you?",
    config=types.GenerateContentConfig(
        system_instruction="You are a cat. Your name is Neko."
    ),
)
print(response.text)

Node.js

// Make sure to include the following import:
// import {GoogleGenAI} from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const response = await ai.models.generateContent({
  model: "gemini-2.0-flash",
  contents: "Good morning! How are you?",
  config: {
    systemInstruction: "You are a cat. Your name is Neko.",
  },
});
console.log(response.text);

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
	APIKey:  os.Getenv("GEMINI_API_KEY"),
	Backend: genai.BackendGeminiAPI,
})
if err != nil {
	log.Fatal(err)
}

// Construct the user message contents.
contents := []*genai.Content{
	genai.NewContentFromText("Good morning! How are you?", genai.RoleUser),
}

// Set the system instruction as a *genai.Content.
config := &genai.GenerateContentConfig{
	SystemInstruction: genai.NewContentFromText("You are a cat. Your name is Neko.", genai.RoleUser),
}

response, err := client.Models.GenerateContent(ctx, "gemini-2.0-flash", contents, config)
if err != nil {
	log.Fatal(err)
}
printResponse(response)

Shell

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GEMINI_API_KEY" \
-H 'Content-Type: application/json' \
-d '{ "system_instruction": {
    "parts":
      { "text": "You are a cat. Your name is Neko."}},
    "contents": {
      "parts": {
        "text": "Hello there"}}}'

Java

Client client = new Client();

Part textPart = Part.builder().text("You are a cat. Your name is Neko.").build();

Content content = Content.builder().role("system").parts(ImmutableList.of(textPart)).build();

GenerateContentConfig config = GenerateContentConfig.builder()
        .systemInstruction(content)
        .build();

GenerateContentResponse response =
        client.models.generateContent(
                "gemini-2.0-flash",
                "Good morning! How are you?",
                config);

System.out.println(response.text());

Response body

If successful, the response body contains an instance of GenerateContentResponse.
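As a minimal sketch of reading that response body when calling the REST endpoint directly, the helper below joins the text parts of the first candidate. The function name `first_candidate_text` and the sample payload are hypothetical and only illustrate the `candidates` / `content` / `parts` nesting used by the samples above; real responses carry more fields.

```python
from typing import Any


def first_candidate_text(response: dict[str, Any]) -> str:
    """Concatenate the text parts of the first candidate, or return ""."""
    candidates = response.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    # Parts without a "text" key (for example, function calls) are skipped.
    return "".join(part.get("text", "") for part in parts)


# Hypothetical response body, shaped like a GenerateContentResponse.
sample_response = {
    "candidates": [
        {
            "content": {
                "role": "model",
                "parts": [{"text": "Meow. "}, {"text": "I am Neko."}],
            },
            "finishReason": "STOP",
        }
    ]
}

print(first_candidate_text(sample_response))
```

Returning an empty string when no candidate is present keeps the caller from indexing into a missing list, which can happen when a prompt is blocked.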

Method: models.streamGenerateContent

Generates a streamed response from the model given an input GenerateContentRequest.
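When the endpoint is called over REST with `alt=sse`, the stream can be consumed as server-sent events. The sketch below assumes each event arrives as a single `data: <json>` line holding one partial response; the helper name `iter_stream_text` and the sample lines are hypothetical, and no network call is made.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines of a streamed response."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # Skip blank separators and non-data fields.
        payload = line[len("data:"):].strip()
        if not payload:
            continue
        chunk = json.loads(payload)
        # Only the first candidate of each chunk is read in this sketch.
        for candidate in chunk.get("candidates", [])[:1]:
            for part in candidate.get("content", {}).get("parts", []):
                if "text" in part:
                    yield part["text"]


# Hypothetical stream contents for illustration.
sample_lines = [
    'data: {"candidates": [{"content": {"parts": [{"text": "Once upon "}]}}]}',
    "",
    'data: {"candidates": [{"content": {"parts": [{"text": "a time."}]}}]}',
    "",
]

print("".join(iter_stream_text(sample_lines)))
```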

Endpoint

post https://generativelanguage.googleapis.com/v1beta/{model=models/*}:streamGenerateContent

Path parameters

model string

Required. The name of the Model to use for generating the completion.

Format: models/{model}.
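A small sketch of how the path parameter slots into the endpoint: the helper accepts either a bare model ID or the full `models/{model}` form. The name `stream_endpoint` and the `use_sse` flag are assumptions for illustration; the `alt=sse` query parameter is the one used in the shell samples on this page.

```python
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def stream_endpoint(model: str, use_sse: bool = True) -> str:
    """Build the streamGenerateContent URL for a model name."""
    # Accept both "gemini-2.0-flash" and "models/gemini-2.0-flash".
    name = model if model.startswith("models/") else f"models/{model}"
    url = f"{BASE_URL}/{name}:streamGenerateContent"
    return f"{url}?alt=sse" if use_sse else url


print(stream_endpoint("gemini-2.0-flash"))
```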

Request body

The request body contains data with the following structure:

Fields

contents[] object (Content)

Required. The content of the current conversation with the model.

tools[] object (Tool)

Optional. A list of Tools the Model may use to generate the next response.

toolConfig object (ToolConfig)

Optional. Tool configuration for any Tool specified in the request. Refer to the Function calling guide for a usage example.

safetySettings[] object (SafetySetting)

Optional. A list of unique SafetySetting instances for blocking unsafe content.

systemInstruction object (Content)

Optional. System instructions set by the developer. Currently, text only.

generationConfig object (GenerationConfig)

Optional. Configuration options for model generation and outputs.

cachedContent string

Optional. The name of the cached content to use as context to serve the prediction. Format: cachedContents/{cachedContent}
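The fields above can be assembled into a JSON body along these lines. This is a sketch under assumptions: `build_request_body` is a hypothetical helper, only a single user text turn is supported, and `tools` / `toolConfig` are left out for brevity. Optional fields are omitted entirely when unset rather than sent as null.

```python
import json
from typing import Any, Optional


def build_request_body(
    prompt: str,
    system_instruction: Optional[str] = None,
    generation_config: Optional[dict[str, Any]] = None,
    safety_settings: Optional[list[dict[str, str]]] = None,
    cached_content: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble a request body, leaving out optional fields that are unset."""
    body: dict[str, Any] = {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}]
    }
    if system_instruction is not None:
        body["systemInstruction"] = {"parts": [{"text": system_instruction}]}
    if generation_config is not None:
        body["generationConfig"] = generation_config
    if safety_settings is not None:
        body["safetySettings"] = safety_settings
    if cached_content is not None:
        body["cachedContent"] = cached_content
    return body


body = build_request_body(
    "Write a story about a magic backpack.",
    system_instruction="You are a cat. Your name is Neko.",
    generation_config={"temperature": 1.0, "maxOutputTokens": 20},
    safety_settings=[
        {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_ONLY_HIGH"}
    ],
)
print(json.dumps(body, indent=2))
```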

Example request

Text

Python

from google import genai

client = genai.Client()
response = client.models.generate_content_stream(
    model="gemini-2.0-flash", contents="Write a story about a magic backpack."
)
for chunk in response:
    print(chunk.text)
    print("_" * 80)

Node.js

// Make sure to include the following import:
// import {GoogleGenAI} from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContentStream({
  model: "gemini-2.0-flash",
  contents: "Write a story about a magic backpack.",
});
let text = "";
for await (const chunk of response) {
  console.log(chunk.text);
  text += chunk.text;
}

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
	APIKey:  os.Getenv("GEMINI_API_KEY"),
	Backend: genai.BackendGeminiAPI,
})
if err != nil {
	log.Fatal(err)
}
contents := []*genai.Content{
	genai.NewContentFromText("Write a story about a magic backpack.", genai.RoleUser),
}
for response, err := range client.Models.GenerateContentStream(
	ctx,
	"gemini-2.0-flash",
	contents,
	nil,
) {
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(response.Candidates[0].Content.Parts[0].Text)
}

Shell

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:streamGenerateContent?alt=sse&key=${GEMINI_API_KEY}" \
        -H 'Content-Type: application/json' \
        --no-buffer \
        -d '{ "contents":[{"parts":[{"text": "Write a story about a magic backpack."}]}]}'

Java

Client client = new Client();

ResponseStream<GenerateContentResponse> responseStream =
        client.models.generateContentStream(
                "gemini-2.0-flash",
                "Write a story about a magic backpack.",
                null);

StringBuilder response = new StringBuilder();
for (GenerateContentResponse res : responseStream) {
    System.out.print(res.text());
    response.append(res.text());
}

responseStream.close();

Image

Python

from google import genai
import PIL.Image

client = genai.Client()
organ = PIL.Image.open(media / "organ.jpg")
response = client.models.generate_content_stream(
    model="gemini-2.0-flash", contents=["Tell me about this instrument", organ]
)
for chunk in response:
    print(chunk.text)
    print("_" * 80)

Node.js

// Make sure to include the following imports:
// import {GoogleGenAI, createUserContent, createPartFromUri} from '@google/genai';
// import path from 'path';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const organ = await ai.files.upload({
  file: path.join(media, "organ.jpg"),
});

const response = await ai.models.generateContentStream({
  model: "gemini-2.0-flash",
  contents: [
    createUserContent([
      "Tell me about this instrument", 
      createPartFromUri(organ.uri, organ.mimeType)
    ]),
  ],
});
let text = "";
for await (const chunk of response) {
  console.log(chunk.text);
  text += chunk.text;
}

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
	APIKey:  os.Getenv("GEMINI_API_KEY"),
	Backend: genai.BackendGeminiAPI,
})
if err != nil {
	log.Fatal(err)
}
file, err := client.Files.UploadFromPath(
	ctx,
	filepath.Join(getMedia(), "organ.jpg"),
	&genai.UploadFileConfig{
		MIMEType: "image/jpeg",
	},
)
if err != nil {
	log.Fatal(err)
}
parts := []*genai.Part{
	genai.NewPartFromText("Tell me about this instrument"),
	genai.NewPartFromURI(file.URI, file.MIMEType),
}
contents := []*genai.Content{
	genai.NewContentFromParts(parts, genai.RoleUser),
}
for response, err := range client.Models.GenerateContentStream(
	ctx,
	"gemini-2.0-flash",
	contents,
	nil,
) {
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(response.Candidates[0].Content.Parts[0].Text)
}

Shell

cat > "$TEMP_JSON" << EOF
{
  "contents": [{
    "parts":[
      {"text": "Tell me about this instrument"},
      {
        "inline_data": {
          "mime_type":"image/jpeg",
          "data": "$(cat "$TEMP_B64")"
        }
      }
    ]
  }]
}
EOF

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:streamGenerateContent?alt=sse&key=$GEMINI_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d "@$TEMP_JSON" 2> /dev/null

Java

Client client = new Client();

String path = media_path + "organ.jpg";
byte[] imageData = Files.readAllBytes(Paths.get(path));

Content content =
        Content.fromParts(
                Part.fromText("Tell me about this instrument."),
                Part.fromBytes(imageData, "image/jpeg"));


ResponseStream<GenerateContentResponse> responseStream =
        client.models.generateContentStream(
                "gemini-2.0-flash",
                content,
                null);

StringBuilder response = new StringBuilder();
for (GenerateContentResponse res : responseStream) {
    System.out.print(res.text());
    response.append(res.text());
}

responseStream.close();

Audio

Python

from google import genai

client = genai.Client()
sample_audio = client.files.upload(file=media / "sample.mp3")
response = client.models.generate_content_stream(
    model="gemini-2.0-flash",
    contents=["Give me a summary of this audio file.", sample_audio],
)
for chunk in response:
    print(chunk.text)
    print("_" * 80)

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
	APIKey:  os.Getenv("GEMINI_API_KEY"),
	Backend: genai.BackendGeminiAPI,
})
if err != nil {
	log.Fatal(err)
}

file, err := client.Files.UploadFromPath(
	ctx, 
	filepath.Join(getMedia(), "sample.mp3"), 
	&genai.UploadFileConfig{
		MIMEType : "audio/mpeg",
	},
)
if err != nil {
	log.Fatal(err)
}

parts := []*genai.Part{
	genai.NewPartFromText("Give me a summary of this audio file."),
	genai.NewPartFromURI(file.URI, file.MIMEType),
}

contents := []*genai.Content{
	genai.NewContentFromParts(parts, genai.RoleUser),
}

for result, err := range client.Models.GenerateContentStream(
	ctx,
	"gemini-2.0-flash",
	contents,
	nil,
) {
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(result.Candidates[0].Content.Parts[0].Text)
}

Shell

# Use File API to upload audio data to API request.
MIME_TYPE=$(file -b --mime-type "${AUDIO_PATH}")
NUM_BYTES=$(wc -c < "${AUDIO_PATH}")
DISPLAY_NAME=AUDIO

tmp_header_file=upload-header.tmp

# Initial resumable request defining metadata.
# The upload url is in the response headers dump them to a file.
curl "${BASE_URL}/upload/v1beta/files?key=${GEMINI_API_KEY}" \
  -D upload-header.tmp \
  -H "X-Goog-Upload-Protocol: resumable" \
  -H "X-Goog-Upload-Command: start" \
  -H "X-Goog-Upload-Header-Content-Length: ${NUM_BYTES}" \
  -H "X-Goog-Upload-Header-Content-Type: ${MIME_TYPE}" \
  -H "Content-Type: application/json" \
  -d "{'file': {'display_name': '${DISPLAY_NAME}'}}" 2> /dev/null

upload_url=$(grep -i "x-goog-upload-url: " "${tmp_header_file}" | cut -d" " -f2 | tr -d "\r")
rm "${tmp_header_file}"

# Upload the actual bytes.
curl "${upload_url}" \
  -H "Content-Length: ${NUM_BYTES}" \
  -H "X-Goog-Upload-Offset: 0" \
  -H "X-Goog-Upload-Command: upload, finalize" \
  --data-binary "@${AUDIO_PATH}" 2> /dev/null > file_info.json

file_uri=$(jq ".file.uri" file_info.json)
echo file_uri=$file_uri

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:streamGenerateContent?alt=sse&key=$GEMINI_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [{
        "parts":[
          {"text": "Please describe this file."},
          {"file_data":{"mime_type": "audio/mpeg", "file_uri": '$file_uri'}}]
        }]
       }' 2> /dev/null > response.json

cat response.json
echo

Video

Python

from google import genai
import time

client = genai.Client()
# Video clip (CC BY 3.0) from https://peach.blender.org/download/
myfile = client.files.upload(file=media / "Big_Buck_Bunny.mp4")
print(f"{myfile=}")

# Poll until the video file is completely processed (state becomes ACTIVE).
while not myfile.state or myfile.state.name != "ACTIVE":
    print("Processing video...")
    print("File state:", myfile.state)
    time.sleep(5)
    myfile = client.files.get(name=myfile.name)

response = client.models.generate_content_stream(
    model="gemini-2.0-flash", contents=[myfile, "Describe this video clip"]
)
for chunk in response:
    print(chunk.text)
    print("_" * 80)

Node.js

// Make sure to include the following import:
// import {GoogleGenAI} from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

let video = await ai.files.upload({
  file: path.join(media, 'Big_Buck_Bunny.mp4'),
});

// Poll until the video file is completely processed (state becomes ACTIVE).
while (!video.state || video.state.toString() !== 'ACTIVE') {
  console.log('Processing video...');
  console.log('File state: ', video.state);
  await sleep(5000);
  video = await ai.files.get({name: video.name});
}

const response = await ai.models.generateContentStream({
  model: "gemini-2.0-flash",
  contents: [
    createUserContent([
      "Describe this video clip",
      createPartFromUri(video.uri, video.mimeType),
    ]),
  ],
});
let text = "";
for await (const chunk of response) {
  console.log(chunk.text);
  text += chunk.text;
}

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
	APIKey:  os.Getenv("GEMINI_API_KEY"),
	Backend: genai.BackendGeminiAPI,
})
if err != nil {
	log.Fatal(err)
}

file, err := client.Files.UploadFromPath(
	ctx, 
	filepath.Join(getMedia(), "Big_Buck_Bunny.mp4"), 
	&genai.UploadFileConfig{
		MIMEType : "video/mp4",
	},
)
if err != nil {
	log.Fatal(err)
}

// Poll until the video file is completely processed (state becomes ACTIVE).
for file.State == genai.FileStateUnspecified || file.State != genai.FileStateActive {
	fmt.Println("Processing video...")
	fmt.Println("File state:", file.State)
	time.Sleep(5 * time.Second)

	file, err = client.Files.Get(ctx, file.Name, nil)
	if err != nil {
		log.Fatal(err)
	}
}

parts := []*genai.Part{
	genai.NewPartFromText("Describe this video clip"),
	genai.NewPartFromURI(file.URI, file.MIMEType),
}

contents := []*genai.Content{
	genai.NewContentFromParts(parts, genai.RoleUser),
}

for result, err := range client.Models.GenerateContentStream(
	ctx,
	"gemini-2.0-flash",
	contents,
	nil,
) {
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(result.Candidates[0].Content.Parts[0].Text)
}

Shell

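# Use the File API to upload video data for the API request.
MIME_TYPE=$(file -b --mime-type "${VIDEO_PATH}")
NUM_BYTES=$(wc -c < "${VIDEO_PATH}")
DISPLAY_NAME=VIDEO_PATH

# File that receives the response headers of the initial request.
tmp_header_file=upload-header.tmp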

# Initial resumable request defining metadata.
# The upload url is in the response headers dump them to a file.
curl "${BASE_URL}/upload/v1beta/files?key=${GEMINI_API_KEY}" \
  -D upload-header.tmp \
  -H "X-Goog-Upload-Protocol: resumable" \
  -H "X-Goog-Upload-Command: start" \
  -H "X-Goog-Upload-Header-Content-Length: ${NUM_BYTES}" \
  -H "X-Goog-Upload-Header-Content-Type: ${MIME_TYPE}" \
  -H "Content-Type: application/json" \
  -d "{'file': {'display_name': '${DISPLAY_NAME}'}}" 2> /dev/null

upload_url=$(grep -i "x-goog-upload-url: " "${tmp_header_file}" | cut -d" " -f2 | tr -d "\r")
rm "${tmp_header_file}"

# Upload the actual bytes.
curl "${upload_url}" \
  -H "Content-Length: ${NUM_BYTES}" \
  -H "X-Goog-Upload-Offset: 0" \
  -H "X-Goog-Upload-Command: upload, finalize" \
  --data-binary "@${VIDEO_PATH}" 2> /dev/null > file_info.json

file_uri=$(jq ".file.uri" file_info.json)
echo file_uri=$file_uri

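state=$(jq ".file.state" file_info.json)
echo state=$state

# The resource name (files/...) is needed to poll the file's state.
name=$(jq -r ".file.name" file_info.json)

while [[ "($state)" = *"PROCESSING"* ]];
do
  echo "Processing video..."
  sleep 5
  # Get the file of interest to check state.
  # files.get returns the File resource directly, so state is a top-level field.
  curl "https://generativelanguage.googleapis.com/v1beta/${name}?key=$GEMINI_API_KEY" 2> /dev/null > file_info.json
  state=$(jq ".state" file_info.json)
done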

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:streamGenerateContent?alt=sse&key=$GEMINI_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [{
        "parts":[
          {"text": "Please describe this file."},
          {"file_data":{"mime_type": "video/mp4", "file_uri": '$file_uri'}}]
        }]
       }' 2> /dev/null > response.json

cat response.json
echo

PDF

Python

from google import genai

client = genai.Client()
sample_pdf = client.files.upload(file=media / "test.pdf")
response = client.models.generate_content_stream(
    model="gemini-2.0-flash",
    contents=["Give me a summary of this document:", sample_pdf],
)

for chunk in response:
    print(chunk.text)
    print("_" * 80)

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
	APIKey:  os.Getenv("GEMINI_API_KEY"),
	Backend: genai.BackendGeminiAPI,
})
if err != nil {
	log.Fatal(err)
}

file, err := client.Files.UploadFromPath(
	ctx, 
	filepath.Join(getMedia(), "test.pdf"), 
	&genai.UploadFileConfig{
		MIMEType : "application/pdf",
	},
)
if err != nil {
	log.Fatal(err)
}

parts := []*genai.Part{
	genai.NewPartFromText("Give me a summary of this document:"),
	genai.NewPartFromURI(file.URI, file.MIMEType),
}

contents := []*genai.Content{
	genai.NewContentFromParts(parts, genai.RoleUser),
}

for result, err := range client.Models.GenerateContentStream(
	ctx,
	"gemini-2.0-flash",
	contents,
	nil,
) {
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(result.Candidates[0].Content.Parts[0].Text)
}

Shell

MIME_TYPE=$(file -b --mime-type "${PDF_PATH}")
NUM_BYTES=$(wc -c < "${PDF_PATH}")
DISPLAY_NAME=TEXT


echo $MIME_TYPE
tmp_header_file=upload-header.tmp

# Initial resumable request defining metadata.
# The upload url is in the response headers dump them to a file.
curl "${BASE_URL}/upload/v1beta/files?key=${GEMINI_API_KEY}" \
  -D upload-header.tmp \
  -H "X-Goog-Upload-Protocol: resumable" \
  -H "X-Goog-Upload-Command: start" \
  -H "X-Goog-Upload-Header-Content-Length: ${NUM_BYTES}" \
  -H "X-Goog-Upload-Header-Content-Type: ${MIME_TYPE}" \
  -H "Content-Type: application/json" \
  -d "{'file': {'display_name': '${DISPLAY_NAME}'}}" 2> /dev/null

upload_url=$(grep -i "x-goog-upload-url: " "${tmp_header_file}" | cut -d" " -f2 | tr -d "\r")
rm "${tmp_header_file}"

# Upload the actual bytes.
curl "${upload_url}" \
  -H "Content-Length: ${NUM_BYTES}" \
  -H "X-Goog-Upload-Offset: 0" \
  -H "X-Goog-Upload-Command: upload, finalize" \
  --data-binary "@${PDF_PATH}" 2> /dev/null > file_info.json

file_uri=$(jq ".file.uri" file_info.json)
echo file_uri=$file_uri

# Now generate content using that file
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:streamGenerateContent?alt=sse&key=$GEMINI_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [{
        "parts":[
          {"text": "Can you add a few more lines to this poem?"},
          {"file_data":{"mime_type": "application/pdf", "file_uri": '$file_uri'}}]
        }]
       }' 2> /dev/null > response.json

cat response.json
echo

Chat

Python

from google import genai
from google.genai import types

client = genai.Client()
chat = client.chats.create(
    model="gemini-2.0-flash",
    history=[
        types.Content(role="user", parts=[types.Part(text="Hello")]),
        types.Content(
            role="model",
            parts=[
                types.Part(
                    text="Great to meet you. What would you like to know?"
                )
            ],
        ),
    ],
)
response = chat.send_message_stream(message="I have 2 dogs in my house.")
for chunk in response:
    print(chunk.text)
    print("_" * 80)
response = chat.send_message_stream(message="How many paws are in my house?")
for chunk in response:
    print(chunk.text)
    print("_" * 80)

print(chat.get_history())

Node.js

// Make sure to include the following import:
// import {GoogleGenAI} from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const chat = ai.chats.create({
  model: "gemini-2.0-flash",
  history: [
    {
      role: "user",
      parts: [{ text: "Hello" }],
    },
    {
      role: "model",
      parts: [{ text: "Great to meet you. What would you like to know?" }],
    },
  ],
});

console.log("Streaming response for first message:");
const stream1 = await chat.sendMessageStream({
  message: "I have 2 dogs in my house.",
});
for await (const chunk of stream1) {
  console.log(chunk.text);
  console.log("_".repeat(80));
}

console.log("Streaming response for second message:");
const stream2 = await chat.sendMessageStream({
  message: "How many paws are in my house?",
});
for await (const chunk of stream2) {
  console.log(chunk.text);
  console.log("_".repeat(80));
}

console.log(chat.getHistory());

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
	APIKey:  os.Getenv("GEMINI_API_KEY"),
	Backend: genai.BackendGeminiAPI,
})
if err != nil {
	log.Fatal(err)
}

history := []*genai.Content{
	genai.NewContentFromText("Hello", genai.RoleUser),
	genai.NewContentFromText("Great to meet you. What would you like to know?", genai.RoleModel),
}
chat, err := client.Chats.Create(ctx, "gemini-2.0-flash", nil, history)
if err != nil {
	log.Fatal(err)
}

for chunk, err := range chat.SendMessageStream(ctx, genai.Part{Text: "I have 2 dogs in my house."}) {
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(chunk.Text())
	fmt.Println(strings.Repeat("_", 64))
}

for chunk, err := range chat.SendMessageStream(ctx, genai.Part{Text: "How many paws are in my house?"}) {
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(chunk.Text())
	fmt.Println(strings.Repeat("_", 64))
}

fmt.Println(chat.History(false))

Shell

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:streamGenerateContent?alt=sse&key=$GEMINI_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [
        {"role":"user",
         "parts":[{
           "text": "Hello"}]},
        {"role": "model",
         "parts":[{
           "text": "Great to meet you. What would you like to know?"}]},
        {"role":"user",
         "parts":[{
           "text": "I have two dogs in my house. How many paws are in my house?"}]}
      ]
    }' 2> /dev/null | grep "text"

Response body

If successful, the response body contains a stream of GenerateContentResponse instances.

Response from the model supporting multiple candidate responses.

Safety ratings and content filtering are reported for both the prompt in GenerateContentResponse.prompt_feedback and for each candidate in finishReason and in safetyRatings. The API:

- Returns either all requested candidates or none of them.
- Returns no candidates at all only if there was something wrong with the prompt (check promptFeedback).
- Reports feedback on each candidate in finishReason and safetyRatings.

Fields

candidates[] object (Candidate)

Candidate responses from the model.

promptFeedback object (PromptFeedback)

Returns the prompt's feedback related to the content filters.

usageMetadata object (UsageMetadata)

Output only. Metadata on the generation request's token usage.

modelVersion string

Output only. The model version used to generate the response.

responseId string

Output only. responseId is used to identify each response.

JSON representation
{ "candidates": [ { object (`Candidate`) } ], "promptFeedback": { object (`PromptFeedback`) }, "usageMetadata": { object (`UsageMetadata`) }, "modelVersion": string, "responseId": string }
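
As a rough illustration of consuming this stream, the sketch below assumes the `alt=sse` form used in the shell samples, where each event arrives as one `data:` line holding a JSON-encoded GenerateContentResponse. The helper name `iter_stream_text` and the sample lines are hypothetical; only the field names are taken from this reference.

```python
import json


def iter_stream_text(sse_lines):
    """Yield the text carried by each streamed GenerateContentResponse."""
    for raw_line in sse_lines:
        line = raw_line.strip()
        # Assumption: each event is a single "data: <json>" line; other lines are skipped.
        if not line.startswith("data:"):
            continue
        response = json.loads(line[len("data:"):])
        candidates = response.get("candidates") or []
        if not candidates:
            continue  # for example, a blocked prompt returns no candidates
        parts = candidates[0].get("content", {}).get("parts", [])
        yield "".join(part.get("text", "") for part in parts)


# Hypothetical stream, shaped like the JSON representation above.
sample_stream = [
    'data: {"candidates": [{"content": {"parts": [{"text": "Hello"}]}, "index": 0}]}',
    "",
    'data: {"candidates": [{"content": {"parts": [{"text": ", world"}]}, "finishReason": "STOP", "index": 0}]}',
]

print("".join(iter_stream_text(sample_stream)))  # prints "Hello, world"
```

Only the first candidate of each chunk is read here, which is a simplification for single-candidate requests.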

PromptFeedback

A set of the feedback metadata for the prompt specified in GenerateContentRequest.content.

Fields

blockReason enum (BlockReason)

Optional. If set, the prompt was blocked and no candidates are returned. Rephrase the prompt.

safetyRatings[] object (SafetyRating)

Ratings for safety of the prompt. There is at most one rating per category.

JSON representation
{ "blockReason": enum (`BlockReason`), "safetyRatings": [ { object (`SafetyRating`) } ] }

BlockReason

Specifies the reason why the prompt was blocked.

Enums
`BLOCK_REASON_UNSPECIFIED`	Default value. This value is unused.
`SAFETY`	The prompt was blocked for safety reasons. Inspect `safetyRatings` to understand which safety category blocked it.
`OTHER`	The prompt was blocked for unknown reasons.
`BLOCKLIST`	The prompt was blocked because it contains terms from the terminology blocklist.
`PROHIBITED_CONTENT`	The prompt was blocked due to prohibited content.
`IMAGE_SAFETY`	Candidates were blocked due to unsafe image generation content.
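
A minimal sketch of how a caller might check for a blocked prompt before reading candidates. It works on a response already decoded into a dict; the function name `prompt_block_reason` is hypothetical, and treating `BLOCK_REASON_UNSPECIFIED` the same as an absent field is an assumption, not documented behavior.

```python
def prompt_block_reason(response):
    """Return the blockReason string if the prompt was blocked, else None."""
    feedback = response.get("promptFeedback") or {}
    reason = feedback.get("blockReason")
    # Assumption: the unspecified default is treated like "not set".
    if reason in (None, "BLOCK_REASON_UNSPECIFIED"):
        return None
    return reason


# Hypothetical decoded responses.
blocked = {"promptFeedback": {"blockReason": "SAFETY", "safetyRatings": []}}
allowed = {"candidates": [{"index": 0}]}

print(prompt_block_reason(blocked))  # prints "SAFETY"
print(prompt_block_reason(allowed))  # prints "None"
```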

UsageMetadata

Metadata on the generation request's token usage.

Fields

promptTokenCount integer

Number of tokens in the prompt. When cachedContent is set, this is still the total effective prompt size, meaning it includes the number of tokens in the cached content.

cachedContentTokenCount integer

Number of tokens in the cached part of the prompt (the cached content).

candidatesTokenCount integer

Total number of tokens across all the generated response candidates.

toolUsePromptTokenCount integer

Output only. Number of tokens present in tool-use prompts.

thoughtsTokenCount integer

Output only. Number of tokens of thoughts for thinking models.

totalTokenCount integer

Total token count for the generation request (prompt + response candidates).

promptTokensDetails[] object (ModalityTokenCount)

Output only. List of modalities that were processed in the request input.

cacheTokensDetails[] object (ModalityTokenCount)

Output only. List of modalities of the cached content in the request input.

candidatesTokensDetails[] object (ModalityTokenCount)

Output only. List of modalities that were returned in the response.

toolUsePromptTokensDetails[] object (ModalityTokenCount)

Output only. List of modalities that were processed for tool-use request inputs.

JSON representation

{
  "promptTokenCount": integer,
  "cachedContentTokenCount": integer,
  "candidatesTokenCount": integer,
  "toolUsePromptTokenCount": integer,
  "thoughtsTokenCount": integer,
  "totalTokenCount": integer,
  "promptTokensDetails": [
    {
      object (ModalityTokenCount)
    }
  ],
  "cacheTokensDetails": [
    {
      object (ModalityTokenCount)
    }
  ],
  "candidatesTokensDetails": [
    {
      object (ModalityTokenCount)
    }
  ],
  "toolUsePromptTokensDetails": [
    {
      object (ModalityTokenCount)
    }
  ]
}
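
Because promptTokenCount already includes the cached content, the uncached portion of the prompt can be derived by subtraction. The sketch below shows that arithmetic on a decoded usageMetadata dict; the function name `summarize_usage`, the output keys, and the sample numbers are hypothetical.

```python
def summarize_usage(usage):
    """Summarize a usageMetadata dict; missing counters are treated as 0."""
    prompt = usage.get("promptTokenCount", 0)
    cached = usage.get("cachedContentTokenCount", 0)
    return {
        "prompt": prompt,
        "cached": cached,
        # promptTokenCount includes cached tokens, so subtract to get the rest.
        "uncached_prompt": prompt - cached,
        "candidates": usage.get("candidatesTokenCount", 0),
        "thoughts": usage.get("thoughtsTokenCount", 0),
        "total": usage.get("totalTokenCount", 0),
    }


# Hypothetical values for illustration only.
sample_usage = {
    "promptTokenCount": 1000,
    "cachedContentTokenCount": 800,
    "candidatesTokenCount": 50,
    "totalTokenCount": 1050,
}

print(summarize_usage(sample_usage)["uncached_prompt"])  # prints 200
```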

Candidate

A response candidate generated from the model.

Fields

content object (Content)

Output only. Generated content returned from the model.

finishReason enum (FinishReason)

Optional. Output only. The reason why the model stopped generating tokens.

If empty, the model has not stopped generating tokens.

safetyRatings[] object (SafetyRating)

List of ratings for the safety of a response candidate.

There is at most one rating per category.

citationMetadata object (CitationMetadata)

Output only. Citation information for the model-generated candidate.

This field may be populated with recitation information for any text included in content. These are passages that are "recited" from copyrighted material in the foundational LLM's training data.

tokenCount integer

Output only. Token count for this candidate.

groundingAttributions[] object (GroundingAttribution)

Output only. Attribution information for sources that contributed to a grounded answer.

This field is populated for GenerateAnswer calls.

groundingMetadata object (GroundingMetadata)

Output only. Grounding metadata for the candidate.

This field is populated for GenerateContent calls.

avgLogprobs number

Output only. Average log probability score of the candidate.

logprobsResult object (LogprobsResult)

Output only. Log-likelihood scores for the response tokens and top tokens.

urlContextMetadata object (UrlContextMetadata)

Output only. Metadata related to the URL context retrieval tool.

index integer

Output only. Index of the candidate in the list of response candidates.

finishMessage string

Optional. Output only. Details the reason why the model stopped generating tokens. This field is populated only when finishReason is set.

JSON representation

{
  "content": {
    object (Content)
  },
  "finishReason": enum (FinishReason),
  "safetyRatings": [
    {
      object (SafetyRating)
    }
  ],
  "citationMetadata": {
    object (CitationMetadata)
  },
  "tokenCount": integer,
  "groundingAttributions": [
    {
      object (GroundingAttribution)
    }
  ],
  "groundingMetadata": {
    object (GroundingMetadata)
  },
  "avgLogprobs": number,
  "logprobsResult": {
    object (LogprobsResult)
  },
  "urlContextMetadata": {
    object (UrlContextMetadata)
  },
  "index": integer,
  "finishMessage": string
}

FinishReason

Defines the reason why the model stopped generating tokens.

Enums
`FINISH_REASON_UNSPECIFIED`	Default value. This value is unused.
`STOP`	Natural stop point of the model or provided stop sequence.
`MAX_TOKENS`	The maximum number of tokens specified in the request was reached.
`SAFETY`	The response candidate content was flagged for safety reasons.
`RECITATION`	The response candidate content was flagged for recitation reasons.
`LANGUAGE`	The response candidate content was flagged for using an unsupported language.
`OTHER`	Unknown reason.
`BLOCKLIST`	Token generation stopped because the content contains forbidden terms.
`PROHIBITED_CONTENT`	Token generation stopped for potentially containing prohibited content.
`SPII`	Token generation stopped because the content potentially contains Sensitive Personally Identifiable Information (SPII).
`MALFORMED_FUNCTION_CALL`	The function call generated by the model is invalid.
`IMAGE_SAFETY`	Token generation stopped because the generated images contain safety violations.
`IMAGE_PROHIBITED_CONTENT`	Image generation stopped because the generated images have other prohibited content.
`IMAGE_OTHER`	Image generation stopped because of another miscellaneous issue.
`NO_IMAGE`	The model was expected to generate an image, but none was generated.
`IMAGE_RECITATION`	Image generation stopped due to recitation.
`UNEXPECTED_TOOL_CALL`	The model generated a tool call, but no tools were enabled in the request.
`TOO_MANY_TOOL_CALLS`	The model called too many tools consecutively, so the system exited execution.
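
One way a client might act on these values is to sort candidates into a few coarse outcomes. The sketch below is only an illustration: the function name `describe_finish` and the outcome labels are hypothetical, and the grouping of reasons is a design choice rather than something this reference prescribes.

```python
def describe_finish(candidate):
    """Map a candidate's finishReason to a coarse, client-side outcome label."""
    reason = candidate.get("finishReason")
    if not reason:
        # An empty finishReason means the model has not stopped generating tokens.
        return "in_progress"
    if reason == "STOP":
        return "complete"
    if reason == "MAX_TOKENS":
        return "truncated"
    # Every other value is surfaced to the caller unchanged.
    return "stopped:" + reason


# Hypothetical candidates.
print(describe_finish({"index": 0}))                              # prints "in_progress"
print(describe_finish({"index": 0, "finishReason": "STOP"}))       # prints "complete"
print(describe_finish({"index": 0, "finishReason": "MAX_TOKENS"})) # prints "truncated"
print(describe_finish({"index": 0, "finishReason": "SAFETY"}))     # prints "stopped:SAFETY"
```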

GroundingAttribution

Attribution for a source that contributed to an answer.

Fields

sourceId object (AttributionSourceId)

Output only. Identifier for the source contributing to this attribution.

content object (Content)

Grounding source content that makes up this attribution.

JSON representation
{ "sourceId": { object (`AttributionSourceId`) }, "content": { object (`Content`) } }

AttributionSourceId

Identifier for the source contributing to this attribution.

Fields

source Union type

source can be only one of the following:

groundingPassage object (GroundingPassageId)

Identifier for an inline passage.

semanticRetrieverChunk object (SemanticRetrieverChunk)

Identifier for a Chunk fetched via Semantic Retriever.

JSON representation
{ // source "groundingPassage": { object (`GroundingPassageId`) }, "semanticRetrieverChunk": { object (`SemanticRetrieverChunk`) } // Union type }

GroundingPassageId

Identifier for a part within a GroundingPassage.

Fields

passageId string

Output only. ID of the passage matching the GenerateAnswerRequest's GroundingPassage.id.

partIndex integer

Output only. Index of the part within the GenerateAnswerRequest's GroundingPassage.content.

JSON representation
{ "passageId": string, "partIndex": integer }

SemanticRetrieverChunk

Identifier for a Chunk retrieved via Semantic Retriever and specified in the GenerateAnswerRequest using SemanticRetrieverConfig.

Fields

source string

Output only. Name of the source matching the request's SemanticRetrieverConfig.source. Example: corpora/123 or corpora/123/documents/abc

chunk string

Output only. Name of the Chunk containing the attributed text. Example: corpora/123/documents/abc/chunks/xyz

JSON representation
{ "source": string, "chunk": string }

GroundingMetadata

Metadata returned to the client when grounding is enabled.

Fields

groundingChunks[] object (GroundingChunk)

List of supporting references retrieved from the specified grounding source.

groundingSupports[] object (GroundingSupport)

List of grounding supports.

webSearchQueries[] string

Web search queries for the follow-up web search.

searchEntryPoint object (SearchEntryPoint)

Optional. Google Search entry point for the follow-up web searches.

retrievalMetadata object (RetrievalMetadata)

Metadata related to retrieval in the grounding flow.

googleMapsWidgetContextToken string

Optional. Resource name of the Google Maps widget context token that can be used with the PlacesContextElement widget to render contextual data. Only populated when grounding with Google Maps is enabled.

JSON representation

{
  "groundingChunks": [
    {
      object (GroundingChunk)
    }
  ],
  "groundingSupports": [
    {
      object (GroundingSupport)
    }
  ],
  "webSearchQueries": [
    string
  ],
  "searchEntryPoint": {
    object (SearchEntryPoint)
  },
  "retrievalMetadata": {
    object (RetrievalMetadata)
  },
  "googleMapsWidgetContextToken": string
}

SearchEntryPoint

Google Search entry point.

Fields

renderedContent string

Optional. Web content snippet that can be embedded in a web page or an app WebView.

sdkBlob string (bytes format)

Optional. Base64-encoded JSON representing an array of <search term, search URL> tuples.

A base64-encoded string.

JSON representation
{ "renderedContent": string, "sdkBlob": string }

GroundingChunk

Grounding chunk.

Fields

chunk_type Union type

Chunk type. chunk_type can be only one of the following:

web object (Web)

Grounding chunk from the web.

retrievedContext object (RetrievedContext)

Optional. Grounding chunk from context retrieved by the file search tool.

maps object (Maps)

Optional. Grounding chunk from Google Maps.

JSON representation
{ // chunk_type "web": { object (`Web`) }, "retrievedContext": { object (`RetrievedContext`) }, "maps": { object (`Maps`) } // Union type }

Web

Chunk from the web.

Fields

uri string

URI reference of the chunk.

title string

Title of the chunk.

JSON representation
{ "uri": string, "title": string }

RetrievedContext

Chunk from context retrieved by the file search tool.

Fields

uri string

Optional. URI reference of the semantic retrieval document.

title string

Optional. Title of the document.

text string

Optional. Text of the chunk.

JSON representation
{ "uri": string, "title": string, "text": string, "fileSearchStore": string }

Maps

A grounding chunk from Google Maps. A Maps chunk corresponds to a single place.

Fields

uri string

URI reference of the place.

title string

Title of the place.

text string

Text description of the place answer.

placeId string

The ID of the place, in places/{placeId} format. A user can use this ID to look up that place.

placeAnswerSources object (PlaceAnswerSources)

Sources that provide answers about the features of a given place in Google Maps.

JSON representation
{ "uri": string, "title": string, "text": string, "placeId": string, "placeAnswerSources": { object (`PlaceAnswerSources`) } }
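
Since a GroundingChunk carries exactly one of `web`, `retrievedContext`, or `maps`, and all three expose `uri` and `title`, a client can normalize them with one small dispatch. This is a sketch on decoded dicts; the function name `chunk_source` and the sample chunks are hypothetical.

```python
def chunk_source(chunk):
    """Return (kind, uri, title) for whichever union member the chunk carries."""
    for kind in ("web", "retrievedContext", "maps"):
        if kind in chunk:
            body = chunk[kind]
            return kind, body.get("uri"), body.get("title")
    # No known union member is present.
    return None, None, None


# Hypothetical grounding chunks.
chunks = [
    {"web": {"uri": "https://example.com/a", "title": "Example A"}},
    {"maps": {"uri": "https://example.com/place", "title": "Foo Bar", "placeId": "places/123"}},
]

for chunk in chunks:
    print(chunk_source(chunk))
```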

PlaceAnswerSources

Collection of sources that provide answers about the features of a given place in Google Maps. Each PlaceAnswerSources message corresponds to a specific place in Google Maps. The Google Maps tool used these sources to answer questions about features of the place (for example, "Does Bar Foo have Wi-Fi?" or "Is Foo Bar wheelchair accessible?"). Currently, only review snippets are supported as sources.

Fields

reviewSnippets[] object (ReviewSnippet)

Snippets of reviews that are used to generate answers about the features of a given place in Google Maps.

JSON representation
{ "reviewSnippets": [ { object (`ReviewSnippet`) } ] }

ReviewSnippet

Encapsulates a snippet of a user review that answers a question about the features of a specific place in Google Maps.

Fields

reviewId string

The ID of the review snippet.

googleMapsUri string

A link that corresponds to the user review on Google Maps.

title string

Title of the review.

JSON representation
{ "reviewId": string, "googleMapsUri": string, "title": string }

GroundingSupport

Grounding support.

Fields

groundingChunkIndices[] integer

A list of indices (into "grounding_chunk") specifying the citations associated with the claim. For instance, [1, 3, 4] means that grounding_chunk[1], grounding_chunk[3], and grounding_chunk[4] are the retrieved content attributed to the claim.

confidenceScores[] number

Confidence scores of the support references. Each ranges from 0 to 1, where 1 is the most confident. This list must have the same size as groundingChunkIndices.

segment object (Segment)

Segment of the content this support belongs to.

JSON representation
{ "groundingChunkIndices": [ integer ], "confidenceScores": [ number ], "segment": { object (`Segment`) } }

Segment

Segment of the content.

Fields

partIndex integer

Output only. The index of a Part object within its parent Content object.

startIndex integer

Output only. Start index in the given Part, measured in bytes. Offset from the start of the Part, inclusive, starting at zero.

endIndex integer

Output only. End index in the given Part, measured in bytes. Offset from the start of the Part, exclusive, starting at zero.

text string

Output only. The text corresponding to the segment from the response.

JSON representation
{ "partIndex": integer, "startIndex": integer, "endIndex": integer, "text": string }
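
Because startIndex and endIndex are byte offsets, slicing a string by character position can return the wrong span when the text contains multi-byte characters. The sketch below slices the encoded bytes instead. It assumes UTF-8, which this reference does not state; the function name `segment_text` and the sample text are hypothetical.

```python
def segment_text(part_text, segment):
    """Extract a segment from a Part's text using byte offsets."""
    # Assumption: offsets refer to the UTF-8 encoding of the text.
    data = part_text.encode("utf-8")
    # startIndex is inclusive and endIndex is exclusive, like a Python slice.
    return data[segment["startIndex"]:segment["endIndex"]].decode("utf-8")


# Hypothetical text: the accented letter takes 2 bytes and the euro sign 3 bytes.
text = "Caf\u00e9 costs \u20ac3. Tea is cheaper."
segment = {"partIndex": 0, "startIndex": 18, "endIndex": 33}

print(segment_text(text, segment))  # prints "Tea is cheaper."
print(text[18:33])                  # character slicing gives " is cheaper."
```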

RetrievalMetadata

Metadata related to retrieval in the grounding flow.

Fields

googleSearchDynamicRetrievalScore number

Optional. Score indicating how likely it is that information from Google Search could help answer the prompt. The score is in the range [0, 1], where 0 is the least likely and 1 is the most likely. This score is only populated when Google Search grounding and dynamic retrieval are enabled. It is compared to the threshold to determine whether to trigger Google Search.

JSON representation
{ "googleSearchDynamicRetrievalScore": number }
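The comparison described above can be sketched as follows. The direction of the comparison (search triggers when the score is greater than or equal to the threshold) is an assumption, as is the helper name `should_trigger_search`; the field description only states that the score is compared to the threshold.

```python
def should_trigger_search(score: float, threshold: float) -> bool:
    """Decide whether a dynamic retrieval score would trigger Google Search."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in the range [0, 1]")
    # Assumption: a score at or above the threshold triggers search.
    return score >= threshold


retrieval_metadata = {"googleSearchDynamicRetrievalScore": 0.82}
score = retrieval_metadata["googleSearchDynamicRetrievalScore"]
print(should_trigger_search(score, threshold=0.7))  # True
print(should_trigger_search(score, threshold=0.9))  # False
```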

LogprobsResult

The logprobs result.

Fields

topCandidates[] object (TopCandidates)

The length equals the total number of decoding steps.

chosenCandidates[] object (Candidate)

The length equals the total number of decoding steps. The chosen candidates may or may not be in topCandidates.

logProbabilitySum number

The sum of the log probabilities of all tokens.

JSON representation
{ "topCandidates": [ { object (`TopCandidates`) } ], "chosenCandidates": [ { object (`Candidate`) } ], "logProbabilitySum": number }

TopCandidates

The candidates with the top log probabilities at each decoding step.

Fields

candidates[] object (Candidate)

Sorted by log probability in descending order.

JSON representation
{ "candidates": [ { object (`Candidate`) } ] }

Candidate

A candidate for the logprobs token and score.

Fields

token string

The candidate's token string value.

tokenId integer

The candidate's token ID value.

logProbability number

The candidate's log probability.

JSON representation
{ "token": string, "tokenId": integer, "logProbability": number }
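As an illustration of how these fields relate, the sketch below sums the per-token log probabilities of the chosen candidates and converts the total into a sequence probability and a perplexity. The sample values are made up, and treating logProbabilitySum as the sum over chosenCandidates is an assumption.

```python
import math

# Hypothetical LogprobsResult payload with two decoding steps.
logprobs_result = {
    "chosenCandidates": [
        {"token": "Hello", "tokenId": 101, "logProbability": -0.1},
        {"token": " world", "tokenId": 202, "logProbability": -0.4},
    ],
    "logProbabilitySum": -0.5,
}


def sequence_log_probability(result: dict) -> float:
    """Sum the log probabilities of the tokens chosen at each decoding step."""
    return sum(c["logProbability"] for c in result["chosenCandidates"])


total = sequence_log_probability(logprobs_result)
steps = len(logprobs_result["chosenCandidates"])
probability = math.exp(total)          # probability of the whole sequence
perplexity = math.exp(-total / steps)  # per-token perplexity
print(round(total, 3), round(probability, 3), round(perplexity, 3))
```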

UrlContextMetadata

Metadata related to the URL context retrieval tool.

Fields

urlMetadata[] object (UrlMetadata)

The list of URL contexts.

JSON representation
{ "urlMetadata": [ { object (`UrlMetadata`) } ] }

UrlMetadata

The context of a single URL retrieval.

Fields

retrievedUrl string

The URL retrieved by the tool.

urlRetrievalStatus enum (UrlRetrievalStatus)

The status of the URL retrieval.

JSON representation
{ "retrievedUrl": string, "urlRetrievalStatus": enum (`UrlRetrievalStatus`) }

UrlRetrievalStatus

The status of the URL retrieval.

Enums
`URL_RETRIEVAL_STATUS_UNSPECIFIED`	Default value. This value is unused.
`URL_RETRIEVAL_STATUS_SUCCESS`	The URL was retrieved successfully.
`URL_RETRIEVAL_STATUS_ERROR`	The URL could not be retrieved because of an error.
`URL_RETRIEVAL_STATUS_PAYWALL`	The URL could not be retrieved because the content is behind a paywall.
`URL_RETRIEVAL_STATUS_UNSAFE`	The URL could not be retrieved because the content is unsafe.
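A caller may want to know which URLs were not retrieved and why. The sketch below filters a UrlContextMetadata payload by status; the payload values and the helper name `failed_urls` are hypothetical.

```python
def failed_urls(url_context_metadata: dict) -> list[tuple[str, str]]:
    """Return (url, status) pairs for every URL that was not retrieved successfully."""
    return [
        (entry["retrievedUrl"], entry["urlRetrievalStatus"])
        for entry in url_context_metadata.get("urlMetadata", [])
        if entry["urlRetrievalStatus"] != "URL_RETRIEVAL_STATUS_SUCCESS"
    ]


# Hypothetical payload for illustration only.
url_context_metadata = {
    "urlMetadata": [
        {"retrievedUrl": "https://example.com/open",
         "urlRetrievalStatus": "URL_RETRIEVAL_STATUS_SUCCESS"},
        {"retrievedUrl": "https://example.com/paywalled",
         "urlRetrievalStatus": "URL_RETRIEVAL_STATUS_PAYWALL"},
    ]
}
for url, status in failed_urls(url_context_metadata):
    print(f"{url}: {status}")
```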

CitationMetadata

A collection of source attributions for a piece of content.

Fields

citationSources[] object (CitationSource)

Citations to sources for a specific response.

JSON representation
{ "citationSources": [ { object (`CitationSource`) } ] }

CitationSource

A citation to a source for a portion of a specific response.

Fields

startIndex integer

Optional. The start of the segment of the response that is attributed to this source.

The index indicates the start of the segment, measured in bytes.

endIndex integer

Optional. The end of the attributed segment, exclusive.

uri string

Optional. The URI that is attributed as a source for a portion of the text.

license string

Optional. The license of the GitHub project that is attributed as a source for the segment.

License information is required for code citations.

JSON representation
{ "startIndex": integer, "endIndex": integer, "uri": string, "license": string }

GenerationConfig

Configuration options for model generation and outputs. Not all parameters are configurable for every model.

Fields

stopSequences[] string

Optional. The set of character sequences (up to 5) that will stop output generation. If specified, the API stops at the first appearance of a stop_sequence. The stop sequence is not included as part of the response.

responseMimeType string

Optional. The MIME type of the generated candidate text. Supported MIME types are:

text/plain: (default) Text output.
application/json: JSON response in the response candidates.
text/x.enum: ENUM as a string response in the response candidates.

Refer to the docs for a list of all supported text MIME types.

responseSchema object (Schema)

Optional. The output schema of the generated candidate text. Schemas must be a subset of the OpenAPI schema and can be objects, primitives, or arrays.

If set, a compatible responseMimeType must also be set. Compatible MIME types: application/json: Schema for the JSON response. Refer to the JSON text generation guide for more details.

_responseJsonSchema value (Value format)

Optional. The output schema of the generated response. This is an alternative to responseSchema that accepts JSON Schema.

If set, responseSchema must be omitted, but responseMimeType is required.

While the full JSON Schema may be sent, not all features are supported. Specifically, only the following properties are supported:

$id
$defs
$ref
$anchor
type
format
title
description
enum (for strings and numbers)
items
prefixItems
minItems
maxItems
minimum
maximum
anyOf
oneOf (interpreted the same as anyOf)
properties
additionalProperties
required

The non-standard propertyOrdering property may also be set.

Cyclic references are unrolled to a limited degree and, as such, may only be used within non-required properties. (Nullable properties are not sufficient.) If $ref is set on a sub-schema, no other properties may be set, except those starting with $.
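Since only a subset of JSON Schema keywords is accepted, it can help to scan a schema before sending it. The following sketch walks a schema and reports keywords outside the list above. The helper name `unsupported_keywords` and the sample schema are hypothetical, and the check is purely local: it does not reproduce any validation the API itself performs.

```python
SUPPORTED_KEYWORDS = {
    "$id", "$defs", "$ref", "$anchor", "type", "format", "title",
    "description", "enum", "items", "prefixItems", "minItems", "maxItems",
    "minimum", "maximum", "anyOf", "oneOf", "properties",
    "additionalProperties", "required", "propertyOrdering",
}


def unsupported_keywords(schema, path="$"):
    """Return (path, keyword) pairs for keywords outside the supported list."""
    found = []
    if not isinstance(schema, dict):  # e.g. additionalProperties: false
        return found
    for key, value in schema.items():
        if key not in SUPPORTED_KEYWORDS:
            found.append((path, key))
        elif key in ("properties", "$defs"):
            for name, sub in value.items():
                found += unsupported_keywords(sub, f"{path}.{key}.{name}")
        elif key in ("items", "additionalProperties"):
            found += unsupported_keywords(value, f"{path}.{key}")
        elif key in ("anyOf", "oneOf", "prefixItems"):
            for i, sub in enumerate(value):
                found += unsupported_keywords(sub, f"{path}.{key}[{i}]")
    return found


schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "pattern": "^[A-Z]"},
        "tags": {"type": "array", "items": {"type": "string"}, "maxItems": 5},
    },
    "required": ["name"],
}
print(unsupported_keywords(schema))  # [('$.properties.name', 'pattern')]
```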

responseJsonSchema value (Value format)

Optional. An internal detail. Use responseJsonSchema rather than this field.

responseModalities[] enum (Modality)

Optional. The requested modalities of the response. Represents the set of modalities that the model can return and that should be expected in the response. This is an exact match to the modalities of the response.

A model may have multiple combinations of supported modalities. If the requested modalities do not match any of the supported combinations, an error is returned.

An empty list is equivalent to requesting only text.

candidateCount integer

Optional. The number of generated responses to return. If unset, this defaults to 1. Note that this does not work for previous-generation models (Gemini 1.0 family).

maxOutputTokens integer

Optional. The maximum number of tokens to include in a response candidate.

Note: The default value varies by model. See the Model.output_token_limit attribute of the Model returned from the getModel function.

temperature number

Optional. Controls the randomness of the output.

Note: The default value varies by model. See the Model.temperature attribute of the Model returned from the getModel function.

Values can range over [0.0, 2.0].

topP number

Optional. The maximum cumulative probability of tokens to consider when sampling.

The model uses combined Top-k and Top-p (nucleus) sampling.

Tokens are sorted by their assigned probabilities so that only the most likely tokens are considered. Top-k sampling directly limits the maximum number of tokens to consider, while nucleus sampling limits the number of tokens based on the cumulative probability.

Note: The default value varies by Model and is specified by the Model.top_p attribute returned from the getModel function. An empty topK attribute indicates that the model does not apply top-k sampling and does not allow setting topK on requests.

topK integer

Optional. The maximum number of tokens to consider when sampling.

Gemini models use Top-p (nucleus) sampling or a combination of Top-k and nucleus sampling. Top-k sampling considers the set of the topK most probable tokens. Models running with nucleus sampling do not allow the topK setting.
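To make the interaction between topK and topP concrete, here is a sketch of the general combined filtering technique applied to a toy distribution. It illustrates the textbook approach and is not the model's actual sampling code; the function name and the sample probabilities are hypothetical.

```python
def filter_top_k_top_p(probs: dict[str, float], top_k: int, top_p: float) -> dict[str, float]:
    """Keep the top_k most likely tokens, then the smallest prefix whose
    cumulative probability reaches top_p, and renormalize what remains."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    ranked = ranked[:top_k]  # top-k: cap the number of candidate tokens

    kept = []
    cumulative = 0.0
    for token, p in ranked:  # top-p: stop once enough probability mass is covered
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


next_token_probs = {"the": 0.5, "a": 0.3, "an": 0.15, "this": 0.05}
filtered = filter_top_k_top_p(next_token_probs, top_k=3, top_p=0.7)
print(filtered)  # only "the" and "a" remain, renormalized to sum to 1
```

With these inputs, top-k first drops "this", and top-p then drops "an" because "the" and "a" already cover 0.8 of the probability mass.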

seed integer

Optional. The seed used in decoding. If not set, the request uses a randomly generated seed.

presencePenalty number

Optional. The presence penalty applied to the next token's logprobs if the token has already been seen in the response.

This penalty is binary (on or off) and does not depend on the number of times the token is used (after the first). Use frequencyPenalty for a penalty that increases with each use.

A positive penalty discourages the use of tokens that have already been used in the response, increasing the vocabulary.

A negative penalty encourages the use of tokens that have already been used in the response, decreasing the vocabulary.

frequencyPenalty number

Optional. The frequency penalty applied to the next token's logprobs, multiplied by the number of times each token has been seen in the response so far.

A positive penalty discourages the use of tokens that have already been used, in proportion to the number of times the token has been used: the more a token is used, the harder it is for the model to use that token again, which increases the vocabulary of responses.

Caution: A negative penalty encourages the model to reuse tokens in proportion to the number of times the token has been used. Small negative values reduce the vocabulary of a response. Larger negative values cause the model to start repeating a common token until it hits the maxOutputTokens limit.
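The difference between the two penalties can be shown with a toy calculation. The subtractive formula below is a common convention for these penalties and is an assumption here; the descriptions above do not give the exact formula the API uses. The function name and sample scores are hypothetical.

```python
from collections import Counter


def apply_penalties(scores: dict[str, float], generated: list[str],
                    presence_penalty: float, frequency_penalty: float) -> dict[str, float]:
    """Lower the score of each token according to how it was used so far."""
    counts = Counter(generated)
    adjusted = {}
    for token, score in scores.items():
        n = counts[token]
        # Presence penalty: applied once if the token appeared at all.
        # Frequency penalty: scaled by the number of appearances.
        penalty = presence_penalty * (1 if n > 0 else 0) + frequency_penalty * n
        adjusted[token] = score - penalty
    return adjusted


scores = {"cat": 2.0, "dog": 1.0, "fish": 0.5}
adjusted = apply_penalties(scores, ["cat", "cat", "dog"],
                           presence_penalty=0.5, frequency_penalty=0.25)
print(adjusted)  # {'cat': 1.0, 'dog': 0.25, 'fish': 0.5}
```

"cat" was used twice, so it loses 0.5 + 2 × 0.25; "dog" was used once and loses 0.5 + 0.25; "fish" is untouched.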

responseLogprobs boolean

Optional. If true, the logprobs results are exported in the response.

logprobs integer

Optional. Only valid if responseLogprobs=True. This sets the number of top logprobs to return at each decoding step in Candidate.logprobs_result. The number must be in the range [0, 20].

enableEnhancedCivicAnswers boolean

Optional. Enables enhanced civic answers. It may not be available for all models.

speechConfig object (SpeechConfig)

Optional. The speech generation configuration.

thinkingConfig object (ThinkingConfig)

Optional. The configuration for thinking features. An error is returned if this field is set for models that do not support thinking.

imageConfig object (ImageConfig)

Optional. The configuration for image generation. An error is returned if this field is set for models that do not support these configuration options.

mediaResolution enum (MediaResolution)

Optional. If specified, the given media resolution is used.

JSON representation
{ "stopSequences": [ string ], "responseMimeType": string, "responseSchema": { object (`Schema`) }, "_responseJsonSchema": value, "responseJsonSchema": value, "responseModalities": [ enum (`Modality`) ], "candidateCount": integer, "maxOutputTokens": integer, "temperature": number, "topP": number, "topK": integer, "seed": integer, "presencePenalty": number, "frequencyPenalty": number, "responseLogprobs": boolean, "logprobs": integer, "enableEnhancedCivicAnswers": boolean, "speechConfig": { object (`SpeechConfig`) }, "thinkingConfig": { object (`ThinkingConfig`) }, "imageConfig": { object (`ImageConfig`) }, "mediaResolution": enum (`MediaResolution`) }
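As a usage sketch, the snippet below assembles a GenerationConfig as a plain dictionary, checks it against the limits stated in the field descriptions (at most 5 stop sequences, temperature in [0.0, 2.0], logprobs in [0, 20] and only with responseLogprobs), and embeds it in a request body. The helper name `validate_generation_config`, the prompt, and the parameter values are hypothetical, and the checks are local conveniences rather than the API's own validation.

```python
import json


def validate_generation_config(config: dict) -> list[str]:
    """Check a GenerationConfig dict against the documented limits."""
    errors = []
    if len(config.get("stopSequences", [])) > 5:
        errors.append("stopSequences accepts at most 5 sequences")
    temperature = config.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        errors.append("temperature must be in [0.0, 2.0]")
    if "logprobs" in config:
        if not config.get("responseLogprobs"):
            errors.append("logprobs is only valid when responseLogprobs is true")
        if not 0 <= config["logprobs"] <= 20:
            errors.append("logprobs must be in [0, 20]")
    return errors


generation_config = {
    "stopSequences": ["END"],
    "responseMimeType": "text/plain",
    "candidateCount": 1,
    "maxOutputTokens": 256,
    "temperature": 0.7,
    "topP": 0.95,
    "seed": 42,
}
assert validate_generation_config(generation_config) == []

# The config travels in the generateContent request body next to the contents.
request_body = {
    "contents": [{"parts": [{"text": "Write a haiku about rivers."}]}],
    "generationConfig": generation_config,
}
print(json.dumps(request_body, indent=2))
```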

Modality

The supported modalities of the response.

Enums
`MODALITY_UNSPECIFIED`	Default value.
`TEXT`	Indicates that the model should return text.
`IMAGE`	Indicates that the model should return images.
`AUDIO`	Indicates that the model should return audio.

SpeechConfig

The speech generation configuration.

Fields

voiceConfig object (VoiceConfig)

The configuration for single-voice output.

multiSpeakerVoiceConfig object (MultiSpeakerVoiceConfig)

Optional. The configuration for the multi-speaker setup. It is mutually exclusive with the voiceConfig field.

languageCode string

Optional. The language code (in BCP 47 format, e.g. "en-US") for speech synthesis.

Valid values are: de-DE, en-AU, en-GB, en-IN, en-US, es-US, fr-FR, hi-IN, pt-BR, ar-XA, es-ES, fr-CA, id-ID, it-IT, ja-JP, tr-TR, vi-VN, bn-IN, gu-IN, kn-IN, ml-IN, mr-IN, ta-IN, te-IN, nl-NL, ko-KR, cmn-CN, pl-PL, ru-RU, and th-TH.

JSON representation
{ "voiceConfig": { object (`VoiceConfig`) }, "multiSpeakerVoiceConfig": { object (`MultiSpeakerVoiceConfig`) }, "languageCode": string }

VoiceConfig

The configuration for the voice to use.

Fields

voice_config Union type

The configuration for the speaker to use. voice_config can be only one of the following:

prebuiltVoiceConfig object (PrebuiltVoiceConfig)

The configuration for the prebuilt voice to use.

JSON representation
{
  // voice_config
  "prebuiltVoiceConfig": {
    object (`PrebuiltVoiceConfig`)
  }
  // Union type
}

PrebuiltVoiceConfig

The configuration for the prebuilt speaker to use.

Fields

voiceName string

The name of the preset voice to use.

JSON representation
{ "voiceName": string }

MultiSpeakerVoiceConfig

The configuration for the multi-speaker setup.

Fields

speakerVoiceConfigs[] object (SpeakerVoiceConfig)

Required. All the enabled speaker voices.

JSON representation
{ "speakerVoiceConfigs": [ { object (`SpeakerVoiceConfig`) } ] }

SpeakerVoiceConfig

The configuration for a single speaker in a multi-speaker setup.

Fields

speaker string

Required. The name of the speaker to use. It should be the same as in the prompt.

voiceConfig object (VoiceConfig)

Required. The configuration for the voice to use.

JSON representation
{ "speaker": string, "voiceConfig": { object (`VoiceConfig`) } }
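The nesting of SpeechConfig, MultiSpeakerVoiceConfig, SpeakerVoiceConfig, VoiceConfig, and PrebuiltVoiceConfig is easier to see in a concrete payload. The sketch below builds one from a speaker-to-voice mapping. The helper name, the speaker names, and the voice names "Kore" and "Puck" are placeholders used for illustration; check the list of available voices before relying on them.

```python
from typing import Optional


def multi_speaker_speech_config(voices: dict[str, str],
                                language_code: Optional[str] = None) -> dict:
    """Build a SpeechConfig dict with one SpeakerVoiceConfig per speaker."""
    config = {
        # multiSpeakerVoiceConfig is mutually exclusive with voiceConfig,
        # so the top-level voiceConfig field is deliberately left out.
        "multiSpeakerVoiceConfig": {
            "speakerVoiceConfigs": [
                {
                    "speaker": speaker,  # should match the name used in the prompt
                    "voiceConfig": {"prebuiltVoiceConfig": {"voiceName": voice}},
                }
                for speaker, voice in voices.items()
            ]
        }
    }
    if language_code is not None:
        config["languageCode"] = language_code
    return config


speech_config = multi_speaker_speech_config({"Joe": "Kore", "Jane": "Puck"}, "en-US")
print(speech_config["multiSpeakerVoiceConfig"]["speakerVoiceConfigs"][0])
```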

ThinkingConfig

The configuration for thinking features.

Fields

includeThoughts boolean

Indicates whether to include thoughts in the response. If true, thoughts are returned only when available.

thinkingBudget integer

The number of thought tokens that the model should generate.

JSON representation
{ "includeThoughts": boolean, "thinkingBudget": integer }

ImageConfig

The configuration for image generation features.

Fields

aspectRatio string

Optional. The aspect ratio of the image to generate. Supported aspect ratios: 1:1, 2:3, 3:2, 3:4, 4:3, 9:16, 16:9, and 21:9.

If not specified, the model chooses a default aspect ratio based on any reference images provided.

JSON representation
{ "aspectRatio": string }

MediaResolution

The media resolution for the input media.

Enums
`MEDIA_RESOLUTION_UNSPECIFIED`	Media resolution has not been set.
`MEDIA_RESOLUTION_LOW`	Media resolution set to low (64 tokens).
`MEDIA_RESOLUTION_MEDIUM`	Media resolution set to medium (256 tokens).
`MEDIA_RESOLUTION_HIGH`	Media resolution set to high (zoomed reframing with 256 tokens).

HarmCategory

The category of a rating.

These categories cover various kinds of harms that developers may wish to adjust.

Enums
`HARM_CATEGORY_UNSPECIFIED`	The category is unspecified.
`HARM_CATEGORY_DEROGATORY`	PaLM: Negative or harmful comments targeting identity or protected attributes.
`HARM_CATEGORY_TOXICITY`	PaLM: Content that is rude, disrespectful, or profane.
`HARM_CATEGORY_VIOLENCE`	PaLM: Describes scenarios depicting violence against an individual or group, or general descriptions of gore.
`HARM_CATEGORY_SEXUAL`	PaLM: Contains references to sexual acts or other lewd content.
`HARM_CATEGORY_MEDICAL`	PaLM: Promotes unchecked medical advice.
`HARM_CATEGORY_DANGEROUS`	PaLM: Dangerous content that promotes, facilitates, or encourages harmful acts.
`HARM_CATEGORY_HARASSMENT`	Gemini: Harassment content.
`HARM_CATEGORY_HATE_SPEECH`	Gemini: Hate speech and content.
`HARM_CATEGORY_SEXUALLY_EXPLICIT`	Gemini: Sexually explicit content.
`HARM_CATEGORY_DANGEROUS_CONTENT`	Gemini: Dangerous content.
`HARM_CATEGORY_CIVIC_INTEGRITY`	Gemini: Content that may be used to harm civic integrity. DEPRECATED: Use enableEnhancedCivicAnswers instead.

ModalityTokenCount

Represents token counting information for a single modality.

Fields

modality enum (Modality)

The modality associated with this token count.

tokenCount integer

The number of tokens.

JSON representation
{ "modality": enum (`Modality`), "tokenCount": integer }

Modality

The modality of a content Part.

Enums
`MODALITY_UNSPECIFIED`	Unspecified modality.
`TEXT`	Plain text.
`IMAGE`	Image.
`VIDEO`	Video.
`AUDIO`	Audio.
`DOCUMENT`	Document, e.g. PDF.
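A list of ModalityTokenCount objects can be reduced to per-modality totals, for example to see how much of a prompt's token usage came from images versus text. This is a small sketch over a made-up payload; the helper name `tokens_by_modality` is hypothetical.

```python
def tokens_by_modality(details: list[dict]) -> dict[str, int]:
    """Sum tokenCount per modality across a list of ModalityTokenCount dicts."""
    totals: dict[str, int] = {}
    for entry in details:
        modality = entry["modality"]
        totals[modality] = totals.get(modality, 0) + entry["tokenCount"]
    return totals


# Hypothetical token details for illustration only.
token_details = [
    {"modality": "TEXT", "tokenCount": 120},
    {"modality": "IMAGE", "tokenCount": 258},
    {"modality": "TEXT", "tokenCount": 30},
]
totals = tokens_by_modality(token_details)
print(totals)                # {'TEXT': 150, 'IMAGE': 258}
print(sum(totals.values()))  # 408
```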

SafetyRating

The safety rating for a piece of content.

The safety rating contains the category of harm and the harm probability level in that category for a piece of content. Content is classified for safety across a number of harm categories, and the probability of the harm classification is included here.

Fields

category enum (HarmCategory)

Required. The category for this rating.

probability enum (HarmProbability)

Required. The probability of harm for this content.

blocked boolean

Whether this content was blocked because of this rating.

JSON representation
{ "category": enum (`HarmCategory`), "probability": enum (`HarmProbability`), "blocked": boolean }

HarmProbability

The probability that a piece of content is harmful.

The classification system gives the probability of the content being unsafe. This does not indicate the severity of harm for a piece of content.

Enums
`HARM_PROBABILITY_UNSPECIFIED`	The probability is unspecified.
`NEGLIGIBLE`	The content has a negligible chance of being unsafe.
`LOW`	The content has a low chance of being unsafe.
`MEDIUM`	The content has a medium chance of being unsafe.
`HIGH`	The content has a high chance of being unsafe.

SafetySetting

A safety setting, which affects the safety-blocking behavior.

Passing a safety setting for a category changes the allowed probability that content is blocked.

Fields

category enum (HarmCategory)

Required. The category for this setting.

threshold enum (HarmBlockThreshold)

Required. Controls the probability threshold at which harm is blocked.

JSON representation
{ "category": enum (`HarmCategory`), "threshold": enum (`HarmBlockThreshold`) }

HarmBlockThreshold

Blocks content at and beyond a specified harm probability.

Enums
`HARM_BLOCK_THRESHOLD_UNSPECIFIED`	The threshold is unspecified.
`BLOCK_LOW_AND_ABOVE`	Content with NEGLIGIBLE will be allowed.
`BLOCK_MEDIUM_AND_ABOVE`	Content with NEGLIGIBLE and LOW will be allowed.
`BLOCK_ONLY_HIGH`	Content with NEGLIGIBLE, LOW, and MEDIUM will be allowed.
`BLOCK_NONE`	All content will be allowed.
`OFF`	Turns off the safety filter.
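The relationship between a HarmProbability value and a HarmBlockThreshold can be expressed as a small lookup. The sketch below encodes the table above; the function name `would_block` is hypothetical, and the behavior for the unspecified enum values is left undefined here (the function raises an error) because the table does not state it.

```python
PROBABILITY_ORDER = ["NEGLIGIBLE", "LOW", "MEDIUM", "HIGH"]

# Lowest probability level that each threshold blocks, per the enum table.
MIN_BLOCKED = {
    "BLOCK_LOW_AND_ABOVE": "LOW",
    "BLOCK_MEDIUM_AND_ABOVE": "MEDIUM",
    "BLOCK_ONLY_HIGH": "HIGH",
}


def would_block(probability: str, threshold: str) -> bool:
    """Return True if content rated `probability` is blocked under `threshold`."""
    if threshold in ("BLOCK_NONE", "OFF"):
        return False
    if threshold not in MIN_BLOCKED or probability not in PROBABILITY_ORDER:
        raise ValueError("unspecified or unknown enum value")
    return (PROBABILITY_ORDER.index(probability)
            >= PROBABILITY_ORDER.index(MIN_BLOCKED[threshold]))


print(would_block("LOW", "BLOCK_LOW_AND_ABOVE"))     # True
print(would_block("LOW", "BLOCK_MEDIUM_AND_ABOVE"))  # False
print(would_block("MEDIUM", "BLOCK_ONLY_HIGH"))      # False
print(would_block("HIGH", "BLOCK_NONE"))             # False
```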