Embeddings

Method: models.embedContent

Generates an embedding from the model given an input Content.

Endpoint

POST https://generativelanguage.googleapis.com/v1beta/{model=models/*}:embedContent

Path parameters

model string

Required. The model's resource name. This serves as an ID for the Model to use.

This name should match a model name returned by the models.list method.

Format: models/{model}
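
As a minimal sketch of how the path parameter fits into the endpoint above, the helper below normalizes a model ID into the models/{model} form and builds the request URL. The names BASE_URL, model_resource_name, and embed_content_url are hypothetical and are not part of any SDK.

```python
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def model_resource_name(model: str) -> str:
    """Return the model name in the models/{model} form."""
    return model if model.startswith("models/") else f"models/{model}"


def embed_content_url(model: str) -> str:
    """Build the embedContent endpoint URL for the given model."""
    return f"{BASE_URL}/{model_resource_name(model)}:embedContent"


print(embed_content_url("text-embedding-004"))
```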

Request body

The request body contains data with the following structure:

Fields
content object (Content)

Required. The content to embed. Only the parts.text fields will be counted.

taskType enum (TaskType)

Optional. The task type for which the embeddings will be used. Can only be set for models/embedding-001.

title string

Optional. A title for the text. Only applicable when TaskType is RETRIEVAL_DOCUMENT.

Note: Specifying a title for RETRIEVAL_DOCUMENT provides better quality embeddings for retrieval.

outputDimensionality integer

Optional. Reduced dimension for the output embedding. If set, excess values in the output embedding are truncated from the end. Supported by newer models since 2024; this value cannot be set for the earlier model (models/embedding-001).
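
The fields above can be assembled into a JSON request body roughly as follows. This is a sketch only: build_embed_request is a hypothetical helper, and the client-side check on title is an assumption drawn from the field description, not documented server behavior.

```python
import json
from typing import Optional


def build_embed_request(
    text: str,
    task_type: Optional[str] = None,
    title: Optional[str] = None,
    output_dimensionality: Optional[int] = None,
) -> dict:
    """Assemble an embedContent request body from the documented fields."""
    # Assumption: reject a title early, since it only applies to RETRIEVAL_DOCUMENT.
    if title is not None and task_type != "RETRIEVAL_DOCUMENT":
        raise ValueError("title only applies when taskType is RETRIEVAL_DOCUMENT")
    body: dict = {"content": {"parts": [{"text": text}]}}
    # Optional fields are included only when the caller sets them.
    if task_type is not None:
        body["taskType"] = task_type
    if title is not None:
        body["title"] = title
    if output_dimensionality is not None:
        body["outputDimensionality"] = output_dimensionality
    return body


print(json.dumps(build_embed_request("Hello World!", output_dimensionality=10), indent=2))
```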

Example request

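Python

import google.generativeai as genai

# Assumes the API key is already configured, e.g. with genai.configure(api_key=...).
text = "Hello World!"
result = genai.embed_content(
    model="models/text-embedding-004", content=text, output_dimensionality=10
)
print(result["embedding"])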

Node.js

// Make sure to include these imports:
// import { GoogleGenerativeAI } from "@google/generative-ai";
const genAI = new GoogleGenerativeAI(process.env.API_KEY);
const model = genAI.getGenerativeModel({
  model: "text-embedding-004",
});

const result = await model.embedContent("Hello world!");

console.log(result.embedding);

Response body

The response to an EmbedContentRequest.

If successful, the response body contains data with the following structure:

Fields
embedding object (ContentEmbedding)

Output only. The embedding generated from the input content.

JSON representation
{
  "embedding": {
    object (ContentEmbedding)
  }
}
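
When calling the REST endpoint directly, the embedding values can be read from the response body along these lines. The helper name embedding_values and the sample payload are made up for illustration; real embeddings contain many more values.

```python
import json


def embedding_values(response_body: str) -> list:
    """Return the list of floats from an embedContent response body."""
    payload = json.loads(response_body)
    return payload["embedding"]["values"]


# Made-up response for illustration only.
sample = '{"embedding": {"values": [0.013, -0.008, 0.042]}}'
print(embedding_values(sample))
```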

Method: models.batchEmbedContents

Generates multiple embeddings from the model, one per input text, in a single synchronous call.

Endpoint

POST https://generativelanguage.googleapis.com/v1beta/{model=models/*}:batchEmbedContents

Path parameters

model string

Required. The model's resource name. This serves as an ID for the Model to use.

This name should match a model name returned by the models.list method.

Format: models/{model}

Request body

The request body contains data with the following structure:

Fields
requests[] object (EmbedContentRequest)

Required. Embed requests for the batch. The model in each of these requests must match the model specified in BatchEmbedContentsRequest.model.
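
One way to satisfy the matching-model requirement is to build every entry in requests[] from the same model name. The sketch below assumes a hypothetical helper, build_batch_request, and produces the JSON body for a direct REST call.

```python
import json


def build_batch_request(model: str, texts: list) -> dict:
    """Build a batchEmbedContents body with one EmbedContentRequest per text."""
    return {
        "requests": [
            # Every request repeats the same model so it matches the path parameter.
            {"model": model, "content": {"parts": [{"text": text}]}}
            for text in texts
        ]
    }


batch = build_batch_request(
    "models/text-embedding-004",
    ["What is the meaning of life?", "How does the brain work?"],
)
print(json.dumps(batch, indent=2))
```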

Example request

Python

import google.generativeai as genai

# Passing a list of strings returns one embedding per string.
texts = [
    "What is the meaning of life?",
    "How much wood would a woodchuck chuck?",
    "How does the brain work?",
]
result = genai.embed_content(
    model="models/text-embedding-004", content=texts, output_dimensionality=10
)
print(result)

Node.js

// Make sure to include these imports:
// import { GoogleGenerativeAI } from "@google/generative-ai";
const genAI = new GoogleGenerativeAI(process.env.API_KEY);
const model = genAI.getGenerativeModel({
  model: "text-embedding-004",
});

function textToRequest(text) {
  return { content: { role: "user", parts: [{ text }] } };
}

const result = await model.batchEmbedContents({
  requests: [
    textToRequest("What is the meaning of life?"),
    textToRequest("How much wood would a woodchuck chuck?"),
    textToRequest("How does the brain work?"),
  ],
});

console.log(result.embeddings);

Response body

The response to a BatchEmbedContentsRequest.

If successful, the response body contains data with the following structure:

Fields
embeddings[] object (ContentEmbedding)

Output only. The embeddings for each request, in the same order as provided in the batch request.

JSON representation
{
  "embeddings": [
    {
      object (ContentEmbedding)
    }
  ]
}
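
Because the embeddings come back in request order, they can be matched to the inputs by position. The following sketch assumes a hypothetical helper, pair_texts_with_embeddings, and a made-up response payload.

```python
import json


def pair_texts_with_embeddings(texts: list, response_body: str) -> list:
    """Pair each input text with its embedding values, relying on response order."""
    embeddings = json.loads(response_body)["embeddings"]
    if len(embeddings) != len(texts):
        raise ValueError("expected one embedding per input text")
    return [(text, emb["values"]) for text, emb in zip(texts, embeddings)]


# Made-up response for illustration only.
sample_batch = '{"embeddings": [{"values": [0.1, 0.2]}, {"values": [0.3, 0.4]}]}'
for text, values in pair_texts_with_embeddings(["first", "second"], sample_batch):
    print(text, values)
```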

EmbedContentRequest

Request containing the Content for the model to embed.

JSON representation
{
  "model": string,
  "content": {
    object (Content)
  },
  "taskType": enum (TaskType),
  "title": string,
  "outputDimensionality": integer
}

Fields
model string

Required. The model's resource name. This serves as an ID for the Model to use.

This name should match a model name returned by the models.list method.

Format: models/{model}

content object (Content)

Required. The content to embed. Only the parts.text fields will be counted.

taskType enum (TaskType)

Optional. The task type for which the embeddings will be used. Can only be set for models/embedding-001.

title string

Optional. A title for the text. Only applicable when TaskType is RETRIEVAL_DOCUMENT.

Note: Specifying a title for RETRIEVAL_DOCUMENT provides better quality embeddings for retrieval.

outputDimensionality integer

Optional. Reduced dimension for the output embedding. If set, excess values in the output embedding are truncated from the end. Supported by newer models since 2024; this value cannot be set for the earlier model (models/embedding-001).
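
To illustrate what "truncated from the end" means for outputDimensionality, the sketch below keeps the leading values of a vector and drops the rest. The service performs the truncation itself when the field is set; truncate_embedding is a hypothetical client-side illustration, not a description of the server implementation.

```python
def truncate_embedding(values: list, output_dimensionality: int) -> list:
    """Keep the first output_dimensionality values and drop the rest."""
    if output_dimensionality <= 0:
        raise ValueError("output_dimensionality must be positive")
    return values[:output_dimensionality]


full = [0.1, 0.2, 0.3, 0.4]
print(truncate_embedding(full, 2))
```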

ContentEmbedding

A list of floats representing an embedding.

JSON representation
{
  "values": [
    number
  ]
}

Fields
values[] number

The embedding values.
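
A typical use of the values list is comparing two embeddings. Cosine similarity is one common choice and is not prescribed by this reference; the sketch below uses only the standard library, and cosine_similarity is a hypothetical helper.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Compute the cosine similarity of two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for two ContentEmbedding values lists.
print(cosine_similarity([0.1, 0.3, 0.5], [0.2, 0.1, 0.4]))
```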

TaskType

Type of task for which the embedding will be used.

Enums
TASK_TYPE_UNSPECIFIED Unset value, which will default to one of the other enum values.
RETRIEVAL_QUERY Specifies the given text is a query in a search/retrieval setting.
RETRIEVAL_DOCUMENT Specifies the given text is a document from the corpus being searched.
SEMANTIC_SIMILARITY Specifies the given text will be used for semantic textual similarity (STS).
CLASSIFICATION Specifies that the given text will be classified.
CLUSTERING Specifies that the embeddings will be used for clustering.
QUESTION_ANSWERING Specifies that the given text will be used for question answering.
FACT_VERIFICATION Specifies that the given text will be used for fact verification.
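
For client code that builds requests by hand, the enum values above can be mirrored in a small Python enum so that typos are caught early. This is a sketch: the TaskType class and the retrieval_task_type helper below are hypothetical and not taken from any SDK.

```python
from enum import Enum


class TaskType(str, Enum):
    """Mirror of the TaskType values listed above."""

    TASK_TYPE_UNSPECIFIED = "TASK_TYPE_UNSPECIFIED"
    RETRIEVAL_QUERY = "RETRIEVAL_QUERY"
    RETRIEVAL_DOCUMENT = "RETRIEVAL_DOCUMENT"
    SEMANTIC_SIMILARITY = "SEMANTIC_SIMILARITY"
    CLASSIFICATION = "CLASSIFICATION"
    CLUSTERING = "CLUSTERING"
    QUESTION_ANSWERING = "QUESTION_ANSWERING"
    FACT_VERIFICATION = "FACT_VERIFICATION"


def retrieval_task_type(is_query: bool) -> TaskType:
    """Pick the retrieval task type: query side or corpus document side."""
    return TaskType.RETRIEVAL_QUERY if is_query else TaskType.RETRIEVAL_DOCUMENT


print(retrieval_task_type(True).value)
```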