The Gemini API supports content generation with images, audio, code, tools, and more. For details on each of these features, read on and check out the task-focused sample code, or read the comprehensive guides.
- Text generation
- Vision
- Audio
- Long context
- Code execution
- JSON Mode
- Function calling
- System instructions
Method: models.generateContent
Generates a model response given an input GenerateContentRequest. Refer to the text generation guide for detailed usage information. Input capabilities differ between models, including tuned models. Refer to the model guide and tuning guide for details.
Endpoint
POST https://generativelanguage.googleapis.com/v1beta/{model=models/*}:generateContent
The URL uses gRPC Transcoding syntax.
Path parameters
model
string
Required. The name of the Model to use for generating the completion. Format: models/{model}
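As a minimal sketch of how the path parameter fits into the endpoint, the helper below joins a model name of the form models/{model} with the method suffix. The function name `method_url` and the model name `example-model` are hypothetical placeholders, not part of the API.

```python
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def method_url(model: str, method: str = "generateContent") -> str:
    """Return the endpoint URL for a model name of the form models/{model}."""
    prefix = "models/"
    model_id = model[len(prefix):] if model.startswith(prefix) else ""
    if not model_id or "/" in model_id:
        raise ValueError(f"expected a name like 'models/{{model}}', got {model!r}")
    return f"{BASE_URL}/{model}:{method}"


# "example-model" is a placeholder, not a real model name.
print(method_url("models/example-model"))
```

Validating the name up front gives a clear local error instead of a failed HTTP call when the models/ prefix is missing.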
Request body
The request body contains data with the following structure:
Optional. A list of Tools the Model may use to generate the next response.
A Tool is a piece of code that enables the system to interact with external systems to perform an action, or set of actions, outside of knowledge and scope of the Model. Supported Tools are Function and codeExecution. Refer to the Function calling and the Code execution guides to learn more.
Optional. Tool configuration for any Tool specified in the request. Refer to the Function calling guide for a usage example.
Optional. A list of unique SafetySetting instances for blocking unsafe content.
This will be enforced on the GenerateContentRequest.contents and GenerateContentResponse.candidates. There should not be more than one setting for each SafetyCategory type. The API will block any contents and responses that fail to meet the thresholds set by these settings. This list overrides the default settings for each SafetyCategory specified in the safetySettings. If there is no SafetySetting for a given SafetyCategory provided in the list, the API will use the default safety setting for that category. Harm categories HARM_CATEGORY_HATE_SPEECH, HARM_CATEGORY_SEXUALLY_EXPLICIT, HARM_CATEGORY_DANGEROUS_CONTENT, HARM_CATEGORY_HARASSMENT are supported. Refer to the guide for detailed information on available safety settings. Also refer to the Safety guidance to learn how to incorporate safety considerations in your AI applications.
Optional. Developer set system instruction(s). Currently, text only.
Optional. Configuration options for model generation and outputs.
cachedContent
string
Optional. The name of the content cached to use as context to serve the prediction. Format: cachedContents/{cachedContent}
Example request
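The following is a minimal sketch of a text-only request, built with the Python standard library and not sent. It assumes the request body carries the prompt as `contents` → `parts` → `text` and that the API key is passed in an `x-goog-api-key` header; the helper name, the key value, and the model name are placeholders.

```python
import json
import urllib.request


def build_generate_content_request(api_key: str, model: str, prompt: str):
    """Build (but do not send) a POST request for models.generateContent."""
    url = f"https://generativelanguage.googleapis.com/v1beta/{model}:generateContent"
    # Assumed body shape: one content entry holding a single text part.
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


req = build_generate_content_request(
    "YOUR_API_KEY", "models/example-model", "Write a haiku about the sea."
)
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req), then json.load() the response.
```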
Response body
If successful, the response body contains an instance of GenerateContentResponse.
Method: models.streamGenerateContent
Generates a streamed response from the model given an input GenerateContentRequest.
Endpoint
POST https://generativelanguage.googleapis.com/v1beta/{model=models/*}:streamGenerateContent
The URL uses gRPC Transcoding syntax.
Path parameters
model
string
Required. The name of the Model to use for generating the completion. Format: models/{model}
Request body
The request body contains data with the following structure:
Optional. A list of Tools the Model may use to generate the next response.
A Tool is a piece of code that enables the system to interact with external systems to perform an action, or set of actions, outside of knowledge and scope of the Model. Supported Tools are Function and codeExecution. Refer to the Function calling and the Code execution guides to learn more.
Optional. Tool configuration for any Tool specified in the request. Refer to the Function calling guide for a usage example.
Optional. A list of unique SafetySetting instances for blocking unsafe content.
This will be enforced on the GenerateContentRequest.contents and GenerateContentResponse.candidates. There should not be more than one setting for each SafetyCategory type. The API will block any contents and responses that fail to meet the thresholds set by these settings. This list overrides the default settings for each SafetyCategory specified in the safetySettings. If there is no SafetySetting for a given SafetyCategory provided in the list, the API will use the default safety setting for that category. Harm categories HARM_CATEGORY_HATE_SPEECH, HARM_CATEGORY_SEXUALLY_EXPLICIT, HARM_CATEGORY_DANGEROUS_CONTENT, HARM_CATEGORY_HARASSMENT are supported. Refer to the guide for detailed information on available safety settings. Also refer to the Safety guidance to learn how to incorporate safety considerations in your AI applications.
Optional. Developer set system instruction(s). Currently, text only.
Optional. Configuration options for model generation and outputs.
cachedContent
string
Optional. The name of the content cached to use as context to serve the prediction. Format: cachedContents/{cachedContent}
Example request
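A streamed response arrives as a sequence of GenerateContentResponse instances. The sketch below assumes the stream is delivered as server-sent events, with one JSON payload per `data:` line, and that each chunk carries text under `candidates` → `content` → `parts` → `text`. Both function names are hypothetical, and the sample lines are made-up data used only to exercise the parser.

```python
import json


def iter_stream_chunks(lines):
    """Yield one parsed response dict per 'data:' line of an event stream."""
    for raw in lines:
        line = raw.strip()
        if line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload:
                yield json.loads(payload)


def collect_text(chunks):
    """Concatenate the text parts of the first candidate across all chunks."""
    pieces = []
    for chunk in chunks:
        for candidate in chunk.get("candidates", [])[:1]:
            for part in candidate.get("content", {}).get("parts", []):
                pieces.append(part.get("text", ""))
    return "".join(pieces)


# Made-up sample stream; blank lines separate events.
sample = [
    'data: {"candidates": [{"content": {"parts": [{"text": "Hello, "}]}}]}',
    "",
    'data: {"candidates": [{"content": {"parts": [{"text": "world."}]}, "finishReason": "STOP"}]}',
]
print(collect_text(iter_stream_chunks(sample)))  # prints "Hello, world."
```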
Response body
If successful, the response body contains a stream of GenerateContentResponse instances.
GenerateContentResponse
Response from the model supporting multiple candidate responses.
Safety ratings and content filtering are reported both for the prompt, in GenerateContentResponse.prompt_feedback, and for each candidate, in finishReason and in safetyRatings. The API:
- Returns either all requested candidates or none of them
- Returns no candidates at all only if there was something wrong with the prompt (check promptFeedback)
- Reports feedback on each candidate in finishReason and safetyRatings.
Candidate responses from the model.
Returns the prompt's feedback related to the content filters.
Output only. Metadata on the generation requests' token usage.
modelVersion
string
Output only. The model version used to generate the response.
JSON representation |
---|
{ "candidates": [ { object ( |
PromptFeedback
A set of feedback metadata for the prompt specified in GenerateContentRequest.content.
Optional. If set, the prompt was blocked and no candidates are returned. Rephrase the prompt.
Ratings for safety of the prompt. There is at most one rating per category.
JSON representation |
---|
{ "blockReason": enum ( |
BlockReason
Specifies the reason why the prompt was blocked.
Enums | |
---|---|
BLOCK_REASON_UNSPECIFIED | Default value. This value is unused. |
SAFETY | Prompt was blocked due to safety reasons. Inspect safetyRatings to understand which safety category blocked it. |
OTHER | Prompt was blocked due to unknown reasons. |
BLOCKLIST | Prompt was blocked because it contains terms from the terminology blocklist. |
PROHIBITED_CONTENT | Prompt was blocked due to prohibited content. |
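Because a response with no candidates signals a problem with the prompt, client code usually checks promptFeedback before reading any output. A minimal sketch, operating on an already-parsed response dict; the function name and the sample payloads are hypothetical.

```python
def first_candidate_or_block_reason(response: dict):
    """Return ("ok", candidate) or ("blocked", reason) for a parsed response."""
    candidates = response.get("candidates", [])
    if candidates:
        return "ok", candidates[0]
    feedback = response.get("promptFeedback", {})
    # blockReason is optional, so fall back to the unspecified enum value.
    return "blocked", feedback.get("blockReason", "BLOCK_REASON_UNSPECIFIED")


# Made-up payloads for illustration.
blocked = {"promptFeedback": {"blockReason": "SAFETY"}}
answered = {"candidates": [{"finishReason": "STOP", "index": 0}]}

print(first_candidate_or_block_reason(blocked))   # ('blocked', 'SAFETY')
print(first_candidate_or_block_reason(answered)[0])  # ok
```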
UsageMetadata
Metadata on the generation request's token usage.
promptTokenCount
integer
Number of tokens in the prompt. When cachedContent is set, this is still the total effective prompt size, meaning this includes the number of tokens in the cached content.
cachedContentTokenCount
integer
Number of tokens in the cached part of the prompt (the cached content)
candidatesTokenCount
integer
Total number of tokens across all the generated response candidates.
totalTokenCount
integer
Total token count for the generation request (prompt + response candidates).
JSON representation |
---|
{ "promptTokenCount": integer, "cachedContentTokenCount": integer, "candidatesTokenCount": integer, "totalTokenCount": integer } |
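The relationships between these counts can be checked directly. This sketch takes a parsed usage metadata dict, derives the uncached portion of the prompt (promptTokenCount includes cached tokens), and verifies that the total equals prompt plus candidates. The function name and the sample numbers are made up.

```python
def summarize_usage(usage: dict) -> dict:
    """Derive a few convenience figures from a usage metadata dict."""
    prompt = usage.get("promptTokenCount", 0)
    cached = usage.get("cachedContentTokenCount", 0)
    candidates = usage.get("candidatesTokenCount", 0)
    total = usage.get("totalTokenCount", 0)
    return {
        # promptTokenCount already includes the cached tokens.
        "uncached_prompt_tokens": prompt - cached,
        "response_tokens": candidates,
        "total_matches_sum": total == prompt + candidates,
    }


usage = {
    "promptTokenCount": 1200,
    "cachedContentTokenCount": 1000,
    "candidatesTokenCount": 50,
    "totalTokenCount": 1250,
}
print(summarize_usage(usage))
```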
Candidate
A response candidate generated from the model.
Output only. Generated content returned from the model.
Optional. Output only. The reason why the model stopped generating tokens.
If empty, the model has not stopped generating tokens.
List of ratings for the safety of a response candidate.
There is at most one rating per category.
Output only. Citation information for model-generated candidate.
This field may be populated with recitation information for any text included in the content. These are passages that are "recited" from copyrighted material in the foundational LLM's training data.
tokenCount
integer
Output only. Token count for this candidate.
Output only. Attribution information for sources that contributed to a grounded answer.
This field is populated for GenerateAnswer calls.
Output only. Grounding metadata for the candidate.
This field is populated for GenerateContent calls.
avgLogprobs
number
Output only. Average log probability score of the candidate.
Output only. Log-likelihood scores for the response tokens and top tokens.
index
integer
Output only. Index of the candidate in the list of response candidates.
JSON representation |
---|
{ "content": { object ( |
FinishReason
Defines the reason why the model stopped generating tokens.
Enums | |
---|---|
FINISH_REASON_UNSPECIFIED | Default value. This value is unused. |
STOP | Natural stop point of the model or provided stop sequence. |
MAX_TOKENS | The maximum number of tokens as specified in the request was reached. |
SAFETY | The response candidate content was flagged for safety reasons. |
RECITATION | The response candidate content was flagged for recitation reasons. |
LANGUAGE | The response candidate content was flagged for using an unsupported language. |
OTHER | Unknown reason. |
BLOCKLIST | Token generation stopped because the content contains forbidden terms. |
PROHIBITED_CONTENT | Token generation stopped for potentially containing prohibited content. |
SPII | Token generation stopped because the content potentially contains Sensitive Personally Identifiable Information (SPII). |
MALFORMED_FUNCTION_CALL | The function call generated by the model is invalid. |
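Client code often needs to turn a finishReason into a decision such as accept, retry with a larger token budget, or surface a filtering message. One possible grouping is sketched below. The bucket names are this example's own choice and are not defined by the API.

```python
COMPLETED = {"STOP"}
TRUNCATED = {"MAX_TOKENS"}
FILTERED = {
    "SAFETY", "RECITATION", "LANGUAGE",
    "BLOCKLIST", "PROHIBITED_CONTENT", "SPII",
}


def classify_finish_reason(reason):
    """Map a finishReason string to a coarse, example-defined bucket."""
    if not reason:
        return "in_progress"  # empty: the model has not stopped generating
    if reason in COMPLETED:
        return "completed"
    if reason in TRUNCATED:
        return "truncated"
    if reason in FILTERED:
        return "filtered"
    if reason == "MALFORMED_FUNCTION_CALL":
        return "invalid_function_call"
    return "other"  # OTHER, FINISH_REASON_UNSPECIFIED, or an unknown value


for reason in ("STOP", "MAX_TOKENS", "SPII", None):
    print(reason, "->", classify_finish_reason(reason))
```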
GroundingAttribution
Attribution for a source that contributed to an answer.
Output only. Identifier for the source contributing to this attribution.
Grounding source content that makes up this attribution.
JSON representation |
---|
{ "sourceId": { object ( |
AttributionSourceId
Identifier for the source contributing to this attribution.
source
Union type
source can be only one of the following:
Identifier for an inline passage.
Identifier for a Chunk fetched via Semantic Retriever.
JSON representation |
---|
{ // source "groundingPassage": { object ( |
GroundingPassageId
Identifier for a part within a GroundingPassage.
passageId
string
Output only. ID of the passage matching the GenerateAnswerRequest's GroundingPassage.id.
partIndex
integer
Output only. Index of the part within the GenerateAnswerRequest's GroundingPassage.content.
JSON representation |
---|
{ "passageId": string, "partIndex": integer } |
SemanticRetrieverChunk
Identifier for a Chunk retrieved via Semantic Retriever specified in the GenerateAnswerRequest using SemanticRetrieverConfig.
source
string
Output only. Name of the source matching the request's SemanticRetrieverConfig.source. Example: corpora/123 or corpora/123/documents/abc
chunk
string
Output only. Name of the Chunk containing the attributed text. Example: corpora/123/documents/abc/chunks/xyz
JSON representation |
---|
{ "source": string, "chunk": string } |
GroundingMetadata
Metadata returned to client when grounding is enabled.
List of supporting references retrieved from specified grounding source.
List of grounding support.
webSearchQueries[]
string
Web search queries for the follow-up web search.
Optional. Google search entry for the follow-up web searches.
Metadata related to retrieval in the grounding flow.
JSON representation |
---|
{ "groundingChunks": [ { object ( |
SearchEntryPoint
Google search entry point.
renderedContent
string
Optional. Web content snippet that can be embedded in a web page or an app webview.
Optional. Base64-encoded JSON representing an array of <search term, search url> tuples.
A base64-encoded string.
JSON representation |
---|
{ "renderedContent": string, "sdkBlob": string } |
GroundingChunk
Grounding chunk.
chunk_type
Union type
chunk_type can be only one of the following:
Grounding chunk from the web.
JSON representation |
---|
{
// chunk_type
"web": {
object ( |
Web
Chunk from the web.
uri
string
URI reference of the chunk.
title
string
Title of the chunk.
JSON representation |
---|
{ "uri": string, "title": string } |
GroundingSupport
Grounding support.
groundingChunkIndices[]
integer
A list of indices (into 'grounding_chunk') specifying the citations associated with the claim. For instance [1,3,4] means that grounding_chunk[1], grounding_chunk[3], grounding_chunk[4] are the retrieved content attributed to the claim.
confidenceScores[]
number
Confidence score of the support references. Ranges from 0 to 1. 1 is the most confident. This list must have the same size as the groundingChunkIndices.
Segment of the content this support belongs to.
JSON representation |
---|
{
"groundingChunkIndices": [
integer
],
"confidenceScores": [
number
],
"segment": {
object ( |
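To illustrate how groundingChunkIndices and confidenceScores line up, the sketch below pairs each cited web chunk's URI with its confidence score. It assumes the chunk list is the groundingChunks array from GroundingMetadata and that every chunk is a web chunk; the function name and sample data are hypothetical.

```python
def cited_sources(support: dict, grounding_chunks: list) -> list:
    """Pair each cited chunk's URI with its confidence score."""
    indices = support.get("groundingChunkIndices", [])
    scores = support.get("confidenceScores", [])
    if len(indices) != len(scores):
        raise ValueError("confidenceScores must match groundingChunkIndices in size")
    return [(grounding_chunks[i]["web"]["uri"], s) for i, s in zip(indices, scores)]


# Made-up grounding data for illustration.
chunks = [
    {"web": {"uri": "https://example.com/a", "title": "A"}},
    {"web": {"uri": "https://example.com/b", "title": "B"}},
    {"web": {"uri": "https://example.com/c", "title": "C"}},
]
support = {"groundingChunkIndices": [0, 2], "confidenceScores": [0.9, 0.4]}
print(cited_sources(support, chunks))
```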
Segment
Segment of the content.
partIndex
integer
Output only. The index of a Part object within its parent Content object.
startIndex
integer
Output only. Start index in the given Part, measured in bytes. Offset from the start of the Part, inclusive, starting at zero.
endIndex
integer
Output only. End index in the given Part, measured in bytes. Offset from the start of the Part, exclusive, starting at zero.
text
string
Output only. The text corresponding to the segment from the response.
JSON representation |
---|
{ "partIndex": integer, "startIndex": integer, "endIndex": integer, "text": string } |
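Since startIndex and endIndex are measured in bytes, slicing a Python string by those numbers gives the wrong result whenever the text contains multi-byte characters. The sketch below slices the encoded bytes instead. It assumes the offsets refer to a UTF-8 encoding, which this reference does not state explicitly; the function name and sample text are made up.

```python
def segment_text(part_text: str, segment: dict) -> str:
    """Extract a segment using byte offsets (UTF-8 assumed)."""
    data = part_text.encode("utf-8")
    return data[segment["startIndex"]:segment["endIndex"]].decode("utf-8")


text = "Café prices rose. Tea did not."
# "é" occupies two bytes, so byte offsets run one ahead of character offsets.
segment = {"partIndex": 0, "startIndex": 19, "endIndex": 31, "text": "Tea did not."}

print(segment_text(text, segment))  # prints "Tea did not."
print(text[19:31])                  # prints "ea did not." (character slicing is off by one)
```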
RetrievalMetadata
Metadata related to retrieval in the grounding flow.
googleSearchDynamicRetrievalScore
number
Optional. Score indicating how likely information from Google Search could help answer the prompt. The score is in the range [0, 1], where 0 is the least likely and 1 is the most likely. This score is only populated when Google Search grounding and dynamic retrieval are enabled. It will be compared to the threshold to determine whether to trigger Google Search.
JSON representation |
---|
{ "googleSearchDynamicRetrievalScore": number } |
LogprobsResult
Logprobs Result
Length = total number of decoding steps.
Length = total number of decoding steps. The chosen candidates may or may not be in topCandidates.
JSON representation |
---|
{ "topCandidates": [ { object ( |
TopCandidates
Candidates with top log probabilities at each decoding step.
Sorted by log probability in descending order.
JSON representation |
---|
{
"candidates": [
{
object ( |
Candidate
Candidate for the logprobs token and score.
token
string
The candidate’s token string value.
tokenId
integer
The candidate’s token id value.
logProbability
number
The candidate's log probability.
JSON representation |
---|
{ "token": string, "tokenId": integer, "logProbability": number } |
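Log probabilities are easier to read once converted back to probabilities with the exponential function. A small sketch over a list of logprobs candidates; the function name, token strings, and values are made-up sample data.

```python
import math


def to_probabilities(candidates: list) -> list:
    """Convert each candidate's logProbability into a plain probability."""
    return [(c["token"], math.exp(c["logProbability"])) for c in candidates]


# Made-up candidates, sorted by log probability in descending order.
top = [
    {"token": "Hello", "tokenId": 1, "logProbability": math.log(0.5)},
    {"token": "Hi", "tokenId": 2, "logProbability": math.log(0.25)},
]
for token, probability in to_probabilities(top):
    print(f"{token}: {probability:.2f}")
```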
CitationMetadata
A collection of source attributions for a piece of content.
Citations to sources for a specific response.
JSON representation |
---|
{
"citationSources": [
{
object ( |
CitationSource
A citation to a source for a portion of a specific response.
startIndex
integer
Optional. Start of segment of the response that is attributed to this source.
Index indicates the start of the segment, measured in bytes.
endIndex
integer
Optional. End of the attributed segment, exclusive.
uri
string
Optional. URI that is attributed as a source for a portion of the text.
license
string
Optional. License for the GitHub project that is attributed as a source for segment.
License info is required for code citations.
JSON representation |
---|
{ "startIndex": integer, "endIndex": integer, "uri": string, "license": string } |
GenerationConfig
Configuration options for model generation and outputs. Not all parameters are configurable for every model.
stopSequences[]
string
Optional. The set of character sequences (up to 5) that will stop output generation. If specified, the API will stop at the first appearance of a stop_sequence. The stop sequence will not be included as part of the response.
responseMimeType
string
Optional. MIME type of the generated candidate text. Supported MIME types are:
- text/plain: (default) Text output.
- application/json: JSON response in the response candidates.
- text/x.enum: ENUM as a string response in the response candidates.
Refer to the docs for a list of all supported text MIME types.
Optional. Output schema of the generated candidate text. Schemas must be a subset of the OpenAPI schema and can be objects, primitives or arrays.
If set, a compatible responseMimeType must also be set. Compatible MIME types: application/json: Schema for JSON response. Refer to the JSON text generation guide for more details.
candidateCount
integer
Optional. Number of generated responses to return.
Currently, this value can only be set to 1. If unset, this will default to 1.
maxOutputTokens
integer
Optional. The maximum number of tokens to include in a response candidate.
Note: The default value varies by model, see the Model.output_token_limit attribute of the Model returned from the getModel function.
temperature
number
Optional. Controls the randomness of the output.
Note: The default value varies by model, see the Model.temperature attribute of the Model returned from the getModel function.
Values can range from [0.0, 2.0].
topP
number
Optional. The maximum cumulative probability of tokens to consider when sampling.
The model uses combined Top-k and Top-p (nucleus) sampling.
Tokens are sorted based on their assigned probabilities so that only the most likely tokens are considered. Top-k sampling directly limits the maximum number of tokens to consider, while Nucleus sampling limits the number of tokens based on the cumulative probability.
Note: The default value varies by Model and is specified by the Model.top_p attribute returned from the getModel function. An empty topK attribute indicates that the model doesn't apply top-k sampling and doesn't allow setting topK on requests.
topK
integer
Optional. The maximum number of tokens to consider when sampling.
Gemini models use Top-p (nucleus) sampling or a combination of Top-k and nucleus sampling. Top-k sampling considers the set of topK most probable tokens. Models running with nucleus sampling don't allow topK setting.
Note: The default value varies by Model and is specified by the Model.top_k attribute returned from the getModel function. An empty topK attribute indicates that the model doesn't apply top-k sampling and doesn't allow setting topK on requests.
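To make the interaction of the two limits concrete, here is an illustrative sketch of combined top-k and nucleus filtering over a toy distribution. It is not the service's implementation: the order of the two steps, the "smallest set reaching topP" rule, and the renormalization are common conventions assumed for this example, and the function name is hypothetical.

```python
def filter_top_k_top_p(probs: dict, top_k=None, top_p=None) -> dict:
    """Keep the most likely tokens under both limits, then renormalize."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]  # top-k: cap the number of tokens
    if top_p is not None:
        kept, cumulative = [], 0.0
        for token, p in ranked:  # nucleus: cap the cumulative probability
            kept.append((token, p))
            cumulative += p
            if cumulative >= top_p:
                break
        ranked = kept
    total = sum(p for _, p in ranked)
    return {token: p / total for token, p in ranked}


toy = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
print(filter_top_k_top_p(toy, top_k=3, top_p=0.7))  # only "a" and "b" remain
```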
presencePenalty
number
Optional. Presence penalty applied to the next token's logprobs if the token has already been seen in the response.
This penalty is binary on/off and not dependent on the number of times the token is used (after the first). Use frequencyPenalty for a penalty that increases with each use.
A positive penalty will discourage the use of tokens that have already been used in the response, increasing the vocabulary.
A negative penalty will encourage the use of tokens that have already been used in the response, decreasing the vocabulary.
frequencyPenalty
number
Optional. Frequency penalty applied to the next token's logprobs, multiplied by the number of times each token has been seen in the response so far.
A positive penalty will discourage the use of tokens that have already been used, proportional to the number of times the token has been used: The more a token is used, the more difficult it is for the model to use that token again, increasing the vocabulary of responses.
Caution: A negative penalty will encourage the model to reuse tokens proportional to the number of times the token has been used. Small negative values will reduce the vocabulary of a response. Larger negative values will cause the model to start repeating a common token until it hits the maxOutputTokens limit.
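The difference between the two penalties can be sketched with a toy calculation. The subtractive form below is an assumption made for illustration; the reference only states that the presence penalty is on/off and that the frequency penalty scales with the number of prior uses. The function name and values are hypothetical.

```python
from collections import Counter


def apply_penalties(logprobs: dict, generated: list,
                    presence_penalty: float = 0.0,
                    frequency_penalty: float = 0.0) -> dict:
    """Adjust next-token logprobs using tokens already in the response."""
    counts = Counter(generated)
    adjusted = {}
    for token, logprob in logprobs.items():
        count = counts.get(token, 0)
        # Presence: applied once if the token was seen at all.
        # Frequency: grows with every prior use of the token.
        penalty = (presence_penalty if count > 0 else 0.0) + frequency_penalty * count
        adjusted[token] = logprob - penalty
    return adjusted


next_token_logprobs = {"the": -1.0, "cat": -2.0}
so_far = ["the", "the", "dog"]
print(apply_penalties(next_token_logprobs, so_far,
                      presence_penalty=0.5, frequency_penalty=0.25))
# {'the': -2.0, 'cat': -2.0}
```

In this toy case "the" was used twice, so it loses 0.5 for presence plus 2 × 0.25 for frequency, while the unused "cat" is unchanged.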
responseLogprobs
boolean
Optional. If true, export the logprobs results in response.
logprobs
integer
Optional. Only valid if responseLogprobs=True. This sets the number of top logprobs to return at each decoding step in the Candidate.logprobs_result.
enableEnhancedCivicAnswers
boolean
Optional. Enables enhanced civic answers. It may not be available for all models.
JSON representation |
---|
{
"stopSequences": [
string
],
"responseMimeType": string,
"responseSchema": {
object ( |
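Several of the constraints described for GenerationConfig can be checked on the client before a request is sent. The sketch below builds a config dict and enforces the documented limits: at most 5 stop sequences, temperature within [0.0, 2.0], candidateCount of 1, and a JSON MIME type whenever a schema is set. The helper name is hypothetical and the schema value in the usage line is illustrative only.

```python
def build_generation_config(*, stop_sequences=(), temperature=None,
                            max_output_tokens=None, candidate_count=1,
                            response_mime_type=None, response_schema=None):
    """Return a generationConfig dict after checking the documented limits."""
    if len(stop_sequences) > 5:
        raise ValueError("at most 5 stop sequences are allowed")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be within [0.0, 2.0]")
    if candidate_count != 1:
        raise ValueError("candidateCount can currently only be set to 1")
    if response_schema is not None and response_mime_type != "application/json":
        raise ValueError("responseSchema requires responseMimeType application/json")

    config = {"candidateCount": candidate_count}
    if stop_sequences:
        config["stopSequences"] = list(stop_sequences)
    if temperature is not None:
        config["temperature"] = temperature
    if max_output_tokens is not None:
        config["maxOutputTokens"] = max_output_tokens
    if response_mime_type is not None:
        config["responseMimeType"] = response_mime_type
    if response_schema is not None:
        config["responseSchema"] = response_schema
    return config


print(build_generation_config(stop_sequences=["END"], temperature=0.2,
                              max_output_tokens=256))
```

Not all parameters are configurable for every model, so a config that passes these checks can still be rejected by the service.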
HarmCategory
The category of a rating.
These categories cover various kinds of harms that developers may wish to adjust.
Enums | |
---|---|
HARM_CATEGORY_UNSPECIFIED | Category is unspecified. |
HARM_CATEGORY_DEROGATORY | PaLM - Negative or harmful comments targeting identity and/or protected attribute. |
HARM_CATEGORY_TOXICITY | PaLM - Content that is rude, disrespectful, or profane. |
HARM_CATEGORY_VIOLENCE | PaLM - Describes scenarios depicting violence against an individual or group, or general descriptions of gore. |
HARM_CATEGORY_SEXUAL | PaLM - Contains references to sexual acts or other lewd content. |
HARM_CATEGORY_MEDICAL | PaLM - Promotes unchecked medical advice. |
HARM_CATEGORY_DANGEROUS | PaLM - Dangerous content that promotes, facilitates, or encourages harmful acts. |
HARM_CATEGORY_HARASSMENT | Gemini - Harassment content. |
HARM_CATEGORY_HATE_SPEECH | Gemini - Hate speech and content. |
HARM_CATEGORY_SEXUALLY_EXPLICIT | Gemini - Sexually explicit content. |
HARM_CATEGORY_DANGEROUS_CONTENT | Gemini - Dangerous content. |
HARM_CATEGORY_CIVIC_INTEGRITY | Gemini - Content that may be used to harm civic integrity. |
SafetyRating
Safety rating for a piece of content.
The safety rating contains the category of harm and the harm probability level in that category for a piece of content. Content is classified for safety across a number of harm categories and the probability of the harm classification is included here.
Required. The category for this rating.
Required. The probability of harm for this content.
blocked
boolean
Was this content blocked because of this rating?
JSON representation |
---|
{ "category": enum ( |
HarmProbability
The probability that a piece of content is harmful.
The classification system gives the probability of the content being unsafe. This does not indicate the severity of harm for a piece of content.
Enums | |
---|---|
HARM_PROBABILITY_UNSPECIFIED | Probability is unspecified. |
NEGLIGIBLE | Content has a negligible chance of being unsafe. |
LOW | Content has a low chance of being unsafe. |
MEDIUM | Content has a medium chance of being unsafe. |
HIGH | Content has a high chance of being unsafe. |
SafetySetting
Safety setting, affecting the safety-blocking behavior.
Passing a safety setting for a category changes the allowed probability that content is blocked.
Required. The category for this setting.
Required. Controls the probability threshold at which harm is blocked.
JSON representation |
---|
{ "category": enum ( |
HarmBlockThreshold
Block at and beyond a specified harm probability.
Enums | |
---|---|
HARM_BLOCK_THRESHOLD_UNSPECIFIED | Threshold is unspecified. |
BLOCK_LOW_AND_ABOVE | Content with NEGLIGIBLE will be allowed. |
BLOCK_MEDIUM_AND_ABOVE | Content with NEGLIGIBLE and LOW will be allowed. |
BLOCK_ONLY_HIGH | Content with NEGLIGIBLE, LOW, and MEDIUM will be allowed. |
BLOCK_NONE | All content will be allowed. |
OFF | Turn off the safety filter. |
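The threshold table can be read as a comparison between a HarmProbability level and the lowest level a threshold blocks. The sketch below models that reading locally and also checks the rule that a request should carry at most one setting per category. It only illustrates the table and does not reproduce the service's filtering; the function names are hypothetical, and the `threshold` key in the sample settings is an assumed field name.

```python
PROBABILITY_ORDER = ["NEGLIGIBLE", "LOW", "MEDIUM", "HIGH"]

# Lowest probability level that each threshold blocks, read from the table.
FIRST_BLOCKED_LEVEL = {
    "BLOCK_LOW_AND_ABOVE": "LOW",
    "BLOCK_MEDIUM_AND_ABOVE": "MEDIUM",
    "BLOCK_ONLY_HIGH": "HIGH",
}


def is_blocked(probability: str, threshold: str) -> bool:
    """Return True if content at this probability is blocked by the threshold."""
    if threshold in ("BLOCK_NONE", "OFF"):
        return False
    if threshold not in FIRST_BLOCKED_LEVEL:
        raise ValueError(f"no fixed rule for threshold {threshold!r}")
    rank = PROBABILITY_ORDER.index(probability)
    return rank >= PROBABILITY_ORDER.index(FIRST_BLOCKED_LEVEL[threshold])


def check_unique_categories(settings: list) -> None:
    """Raise if more than one setting targets the same category."""
    seen = set()
    for setting in settings:
        category = setting["category"]
        if category in seen:
            raise ValueError(f"duplicate safety setting for {category}")
        seen.add(category)


settings = [
    {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_ONLY_HIGH"},
    {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
]
check_unique_categories(settings)
print(is_blocked("MEDIUM", "BLOCK_ONLY_HIGH"))         # False
print(is_blocked("MEDIUM", "BLOCK_MEDIUM_AND_ABOVE"))  # True
```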