The models endpoint lets you programmatically list the available models and retrieve extended metadata, such as supported functionality and context window size. Read more in the Models guide.
Method: models.get
Gets information about a specific Model such as its version number, token limits, parameters and other metadata. Refer to the Gemini models guide for detailed model information.
Endpoint
GET https://generativelanguage.googleapis.com/v1beta/{name=models/*}
Path parameters
name string
Required. The resource name of the model.
This name should match a model name returned by the models.list method.
Format: models/{model}
Request body
The request body must be empty.
Example request
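A minimal Python sketch of this request, using only the standard library. The `x-goog-api-key` header and the helper names `model_url` and `get_model` are assumptions for illustration and are not defined on this page.

```python
import json
import urllib.request

BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def model_url(name: str) -> str:
    """Build the models.get URL for a resource name of the form models/{model}."""
    if not name.startswith("models/") or len(name) == len("models/"):
        raise ValueError("name must take the form models/{model}")
    return f"{BASE_URL}/{name}"


def get_model(name: str, api_key: str) -> dict:
    """Send the GET request (empty body) and return the decoded Model resource."""
    request = urllib.request.Request(
        model_url(name),
        headers={"x-goog-api-key": api_key},  # assumed authentication header
        method="GET",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `get_model("models/gemini-1.5-flash-001", api_key)` with a valid key should return a dictionary whose keys follow the Model resource described later on this page.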
Response body
If successful, the response body contains an instance of Model.
Method: models.list
Lists the Models available through the Gemini API.
Endpoint
GET https://generativelanguage.googleapis.com/v1beta/models
Query parameters
pageSize integer
The maximum number of Models to return (per page).
If unspecified, 50 models will be returned per page. This method returns at most 1000 models per page, even if you pass a larger pageSize.
pageToken string
A page token, received from a previous models.list call.
Provide the pageToken returned by one request as an argument to the next request to retrieve the next page.
When paginating, all other parameters provided to models.list must match the call that provided the page token.
Request body
The request body must be empty.
Example request
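The pagination rules can be sketched in Python as follows. This is an illustration under stated assumptions, not the official client: `list_url`, `http_fetch`, and `iter_models` are hypothetical helper names, and the `x-goog-api-key` header is an assumed authentication scheme. The loop resends `pageSize` on every call so that the parameters match the call that produced the page token, and it stops when the response omits `nextPageToken`.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator, Optional

LIST_URL = "https://generativelanguage.googleapis.com/v1beta/models"


def list_url(page_size: Optional[int] = None, page_token: Optional[str] = None) -> str:
    """Build the models.list URL, adding only the query parameters that are set."""
    query = {}
    if page_size is not None:
        query["pageSize"] = page_size
    if page_token:
        query["pageToken"] = page_token
    return LIST_URL + ("?" + urllib.parse.urlencode(query) if query else "")


def http_fetch(url: str, api_key: str) -> dict:
    """GET one page over HTTP and decode the JSON body."""
    request = urllib.request.Request(url, headers={"x-goog-api-key": api_key})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_models(fetch: Callable[[str], dict], page_size: Optional[int] = None) -> Iterator[dict]:
    """Yield every Model, following nextPageToken until it is omitted."""
    page_token = None
    while True:
        # pageSize is repeated so each call matches the one that issued the token.
        page = fetch(list_url(page_size, page_token))
        yield from page.get("models", [])
        page_token = page.get("nextPageToken")
        if not page_token:
            break
```

Passing the fetch function as an argument keeps the paging logic separate from the transport. With a real key, `iter_models(lambda url: http_fetch(url, api_key), page_size=100)` would walk all pages.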
Response body
Response from ListModel containing a paginated list of Models.
If successful, the response body contains data with the following structure:
models[] object (Model)
The returned Models.
nextPageToken string
A token, which can be sent as pageToken to retrieve the next page.
If this field is omitted, there are no more pages.
| JSON representation |
|---|
{ "models": [ { object (Model) } ], "nextPageToken": string } |
REST Resource: models
Resource: Model
Information about a Generative Language Model.
name string
Required. The resource name of the Model. Refer to Model variants for all allowed values.
Format: models/{model} with a {model} naming convention of:
- "{baseModelId}-{version}"
Examples:
models/gemini-1.5-flash-001
baseModelId string
Required. The name of the base model. Pass this value to the generation request.
Examples:
gemini-1.5-flash
version string
Required. The version number of the model.
This represents the major version (1.0 or 1.5).
displayName string
The human-readable name of the model. E.g. "Gemini 1.5 Flash".
The name can be up to 128 characters long and can consist of any UTF-8 characters.
description string
A short description of the model.
inputTokenLimit integer
Maximum number of input tokens allowed for this model.
outputTokenLimit integer
Maximum number of output tokens available for this model.
supportedGenerationMethods[] string
The model's supported generation methods.
The corresponding API method names are defined as camel case strings, such as generateMessage and generateContent.
thinking boolean
Whether the model supports thinking.
temperature number
Controls the randomness of the output.
Values can range over [0.0, maxTemperature], inclusive. A higher value produces more varied responses, while a value closer to 0.0 typically produces less surprising responses. This value specifies the default that the backend uses when calling the model.
maxTemperature number
The maximum temperature this model can use.
topP number
For nucleus sampling.
Nucleus sampling considers the smallest set of tokens whose probability sum is at least topP. This value specifies the default that the backend uses when calling the model.
topK integer
For top-k sampling.
Top-k sampling considers the set of topK most probable tokens. This value specifies the default that the backend uses when calling the model. If empty, the model doesn't use top-k sampling, and topK isn't allowed as a generation parameter.
| JSON representation |
|---|
{ "name": string, "baseModelId": string, "version": string, "displayName": string, "description": string, "inputTokenLimit": integer, "outputTokenLimit": integer, "supportedGenerationMethods": [ string ], "thinking": boolean, "temperature": number, "maxTemperature": number, "topP": number, "topK": integer } |
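As an illustration of how a client might consume this resource, the sketch below maps the JSON representation onto a Python dataclass. The class and its helper methods (`from_json`, `supports`, `temperature_in_range`) are hypothetical and not part of any official SDK. Treating every field other than `name` as possibly absent, and treating a missing `maxTemperature` as "no upper bound", are assumptions made here for robustness.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Model:
    """Mirror of the Model resource; fields missing from the JSON keep their defaults."""
    name: str
    base_model_id: str = ""
    version: str = ""
    display_name: str = ""
    description: str = ""
    input_token_limit: int = 0
    output_token_limit: int = 0
    supported_generation_methods: List[str] = field(default_factory=list)
    thinking: bool = False
    temperature: Optional[float] = None
    max_temperature: Optional[float] = None
    top_p: Optional[float] = None
    top_k: Optional[int] = None  # None: the model does not use top-k sampling

    @classmethod
    def from_json(cls, data: dict) -> "Model":
        """Map the camelCase JSON keys onto the snake_case attributes."""
        return cls(
            name=data["name"],
            base_model_id=data.get("baseModelId", ""),
            version=data.get("version", ""),
            display_name=data.get("displayName", ""),
            description=data.get("description", ""),
            input_token_limit=data.get("inputTokenLimit", 0),
            output_token_limit=data.get("outputTokenLimit", 0),
            supported_generation_methods=list(data.get("supportedGenerationMethods", [])),
            thinking=data.get("thinking", False),
            temperature=data.get("temperature"),
            max_temperature=data.get("maxTemperature"),
            top_p=data.get("topP"),
            top_k=data.get("topK"),
        )

    def supports(self, method: str) -> bool:
        """True if method (for example "generateContent") is listed as supported."""
        return method in self.supported_generation_methods

    def temperature_in_range(self, value: float) -> bool:
        """Check value against the inclusive range [0.0, maxTemperature]."""
        if value < 0.0:
            return False
        # Assumption: without maxTemperature, only the lower bound is enforced.
        return self.max_temperature is None or value <= self.max_temperature
```

A caller could use `supports("generateContent")` before sending a generation request, and `top_k is None` to decide whether to omit topK from the generation parameters.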
Method: models.predict
Performs a prediction request.
Endpoint
POST https://generativelanguage.googleapis.com/v1beta/{model=models/*}:predict
Path parameters
model string
Required. The name of the model for prediction. Format: models/{model}.
Request body
The request body contains data with the following structure:
instances[] value
Required. The instances that are the input to the prediction call.
parameters value
Optional. The parameters that govern the prediction call.
Response body
Response message for [PredictionService.Predict].
If successful, the response body contains data with the following structure:
predictions[] value
The outputs of the prediction call.
| JSON representation |
|---|
{ "predictions": [ value ] } |
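A hedged Python sketch of assembling this call is shown below. The body keys `instances` and `parameters` are inferred from the field descriptions above, the `x-goog-api-key` header is an assumed authentication scheme, and the helper names are hypothetical. The shape of each instance depends on the model and is not specified on this page, so the sketch passes instances through unchanged.

```python
import json
import urllib.request
from typing import Any, List, Optional

BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def predict_url(model: str) -> str:
    """Build the models.predict URL for a name of the form models/{model}."""
    return f"{BASE_URL}/{model}:predict"


def build_predict_request(
    model: str,
    instances: List[Any],
    parameters: Optional[Any] = None,
    api_key: Optional[str] = None,
) -> urllib.request.Request:
    """Create the POST request; parameters is left out of the body when not given."""
    if not instances:
        raise ValueError("instances is required and must not be empty")
    body = {"instances": instances}
    if parameters is not None:
        body["parameters"] = parameters
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["x-goog-api-key"] = api_key  # assumed authentication header
    return urllib.request.Request(
        predict_url(model),
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send_predict(request: urllib.request.Request) -> list:
    """Send the request and return the predictions array from the response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response).get("predictions", [])
```

Building the request separately from sending it makes the payload easy to inspect or log before any network traffic occurs.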
Method: models.predictLongRunning
Same as models.predict, but returns a long-running operation (LRO).
Endpoint
POST https://generativelanguage.googleapis.com/v1beta/{model=models/*}:predictLongRunning
Path parameters
model string
Required. The name of the model for prediction. Format: models/{model}.
Request body
The request body contains data with the following structure:
instances[] value
Required. The instances that are the input to the prediction call.
parameters value
Optional. The parameters that govern the prediction call.
Response body
If successful, the response body contains an instance of Operation.
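This page does not describe the Operation resource. As a rough sketch, assuming it follows the common long-running operation shape (a `done` flag, then either an `error` or a `response` field), a client could interpret a returned Operation as shown below. The helper `operation_result` and the exception `OperationFailed` are hypothetical names, and how the Operation is fetched again while it is still running is left out.

```python
from typing import Any, Optional


class OperationFailed(Exception):
    """Raised when a finished Operation carries an error."""


def operation_result(operation: dict) -> Optional[Any]:
    """Return the response of a finished Operation, or None while it is still running."""
    if not operation.get("done", False):
        return None  # still running: fetch the Operation again later
    if "error" in operation:
        error = operation["error"]
        raise OperationFailed(f"{error.get('code')}: {error.get('message')}")
    return operation.get("response")
```

Returning None for an unfinished Operation lets the caller decide on its own retry interval instead of hard-coding one here.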