Priced to help you bring your app to the world
Our fastest multimodal model with exceptional speed and efficiency for quick, high-frequency tasks. Currently available in preview.
Free of charge*
Rate Limits**
15 RPM (requests per minute)
1 million TPM (tokens per minute)
1,500 RPD (requests per day)
Price (input)
Free of charge
Context caching - coming soon
Not applicable
Price (output)
Free of charge
Prompts/responses used to improve our products
Yes
Pay-as-you-go (prices in USD)***
Rate Limits**
360 RPM (requests per minute)
10 million TPM (tokens per minute)
10,000 RPD (requests per day)
Price (input)
$0.35 / 1 million tokens (for prompts up to 128K tokens)
$0.70 / 1 million tokens (for prompts longer than 128K)
Context caching - coming soon
Not applicable
Price (output)
$1.05 / 1 million tokens (for prompts up to 128K tokens)
$2.10 / 1 million tokens (for prompts longer than 128K)
Prompts/responses used to improve our products
No
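As a rough illustration of how the pay-as-you-go rates above combine, the sketch below estimates the cost of a single request. It rests on two assumptions that the listing does not spell out: that "128K" means 128,000 tokens, and that the prompt length selects one rate for the whole request rather than being billed marginally. The function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)

# USD per 1 million tokens, copied from the pay-as-you-go listing above.
RATES = {
    "short": {"input": Decimal("0.35"), "output": Decimal("1.05")},
    "long": {"input": Decimal("0.70"), "output": Decimal("2.10")},
}

# Assumption: "128K" is read as 128,000 tokens (it could also mean 131,072).
PROMPT_THRESHOLD = 128_000


def estimate_cost(prompt_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption: the prompt length picks the tier, and that tier's rates
    apply to every input and output token in the request.
    """
    tier = "short" if prompt_tokens <= PROMPT_THRESHOLD else "long"
    rates = RATES[tier]
    total = prompt_tokens * rates["input"] + output_tokens * rates["output"]
    return total / MILLION


if __name__ == "__main__":
    print(estimate_cost(100_000, 2_000))  # prompt under the threshold
    print(estimate_cost(200_000, 2_000))  # prompt over the threshold
```

Under these assumptions, a 100,000-token prompt with 2,000 output tokens comes to $0.0371, and a 200,000-token prompt with the same output comes to $0.1442.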
Our high-performing multimodal model for complex tasks requiring deep reasoning and nuanced understanding. Currently available in preview.
Free of charge*
Rate Limits**
2 RPM (requests per minute)
32,000 TPM (tokens per minute)
50 RPD (requests per day)
Price (input)
Free of charge
Context caching - coming soon
Not applicable
Price (output)
Free of charge
Prompts/responses used to improve our products
Yes
Pay-as-you-go (prices in USD)***
Rate Limits**
360 RPM (requests per minute)
10 million TPM (tokens per minute)
10,000 RPD (requests per day)
Price (input)
$3.50 / 1 million tokens (for prompts up to 128K tokens)
$7.00 / 1 million tokens (for prompts longer than 128K)
Context caching - coming soon
$1.75 / 1 million tokens (for prompts up to 128K tokens)
$3.50 / 1 million tokens (for prompts longer than 128K)
$4.50 / 1 million tokens per hour (storage)
Price (output)
$10.50 / 1 million tokens (for prompts up to 128K tokens)
$21.00 / 1 million tokens (for prompts longer than 128K)
Prompts/responses used to improve our products
No
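Context caching is listed as coming soon, so its exact billing mechanics are not described here. The sketch below compares input costs with and without caching for prompts up to 128K tokens, under these assumptions: cached tokens are billed at the cached rate on every request, storage is billed in proportion to cached tokens and hours kept, and there is no separate charge for creating the cache. Output tokens are left out because they cost the same either way. The function names are hypothetical.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)

# USD rates from the pay-as-you-go listing above (prompts up to 128K tokens).
INPUT_RATE = Decimal("3.50")         # per 1M input tokens
CACHED_INPUT_RATE = Decimal("1.75")  # per 1M cached tokens
STORAGE_RATE = Decimal("4.50")       # per 1M cached tokens per hour


def cost_without_cache(shared_tokens: int, fresh_tokens: int, requests: int) -> Decimal:
    """Input cost when the shared prefix is resent in full on every request."""
    return requests * (shared_tokens + fresh_tokens) * INPUT_RATE / MILLION


def cost_with_cache(shared_tokens: int, fresh_tokens: int, requests: int, hours) -> Decimal:
    """Input cost when the shared prefix is cached for `hours` hours.

    Assumption: no one-time charge for creating the cache.
    """
    per_request = (shared_tokens * CACHED_INPUT_RATE + fresh_tokens * INPUT_RATE) / MILLION
    storage = shared_tokens * STORAGE_RATE * Decimal(str(hours)) / MILLION
    return requests * per_request + storage


if __name__ == "__main__":
    # 100,000 shared tokens, 1,000 fresh tokens per request, cache kept 1 hour.
    for n in (2, 10):
        print(n, cost_without_cache(100_000, 1_000, n), cost_with_cache(100_000, 1_000, n, 1))
```

With these numbers, caching costs more at 2 requests per hour ($0.807 versus $0.707) and less at 10 requests per hour ($2.235 versus $3.535), so under these assumptions the storage fee only pays off when the cached prefix is reused often enough.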
Our first-generation model offering only text and image reasoning. Generally available for production use.
Free of charge
Rate Limits**
15 RPM (requests per minute)
32,000 TPM (tokens per minute)
1,500 RPD (requests per day)
Price (input)
Free of charge
Context caching - coming soon
Not applicable
Price (output)
Free of charge
Prompts/responses used to improve our products
Yes
Pay-as-you-go (prices in USD)
Rate Limits**
360 RPM (requests per minute)
120,000 TPM (tokens per minute)
30,000 RPD (requests per day)
Price (input)
$0.50 / 1 million tokens**
Context caching - coming soon
Not available
Price (output)
$1.50 / 1 million tokens**
Prompts/responses used to improve our products
No
*Free tier is not available in EEA (including EU), UK and CH.
**Specified rate limits are not guaranteed and actual capacity may vary. Apply for an increased maximum rate limit (for paid tier only).
***Tuned model inference costs are billed at the same price as the base models.
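To stay under a requests-per-minute limit such as those listed above, a client can pace its own calls. The following is a minimal sliding-window sketch that does not use any official SDK; the class name `RequestThrottle` and its parameters are hypothetical. It tracks RPM only, not TPM or RPD, and because the stated limits are not guaranteed, a real client would still need to handle rejected requests.

```python
import time
from collections import deque


class RequestThrottle:
    """Client-side sliding-window limiter for a requests-per-minute cap."""

    def __init__(self, max_requests, window_seconds=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock    # injectable so the limiter can be tested
        self._sleep = sleep
        self._sent = deque()   # timestamps of requests inside the window

    def wait_time(self):
        """Seconds to wait before the next request fits in the window."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self._window:
            self._sent.popleft()
        if len(self._sent) < self._max:
            return 0.0
        return self._window - (now - self._sent[0])

    def acquire(self):
        """Block until a request slot is free, then record the request."""
        while True:
            delay = self.wait_time()
            if delay <= 0:
                break
            self._sleep(delay)
        self._sent.append(self._clock())


if __name__ == "__main__":
    throttle = RequestThrottle(max_requests=15)  # free-tier figure listed above
    for i in range(3):
        throttle.acquire()
        print("request", i, "may be sent now")
```

Calling `acquire()` before each API call keeps the client at or below the configured number of requests in any rolling 60-second window.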