Gemini API Overview

The Gemini API gives you access to the latest generative models from Google. Once you're familiar with the general features available to you through the API, try a quickstart for your language of choice to start developing.

Models

Gemini is a series of multimodal generative AI models developed by Google. Gemini models can accept text and image in prompts, depending on what model variation you choose, and output text responses. The legacy PaLM models accept text-only and output text responses.

To get more detailed model information refer to the models page. You can also use the list_models method to list all the models available and then the get_model method to get the metadata for a particular model.

Prompt data and design

Specific Gemini models accept both images and text data as input. This capability creates many additional possibilities for generating content, analyzing data, and solving problems. There are some limitations and requirements to consider, including the general input token limit for the model you are using. For information on the token limits for specific models, see Gemini models.

Image requirements for prompts

Prompts that use image data are subject to the following limitations and requirements:

  • Images must be in one of the following image data MIME types:
    • PNG - image/png
    • JPEG - image/jpeg
    • WEBP - image/webp
    • HEIC - image/heic
    • HEIF - image/heif
  • Maximum of 16 individual images
  • Maximum of 4MB for the entire prompt, including images and text
  • No specific limits to the number of pixels in an image; however, larger images are scaled down to fit a maximum resolution of 3072 x 3072 while preserving their original aspect ratio.

When using images in your prompt, follow these recommendations for best results:

  • Prompts with a single image tend to yield better results.

Prompt design and text input

Creating effective prompts, or prompt engineering, is a combination of art and science. See the prompt guidelines for guidance on how to approach prompting and the prompt 101 guide to learn about different approaches to prompting.

Generate content

The Gemini API lets you use both text and image data for prompting, depending on what model variation you use. For example, you can generate text using text prompts with the gemini-pro model and use both text and image data to prompt the gemini-pro-vision model. This section gives simple code examples of each. Refer to the generateContent API reference for a more detailed example that covers all of the parameters.

Text and image input

You can send a text prompt with an image to the gemini-pro-vision model to perform a vision related task. For example, captioning an image or identifying what's in an image.

The following code examples demonstrate a simple implementation of a text and image prompt for each supported language:

Python

model = genai.GenerativeModel('gemini-pro-vision')

cookie_picture = [{
    'mime_type': 'image/png',
    'data': Path('cookie.png').read_bytes()
}]
prompt = "Do these look store-bought or homemade?"

response = model.generate_content(
    model="gemini-pro-vision",
    content=[prompt, cookie_picture]
)
print(response.text)

See the Python quickstart to see the complete code snippet.

Go

vmodel := client.GenerativeModel("gemini-pro-vision")

data, err := os.ReadFile(filepath.Join("path-to-image", imageFile))
if err != nil {
  log.Fatal(err)
}
resp, err := vmodel.GenerateContent(ctx, genai.Text("Do these look store-bought or homemade?"), genai.ImageData("jpeg", data))
if err != nil {
  log.Fatal(err)
}

See the Go quickstart for a full example.

Node.js

const model = genAI.getGenerativeModel({ model: "gemini-pro-vision" });

const prompt = "Do these look store-bought or homemade?";
const image = {
  inlineData: {
    data: Buffer.from(fs.readFileSync("cookie.png")).toString("base64"),
    mimeType: "image/png",
  },
};

const result = await model.generateContent([prompt, image]);
console.log(result.response.text());

See the Node.js quickstart for a full example.

Web

const model = genAI.getGenerativeModel({ model: "gemini-pro-vision" });

const prompt = "Do these look store-bought or homemade?";
const image = {
  inlineData: {
    data: base64EncodedImage /* see JavaScript quickstart for details */,
    mimeType: "image/png",
  },
};

const result = await model.generateContent([prompt, image]);
console.log(result.response.text());

See the JavaScript quickstart for a full example.

Dart (Flutter)

final model = GenerativeModel(model: 'gemini-pro-vision', apiKey: apiKey);
final prompt = 'Do these look store-bought or homemade?';
final imageBytes = await File('cookie.png').readAsBytes();
final content = [
  Content.multi([
    TextPart(prompt),
    DataPart('image/png', imageBytes),
  ])
];

final response = await model.generateContent(content);
print(response.text);

See the Dart (Flutter) quickstart for a full example.

Swift

let model = GenerativeModel(name: "gemini-pro-vision", apiKey: "API_KEY")
let cookieImage = UIImage(...)
let prompt = "Do these look store-bought or homemade?"

let response = try await model.generateContent(prompt, cookieImage)

See the Swift quickstart for a full example.

Android

val generativeModel = GenerativeModel(
    modelName = "gemini-pro-vision",
    apiKey = BuildConfig.apiKey
)

val cookieImage: Bitmap = // ...
val inputContent = content() {
  image(cookieImage)
  text("Do these look store-bought or homemade?")
}

val response = generativeModel.generateContent(inputContent)
print(response.text)

See the Android quickstart for a full example.

cURL

curl https://generativelanguage.googleapis.com/v1/models/gemini-pro-vision:GenerateContent?key=${API_KEY} \
    -H 'Content-Type: application/json' \
    -d @<(echo'{
          "contents":[
            { "parts":[
                {"text": "Do these look store-bought or homemade?"},
                { "inlineData": {
                    "mimeType": "image/png",
                    "data": "'$(base64 -w0 cookie.png)'"
                  }
                }
              ]
            }
          ]
         }')

See the REST API quickstart for more details.

Text only input

The Gemini API can also handle text-only input. This feature lets you perform natural language processing (NLP) tasks such as text completion and summarization.

The following code examples demonstrate a simple implementation of a text-only prompt for each supported language:

Python

model = genai.GenerativeModel('gemini-pro-vision')

prompt = "Write a story about a magic backpack."

response = model.generate_content(prompt)

See the Python quickstart for the full example.

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, option.WithAPIKey(os.Getenv("API_KEY")))
if err != nil {
  log.Fatal(err)
}
defer client.Close()

model := client.GenerativeModel("gemini-pro")
resp, err := model.GenerateContent(ctx, genai.Text("Write a story about a magic backpack."))
if err != nil {
  log.Fatal(err)
}

See the Go quickstart for a full example.

Node.js

const model = genAI.getGenerativeModel({ model: "gemini-pro" });
const prompt = "Write a story about a magic backpack.";

const result = await model.generateContent(prompt);
console.log(result.response.text());

See the Node.js quickstart for a full example.

Web

const model = genAI.getGenerativeModel({ model: "gemini-pro" });
const prompt = "Write a story about a magic backpack.";

const result = await model.generateContent(prompt);
console.log(result.response.text());

See the JavaScript quickstart for a full example.

Dart (Flutter)

final model = GenerativeModel(model: 'gemini-pro', apiKey: apiKey);
final prompt = 'Write a story about a magic backpack.';
final content = [Content.text(prompt)];
final response = await model.generateContent(content);
print(response.text);

See the Dart (Flutter) quickstart for a full example.

Swift

let model = GenerativeModel(name: "gemini-pro", apiKey: "API_KEY")
let prompt = "Write a story about a magic backpack."

let response = try await model.generateContent(prompt)

See the quickstart for Swift for a full example.

Android

val generativeModel = GenerativeModel(
    modelName = "gemini-pro",
    apiKey = BuildConfig.apiKey
)

val prompt = "Write a story about a magic backpack."
val response = generativeModel.generateContent(prompt)
print(response.text)

See the Android quickstart for a full example.

cURL

curl https://generativelanguage.googleapis.com/v1/models/gemini-pro:generateContent?key=$API_KEY \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{ "contents":[
      { "parts":[{"text": "Write a story about a magic backpack"}]}
    ]
}'

See the REST API quickstart for more details.

Multi-turn conversations (chat)

You can use the Gemini API to build interactive chat experiences for your users. Using the chat feature of the API lets you collect multiple rounds of questions and responses, allowing users to incrementally step toward answers or get help with multi-part problems. This feature is ideal for applications that require ongoing communication, such as chatbots, interactive tutors, or customer support assistants.

The following code examples demonstrate a simple implementation of chat interaction for each supported language:

Python

  model = genai.GenerativeModel('gemini-pro')
  chat = model.start_chat(history=[])

  response = chat.send_message(
      "Pretend you\'re a snowman and stay in character for each response.")
  print(response.text)

  response = chat.send_message(
      "What\'s your favorite season of the year?")
  print(response.text)

See the chat demo in the Python quickstart for a full example.

Go

model := client.GenerativeModel("gemini-pro")
cs := model.StartChat()
cs.History = []*genai.Content{
  &genai.Content{
    Parts: []genai.Part{
      genai.Text("Pretend you're a snowman and stay in character for each response."),
    },
    Role: "user",
  },
  &genai.Content{
    Parts: []genai.Part{
      genai.Text("Hello! It's cold! Isn't that great?"),
    },
    Role: "model",
  },
}

resp, err := cs.SendMessage(ctx, genai.Text("What's your favorite season of the year?"))
if err != nil {
  log.Fatal(err)
}

See the chat demo in the Go quickstart for a full example.

Node.js

const model = genAI.getGenerativeModel({ model: "gemini-pro"});

const chat = model.startChat({
  history: [
    {
      role: "user",
      parts: "Pretend you're a snowman and stay in character for each response.",
    },
    {
      role: "model",
      parts: "Hello! It's cold! Isn't that great?",
    },
  ],
  generationConfig: {
    maxOutputTokens: 100,
  },
});

const msg = "What's your favorite season of the year?";
const result = await chat.sendMessage(msg);
console.log(result.response.text());

See the chat demo in the Node.js quickstart for a full example.

Web

const model = genAI.getGenerativeModel({ model: "gemini-pro"});

const chat = model.startChat({
  history: [
    {
      role: "user",
      parts: "Pretend you're a snowman and stay in character for each response.",
    },
    {
      role: "model",
      parts: "Hello! It's so cold! Isn't that great?",
    },
  ],
  generationConfig: {
    maxOutputTokens: 100,
  },
});

const msg = "What's your favorite season of the year?";
const result = await chat.sendMessage(msg);
console.log(result.response.text());

See the chat demo in the JavaScript quickstart for a full example.

Dart (Flutter)

final model = GenerativeModel(model: 'gemini-pro', apiKey: apiKey);
final chat = model.startChat(history: [
  Content.text(
      "Pretend you're a snowman and stay in character for each response."),
  Content.model([TextPart("Hello! It's cold! Isn't that great?")]),
]);
final content = Content.text("What's your favorite season of the year?");
final response = await chat.sendMessage(content);
print(response.text);

See the chat demo in the Dart (Flutter) quickstart for a full example.

Swift

let model = GenerativeModel(name: "gemini-pro", apiKey: "API_KEY")
let chat = model.startChat()

var message = "Pretend you're a snowman and stay in character for each response."
var response = try await chat.sendMessage(message)

message = "What\'s your favorite season of the year?"
response = try await chat.sendMessage(message)

See the chat demo in the Swift quickstart for a full example.

Android

val generativeModel = GenerativeModel(
    modelName = "gemini-pro",
    apiKey = BuildConfig.apiKey
)

val chat = generativeModel.startChat()
val response = chat.sendMessage("Pretend you're a snowman and stay in
        character for each response.")
print(response.text)

See the Android quickstart for a full example.

cURL

curl https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=$API_KEY \
    -H 'Content-Type: application/json' \
    -X POST \
    -d '{
      "contents": [
        {"role":"user",
         "parts":[{
           "text": "Pretend you're a snowman and stay in character for each
        {"role": "model",
            response."}]},
         "parts":[{
           "text": "Hello! It's so cold! Isn't that great?"}]},
        {"role": "user",
         "parts":[{
           "text": "What\'s your favorite season of the year?"}]},
       ]
    }' 2> /dev/null | grep "text"
# response example:
"text": "Winter, of course!"

See the REST API quickstart for more details.

Streamed responses

The Gemini API provides an additional way to receive responses from generative AI models: as a data stream. A streamed response sends incremental pieces of data back to your application as it is generated by the model. This feature lets you respond quickly to a user request to show progress and create a more interactive experience.

Streamed responses are an option for freeform prompting and chats with Gemini models. The following code examples show how to request a streamed response for a prompt for each supported language:

Python

prompt = "Write a story about a magic backpack."

response = genai.stream_generate_content(
    model="models/gemini-pro",
    prompt=prompt
)

See the Python quickstart to see the complete code snippet.

Go

ctx := context.Background()
client, err := genai.NewClient(ctx, option.WithAPIKey(os.Getenv("API_KEY")))
if err != nil {
  log.Fatal(err)
}
defer client.Close()

model := client.GenerativeModel("gemini-pro")

iter := model.GenerateContentStream(ctx, genai.Text("Write a story about a magic backpack."))
for {
  resp, err := iter.Next()
  if err == iterator.Done {
    break
  }
  if err != nil {
    log.Fatal(err)
  }

  // print resp
}

See the Go quickstart for a full example.

Node.js

const model = genAI.getGenerativeModel({ model: "gemini-pro" });
const prompt = "Write a story about a magic backpack.";

const result = await model.generateContentStream([prompt]);
// print text as it comes in
for await (const chunk of result.stream) {
  const chunkText = chunk.text();
  console.log(chunkText);
}

See the Node.js quickstart for a full example.

Web

const model = genAI.getGenerativeModel({ model: "gemini-pro" });
const prompt = "Write a story about a magic backpack.";

const result = await model.generateContentStream([prompt]);
// print text as it comes in
for await (const chunk of result.stream) {
  const chunkText = chunk.text();
  console.log(chunkText);
}

See the JavaScript quickstart for a full example.

Dart (Flutter)

final model = GenerativeModel(model: 'gemini-pro', apiKey: apiKey);
final prompt = 'Write a story about a magic backpack.';
final content = [Content.text(prompt)];
final response = model.generateContentStream(content);
await for (final chunk in response) {
  print(chunk.text);
}

See the Dart (Flutter) quickstart for a full example.

Swift

let model = GenerativeModel(name: "gemini-pro", apiKey: "API_KEY")
let prompt = "Write a story about a magic backpack."

let stream = model.generateContentStream(prompt)
for try await chunk in stream {
  print(chunk.text ?? "No content")
}

See the Swift quickstart for a full example.

Android

val generativeModel = GenerativeModel(
    modelName = "gemini-pro",
    apiKey = BuildConfig.apiKey
)

val inputContent = content {
  text("Write a story about a magic backpack.")
}

var fullResponse = ""
generativeModel.generateContentStream(inputContent).collect { chunk ->
  print(chunk.text)
  fullResponse += chunk.text
}

See the Android quickstart for a full example.

cURL

curl https://generativelanguage.googleapis.com/v1/models/gemini-pro:streamGenerateContent?key=${API_KEY} \
    -H 'Content-Type: application/json' \
    --no-buffer \
    -d '{ "contents":[
            {"role": "user",
              "parts":[{"text": "Write a story about a magic backpack."}]
            }
          ]
        }' > response.json

See the REST API quickstart for more details.

Embeddings

The embedding service in the Gemini API generates state-of-the-art embeddings for words, phrases, and sentences. The resulting embeddings can then be used for NLP tasks, such as semantic search, text classification, and clustering, among many others. See the embeddings guide to learn what embeddings are and some key use cases for the embedding service to help you get started.

Next steps