Generate images using Imagen 3

The Gemini API provides access to Imagen 3, Google's highest quality text-to-image model, featuring a number of new and improved capabilities. Imagen 3 can do the following:

  • Generate images with better detail, richer lighting, and fewer distracting artifacts than previous models
  • Understand prompts written in natural language
  • Generate images in a wide range of formats and styles
  • Render text more effectively than previous models

To learn more and see example output, see the Google DeepMind Imagen 3 overview.

Before you begin: Set up your project and API key

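Before calling the Gemini API, you need to set up your project and configure your API key.

Then install the Python SDK. The following command installs it directly from the SDK's GitHub repository at the imagen ref:

pip install -U git+https://github.com/google-gemini/generative-ai-python@imagen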
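
The example code in this guide reads the key from the API_KEY environment variable. As a minimal sketch, a small helper can fail early with a clear message when that variable is missing. The name load_api_key is hypothetical and not part of the SDK:

```python
import os


def load_api_key(env=None, var_name="API_KEY"):
    """Return the API key stored in `var_name`, or raise a descriptive error.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Environment variable {var_name} is not set. "
            "Create an API key and export it before running the examples."
        )
    return key
```

The returned value could then be passed to the SDK as genai.configure(api_key=load_api_key()).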

Generate images

This section shows you how to instantiate an Imagen model and generate images.

To run the example code, you must first install Pillow:

pip install --upgrade Pillow

Then, with Pillow and the Python SDK installed, you can use the following code to generate images:

import os
import google.generativeai as genai

genai.configure(api_key=os.environ['API_KEY'])

imagen = genai.ImageGenerationModel("imagen-3.0-generate-001")

result = imagen.generate_images(
    prompt="Fuzzy bunnies in my kitchen",
    number_of_images=4,
    safety_filter_level="block_only_high",
    person_generation="allow_adult",
    aspect_ratio="3:4",
    negative_prompt="Outside",
)

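# Print a summary of each generated image object.
for image in result.images:
    print(image)

# Open each image in your operating system's default image viewer.
# Note: _pil_image is a private attribute that holds the Pillow image.
for image in result.images:
    image._pil_image.show()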

This code should open four images similar to this one:

AI-generated image of two fuzzy bunnies in the kitchen

Imagen model parameters

The following parameters are available for generate_images():

  • prompt: The text prompt for the image.
  • negative_prompt: A description of what you want to omit in the generated images. Defaults to none.

    For example, consider the prompt "a rainy city street at night with no people". The model might treat "people" as something to include rather than omit. For better results, use the prompt "a rainy city street at night" with the negative prompt "people".

  • number_of_images: The number of images to generate, from 1 to 4 (inclusive). The default is 4.

  • aspect_ratio: Changes the aspect ratio of the generated image. Supported values are "1:1", "3:4", "4:3", "9:16", and "16:9". The default is "1:1".

  • safety_filter_level: Adds a filter level to safety filtering. The following values are valid:

    • "block_low_and_above": Block when the probability score or the severity score is LOW, MEDIUM, or HIGH.
    • "block_medium_and_above": Block when the probability score or the severity score is MEDIUM or HIGH.
    • "block_only_high": Block when the probability score or the severity score is HIGH.
  • person_generation: Allow the model to generate images of people. The following values are supported:

    • "dont_allow": Block generation of images of people.
    • "allow_adult": Generate images of adults, but not children.
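
The documented values above can be checked before a request is sent. The following sketch is not part of the SDK: build_generate_kwargs and the constant names are hypothetical, and the allowed values are copied from the parameter list above. Parameters whose defaults are not documented here (safety_filter_level and person_generation) are left out of the result unless you set them explicitly.

```python
# Allowed values, copied from the parameter list in this guide.
ASPECT_RATIOS = {"1:1", "3:4", "4:3", "9:16", "16:9"}
SAFETY_FILTER_LEVELS = {
    "block_low_and_above",
    "block_medium_and_above",
    "block_only_high",
}
PERSON_GENERATION = {"dont_allow", "allow_adult"}


def build_generate_kwargs(
    prompt,
    negative_prompt=None,
    number_of_images=4,
    aspect_ratio="1:1",
    safety_filter_level=None,
    person_generation=None,
):
    """Validate arguments against the documented values and return them as a dict."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if isinstance(number_of_images, bool) or not isinstance(number_of_images, int):
        raise ValueError("number_of_images must be an integer")
    if not 1 <= number_of_images <= 4:
        raise ValueError("number_of_images must be between 1 and 4 (inclusive)")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")

    kwargs = {
        "prompt": prompt,
        "number_of_images": number_of_images,
        "aspect_ratio": aspect_ratio,
    }
    # Optional parameters are only included when the caller sets them.
    if negative_prompt:
        kwargs["negative_prompt"] = negative_prompt
    if safety_filter_level is not None:
        if safety_filter_level not in SAFETY_FILTER_LEVELS:
            raise ValueError(
                f"safety_filter_level must be one of {sorted(SAFETY_FILTER_LEVELS)}"
            )
        kwargs["safety_filter_level"] = safety_filter_level
    if person_generation is not None:
        if person_generation not in PERSON_GENERATION:
            raise ValueError(
                f"person_generation must be one of {sorted(PERSON_GENERATION)}"
            )
        kwargs["person_generation"] = person_generation
    return kwargs
```

With an Imagen model instantiated as shown earlier, the result could be unpacked into the call, for example imagen.generate_images(**build_generate_kwargs("a rainy city street at night", negative_prompt="people")).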

Text prompt language

The following input text prompt languages are supported:

  • Chinese (simplified) (zh/zh-CN)
  • Chinese (traditional) (zh-TW)
  • English (en)
  • Hindi (hi)
  • Japanese (ja)
  • Korean (ko)
  • Portuguese (pt)
  • Spanish (es)
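
If your application accepts prompts in several languages, you may want to check a language code against this list before sending a request. The sketch below is an application-side helper only; nothing in this guide indicates that the API itself takes a language parameter. The names SUPPORTED_PROMPT_LANGUAGES and prompt_language_name are hypothetical, and the codes are copied from the list above.

```python
# Language codes from the list above, stored in lowercase for matching.
SUPPORTED_PROMPT_LANGUAGES = {
    "zh": "Chinese (simplified)",
    "zh-cn": "Chinese (simplified)",
    "zh-tw": "Chinese (traditional)",
    "en": "English",
    "hi": "Hindi",
    "ja": "Japanese",
    "ko": "Korean",
    "pt": "Portuguese",
    "es": "Spanish",
}


def prompt_language_name(code):
    """Return the language name for a supported code, or None if unsupported.

    Matching ignores case and treats "_" like "-" (so "zh_TW" matches "zh-TW").
    """
    normalized = code.strip().replace("_", "-").lower()
    return SUPPORTED_PROMPT_LANGUAGES.get(normalized)
```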

What's next

Imagen 3 in the Gemini API is in early access. Stay tuned for announcements about the status of the feature.