Generate text using the Gemini API

The Gemini API can generate text output from several types of input, including text, images, video, and audio. Common applications of text generation include:

  • Creative writing
  • Describing or interpreting media assets
  • Text completion
  • Summarizing free-form text
  • Translating between languages
  • Your own novel use cases

This guide shows you how to generate text using the generateContent and streamGenerateContent APIs and the server-side SDK of your choice. The focus is on text output from text-only and text-and-image input. To learn more about multimodal prompting with video and audio files, see Prompting with media files.
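As a rough illustration of the two methods named above, the following Python sketch calls the REST endpoints directly with the standard library instead of an SDK. The base URL (`v1beta`), the model name `gemini-1.5-flash`, and all helper function names are assumptions made for this example, not values taken from this guide; check the API reference for the versions and models available to you.

```python
import base64
import json
import urllib.request

# Assumptions (not from this guide): the v1beta REST base URL and the model name.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-1.5-flash"


def build_url(method, api_key, model=MODEL):
    """Return the endpoint URL for generateContent or streamGenerateContent."""
    url = f"{BASE_URL}/models/{model}:{method}?key={api_key}"
    if method == "streamGenerateContent":
        url += "&alt=sse"  # request server-sent events, one chunk per event
    return url


def build_body(prompt, image_bytes=None, mime_type="image/jpeg"):
    """Build a request body for text-only or text-and-image input."""
    parts = [{"text": prompt}]
    if image_bytes is not None:
        # Images are sent inline as base64-encoded data next to the text part.
        parts.append({
            "inline_data": {
                "mime_type": mime_type,
                "data": base64.b64encode(image_bytes).decode("ascii"),
            }
        })
    return {"contents": [{"parts": parts}]}


def extract_text(response):
    """Join the text parts of the first candidate in a response dict."""
    candidates = response.get("candidates", [])
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(part.get("text", "") for part in parts)


def _post(method, api_key, body):
    """Create a JSON POST request for the given method (hypothetical helper)."""
    return urllib.request.Request(
        build_url(method, api_key),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def generate_text(prompt, api_key, image_bytes=None):
    """Send one generateContent request and return the full generated text."""
    request = _post("generateContent", api_key, build_body(prompt, image_bytes))
    with urllib.request.urlopen(request) as reply:
        return extract_text(json.load(reply))


def stream_text(prompt, api_key):
    """Yield text chunks from streamGenerateContent as they arrive."""
    request = _post("streamGenerateContent", api_key, build_body(prompt))
    with urllib.request.urlopen(request) as reply:
        for raw_line in reply:
            line = raw_line.decode("utf-8").strip()
            if line.startswith("data:"):
                yield extract_text(json.loads(line[len("data:"):]))
```

With a valid API key, `generate_text("Write a haiku about rivers", api_key)` would return the whole response at once, while iterating over `stream_text(...)` would print partial text as it is produced. Passing `image_bytes` read from a local file turns the same call into a text-and-image prompt.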

What's next

This guide covered using generateContent and streamGenerateContent to generate text from text-only and text-and-image input. To learn more about generating text using the Gemini API, see the following resources:

  • Prompting with media files: The Gemini API supports multimodal prompting, which combines text, image, audio, and video data in a single prompt.
  • System instructions: System instructions let you steer the behavior of the model based on your specific needs and use cases.
  • Safety guidance: Sometimes generative AI models produce unexpected outputs, such as outputs that are inaccurate, biased, or offensive. Post-processing and human evaluation are essential to limit the risk of harm from such outputs.