Explore document processing capabilities with the Gemini API
The Gemini API can process and run inference on PDF documents passed to it. When
a PDF is uploaded, the Gemini API can:
Describe or answer questions about the content
Summarize the content
Extrapolate from the content
This tutorial demonstrates some possible ways to prompt the Gemini API with
provided PDF documents. All output is text-only.
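As a rough illustration of pairing a PDF with a text prompt, the sketch below builds a JSON request body using only the Python standard library. It does not call the API. The field names (`contents`, `parts`, `inline_data`, `mime_type`) reflect an assumed REST request shape for generateContent and should be checked against the API reference. The helper name `build_pdf_request` and the placeholder bytes are hypothetical.

```python
import base64
import json


def build_pdf_request(pdf_bytes: bytes, prompt: str) -> dict:
    """Build a request body carrying one PDF and one text prompt.

    Assumption: the field names below match the generateContent REST
    request shape; verify them against the API reference before use.
    """
    return {
        "contents": [
            {
                "parts": [
                    {
                        # The PDF travels inline as base64-encoded text.
                        "inline_data": {
                            "mime_type": "application/pdf",
                            "data": base64.b64encode(pdf_bytes).decode("ascii"),
                        }
                    },
                    # The text prompt tells the model what to do with the PDF.
                    {"text": prompt},
                ]
            }
        ]
    }


# Placeholder bytes stand in for a real file, e.g. open(path, "rb").read().
sample_pdf = b"%PDF-1.4 placeholder"
body = build_pdf_request(sample_pdf, "Summarize this document.")

# Serialize to the JSON string that would be sent as the HTTP request body.
payload = json.dumps(body)
```

Changing only the prompt string (for example, to ask a question about the content) covers the other uses listed above. Sending the payload requires an endpoint, a model name, and credentials, none of which are shown here.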
What's next
This guide shows how to use
generateContent
to generate text outputs from processed documents. To learn more,
see the following resources:
File prompting strategies: The
Gemini API supports prompting with text, image, audio, and video data, also
known as multimodal prompting.
System instructions: System
instructions let you steer the behavior of the model based on your specific
needs and use cases.
Safety guidance: Sometimes generative AI
models produce unexpected outputs, such as outputs that are inaccurate,
biased, or offensive. Post-processing and human evaluation are essential to
limit the risk of harm from such outputs.