The Gemini API can process and run inference on PDF documents passed to it. When a PDF is uploaded, the Gemini API can:
- Describe or answer questions about the content
- Summarize the content
- Extrapolate from the content
This tutorial demonstrates some possible ways to prompt the Gemini API with provided PDF documents. All output is text-only.
What's next
This guide shows how to use
generateContent
and
to generate text outputs from processed documents. To learn more,
see the following resources:
- File prompting strategies: The Gemini API supports prompting with text, image, audio, and video data, also known as multimodal prompting.
- System instructions: System instructions let you steer the behavior of the model based on your specific needs and use cases.
- Safety guidance: Sometimes generative AI models produce unexpected outputs, such as outputs that are inaccurate, biased, or offensive. Post-processing and human evaluation are essential to limit the risk of harm from such outputs.