On-device generation with Gemma

You can run Gemma models completely on-device with the MediaPipe LLM Inference API. The LLM Inference API acts as a wrapper for large language models, enabling you to run Gemma models on-device for common text-to-text generation tasks like information retrieval, email drafting, and document summarization.
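As a concrete illustration, the following Kotlin sketch shows what on-device generation with the LLM Inference API looks like on Android. The model path is an example value, not a requirement: you supply whichever Gemma model file you have downloaded to the device.

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

fun generateOnDevice(context: Context, prompt: String): String {
    // Configure the task with a Gemma model stored on the device.
    // The path below is an example; use the location of your model file.
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/llm/gemma-2b-it-gpu-int4.bin")
        .setMaxTokens(512)
        .build()

    // Create the inference task and run a single text-to-text generation.
    val llmInference = LlmInference.createFromOptions(context, options)
    return llmInference.generateResponse(prompt)
}
```

Calling `generateOnDevice(context, "Draft a short email declining a meeting.")` returns the generated text synchronously; the API also offers asynchronous generation for streaming partial results as they are produced.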

Try the LLM Inference API with MediaPipe Studio, a web-based application for evaluating and customizing on-device models.

The LLM Inference API is available on the following platforms:

- Android
- iOS
- Web (JavaScript)

To learn more, refer to the MediaPipe LLM Inference documentation.