Gemma models overview

Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Developed by Google DeepMind and other teams across Google, Gemma is named after the Latin gemma, meaning "precious stone." The Gemma model weights are supported by developer tools that promote innovation, collaboration, and the responsible use of artificial intelligence (AI).

The Gemma models are available to run in your applications and on your hardware, mobile devices, or hosted services. You can also customize these models using tuning techniques so that they excel at performing tasks that matter to you and your users. Gemma models draw inspiration and technological lineage from the Gemini family of models, and are made for the AI development community to extend and take further.

You can use Gemma models for text generation, however you can also tune these models to specialize in performing specific tasks. Tuned Gemma models can provide you and your users with more targeted and efficient generative AI solutions. Check out our guide on tuning with LoRA and try it out! We are excited to see what you build with Gemma!

This developer documentation provides an overview of the available Gemma models and development guides for how to apply them and tune them for specific applications.

Model sizes and capabilities

Gemma models are available in several sizes so you can build generative AI solutions based on your available computing resources, the capabilities you need, and where you want to run them. If you are not sure where to start, try the 2B parameter size for the lower resource requirements and more flexibility in where you deploy the model.

Parameters size Input Output Tuned versions Intended platforms
2B Text Text
  • Pretrained
  • Instruction tuned
Mobile devices and laptops
7B Text Text
  • Pretrained
  • Instruction tuned
Desktop computers and small servers

Using the Keras 3.0 multi-backed feature, you can run these models on TensorFlow, JAX, and PyTorch, or use the built-in implementation with JAX (based on the FLAX framework) and PyTorch.

You can download the Gemma models from Kaggle Models or deploy them on Vertex AI.

Tuned models

You can modify the behavior of Gemma models with additional training so the model performs better on specific tasks. This process is called model tuning, and while this technique improves the ability of a model to perform targeted tasks, it can also cause the model to become worse at other tasks. For this reason, Gemma models are available in both instruction tuned and pretrained versions:

  • Pretrained - These versions of the model are not trained on any specific tasks or instructions beyond the Gemma core data training set. You should not deploy these models without performing some tuning.
  • Instruction tuned - These versions of the model are trained with human language interactions and can respond to conversational input, similar to a chat bot.

Get started

Check out these guides to get started building solutions with Gemma: