Gemma Scope

Analyze Gemma 3 with Gemma Scope 2

Gemma Scope 2 is a comprehensive, open suite of interpretability tools for the Gemma 3 model collection. It lets researchers examine the behavior of individual layers, analyze complex language model behaviors, and debug emergent behaviors such as jailbreaks or hallucinations.

This toolkit acts as a microscope for the model, providing Sparse Autoencoders (SAEs) and Transcoders trained on every layer of the Gemma 3 family.
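
To make the idea of a sparse autoencoder concrete, here is a minimal, dependency-free sketch of the general technique: an activation vector is encoded into a wider, mostly-zero feature vector and then decoded back. This is not the Gemma Scope 2 implementation or API. The class name, method names, dimensions, random weights, and the plain ReLU activation are all illustrative assumptions, and the released SAEs may use a different activation and training setup. A transcoder has the same encode/decode shape, but is trained to predict a layer's output from its input rather than to reconstruct its own input.

```python
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class SparseAutoencoder:
    """Toy SAE with hypothetical names; weights are random, not trained."""

    def __init__(self, d_model, d_features, seed=0):
        rng = random.Random(seed)
        # Encoder maps d_model -> d_features (typically d_features >> d_model).
        self.w_enc = [[rng.gauss(0.0, 0.5) for _ in range(d_model)]
                      for _ in range(d_features)]
        self.b_enc = [0.0] * d_features
        # Decoder maps d_features -> d_model.
        self.w_dec = [[rng.gauss(0.0, 0.5) for _ in range(d_features)]
                      for _ in range(d_model)]
        self.b_dec = [0.0] * d_model

    def encode(self, activation):
        """Return non-negative feature activations (ReLU is an assumption)."""
        pre = matvec(self.w_enc, activation)
        return [max(0.0, p + b) for p, b in zip(pre, self.b_enc)]

    def decode(self, features):
        """Reconstruct the activation as a weighted sum of decoder directions."""
        recon = matvec(self.w_dec, features)
        return [r + b for r, b in zip(recon, self.b_dec)]


def top_features(features, k):
    """Indices of the k most active features, strongest first."""
    order = sorted(range(len(features)), key=lambda i: features[i], reverse=True)
    return order[:k]


if __name__ == "__main__":
    sae = SparseAutoencoder(d_model=4, d_features=16)
    activation = [1.0, -0.5, 0.25, 2.0]  # stand-in for one token's activation
    features = sae.encode(activation)
    print("most active features:", top_features(features, 3))
    print("reconstruction:", sae.decode(features))
```

In a trained SAE, the interesting step is inspecting which feature indices fire on which inputs; `top_features` stands in for that inspection here.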

Looking for the previous version?
The original Gemma Scope remains available for researchers working with the Gemma 2 family of models.

  • Use SAEs and Transcoders to analyze complex internal behaviors and multi-step algorithms in Gemma 3.
  • Analyze specific chat behaviors, refusal mechanisms, and chain-of-thought faithfulness to build safer AI agents.

Learn more

Read about the new architecture, training data, and capabilities of Gemma Scope 2.
Access the weights, code, and documentation for the Gemma 3 interpretability suite.
Try the interactive tutorial to visualize features and modify model behavior.
Access the blog and resources for the original Gemma Scope for Gemma 2.