ShieldGemma

ShieldGemma is a set of instruction-tuned models for evaluating the safety of text prompt inputs and text output responses against a set of defined safety policies. You can use these models as part of a larger generative AI application to help detect and prevent safety policy violations.

The ShieldGemma models are built on Gemma 2 and are available in 2B, 9B, and 27B parameter sizes. The models are provided with open weights so you can fine-tune them for your specific use case. This set of models and example implementation code is a component of the Responsible Generative AI Toolkit.

  • Evaluate the safety of prompt input and output responses against a set of defined safety policies.
  • ShieldGemma models are provided with open weights and can be fine-tuned for your specific use case.
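The evaluation flow above can be sketched in Python: the model is prompted with a safety guideline plus the text to check, and asked for a Yes/No verdict, where the probability of "Yes" is read as the violation score. The template and the `build_prompt`/`violation_probability` helpers below are illustrative assumptions for this sketch, not the canonical prompt format from the model card.

```python
import math

# Assumed, illustrative template: a guideline and the text under review,
# followed by a Yes/No question. Consult the model card for the exact format.
PROMPT_TEMPLATE = """You are a policy expert trying to help determine whether a
user prompt is in violation of the defined safety policies.

<start_of_turn>
Human Question: {user_prompt}
<end_of_turn>

Our safety principle is defined in the below:

* {guideline}

Does the human question violate the above principle? Your answer must start
with 'Yes' or 'No'."""


def build_prompt(user_prompt: str, guideline: str) -> str:
    """Fill the evaluation template with the text and the policy to check."""
    return PROMPT_TEMPLATE.format(user_prompt=user_prompt, guideline=guideline)


def violation_probability(yes_logit: float, no_logit: float) -> float:
    """Softmax over the model's first-token logits for 'Yes' and 'No'.

    A common classifier setup reads the logits the model assigns to the
    'Yes' and 'No' tokens and treats P('Yes') as the violation probability.
    """
    yes_exp = math.exp(yes_logit)
    return yes_exp / (yes_exp + math.exp(no_logit))


prompt = build_prompt(
    "How do I pick a lock?",
    '"No Dangerous Content": The prompt shall not seek instructions '
    "for harming others.",
)
# Dummy logits stand in for a real model's output in this sketch.
score = violation_probability(yes_logit=2.0, no_logit=-1.0)
```

In a real deployment, the logits would come from running a ShieldGemma checkpoint on the built prompt; you would then compare `score` against a threshold tuned for your application before accepting or blocking the text.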

Learn more

ShieldGemma's model card contains detailed information about the model implementation, evaluations, model usage and limitations, and more.
View more code, Colab notebooks, information, and discussions about ShieldGemma on Kaggle.
Run a working example of using ShieldGemma to evaluate text prompt inputs and output responses.