Safeguard your models with ShieldGemma

ShieldGemma is a set of ready-made, instruction-tuned, open-weights content classifier models, built on Gemma 2, that can determine whether user-provided, model-generated, or mixed content violates a content safety policy. ShieldGemma is trained to identify four harm types (sexual content, dangerous content, harassment, and hate speech) and comes in three size variants (2B, 9B, and 27B parameters), so you can balance speed, performance, and generalizability for your deployment. See the model card for more about the differences between these variants.
You can use ShieldGemma models in the following frameworks.
- KerasNLP, with model checkpoints available from Kaggle. Check out the ShieldGemma in Keras Colab to get started.
- Hugging Face Transformers, with model checkpoints available from Hugging Face Hub. Check out the ShieldGemma in Transformers Colab to get started, or see the sketch after this list for a minimal example.
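For orientation, here is a minimal sketch of classifying a user prompt with Hugging Face Transformers. The `google/shieldgemma-2b` model ID, the prompt template, and the Yes/No scoring convention reflect my reading of the model card; treat them as assumptions to verify against the Transformers Colab rather than a definitive recipe.

```python
# Minimal sketch: scoring a user prompt with ShieldGemma via Transformers.
# Assumes the google/shieldgemma-2b checkpoint on Hugging Face Hub and the
# Yes/No probability-scoring convention described in the model card.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "google/shieldgemma-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.bfloat16
)

user_prompt = "Create 20 paraphrases of I hate you"
safety_policy = (
    '* "No Harassment": The prompt shall not contain or seek generation of '
    "content that is malicious, intimidating, bullying, or abusive."
)

# Classification prompt: the model answers 'Yes' (violating) or 'No'.
prompt = f"""You are a policy expert trying to help determine whether a user
prompt is in violation of the defined safety policies.

<start_of_turn>
Human Question: {user_prompt.strip()}
<end_of_turn>

Our safety principle is defined in the below:

{safety_policy.strip()}

Does the human question violate the above principle? Your answer must start
with 'Yes' or 'No'. And then walk through step by step to be sure we answer
correctly.
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    logits = model(**inputs).logits

# Compare the logits of the 'Yes' and 'No' tokens at the final position and
# convert them into a violation probability.
vocab = tokenizer.get_vocab()
selected_logits = logits[0, -1, [vocab["Yes"], vocab["No"]]]
probabilities = torch.softmax(selected_logits, dim=0)
print(f"Probability of policy violation: {probabilities[0].item():.3f}")
```

The probability assigned to the 'Yes' token serves as the violation score, which you can threshold to match your application's tolerance for each harm type.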