Gemma 4 has been released, with support for text, audio, and image input and a long context window of up to 256,000 tokens.

Gemma Basic Text Inference

Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Gemma 4 is the world's most efficient family of open-weight models.

This document describes how to perform basic text inference with Gemma 4 using the Hugging Face transformers library. It covers environment setup, model loading, and several text generation scenarios, including single-turn prompts, structured multi-turn conversations, and the use of system instructions.

This notebook runs on a T4 GPU.

Install Python packages

Install the Hugging Face libraries required to run the Gemma model and send requests to it.

# Install PyTorch & other libraries
pip install torch accelerate

# Install the transformers library
pip install transformers

Dialog is a library for manipulating and displaying conversations.

pip install dialog
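
Before loading a multi-gigabyte model, it can help to confirm that the installs above succeeded. The following is a minimal sketch, not part of the original notebook: the helper name `missing_packages` is hypothetical, and it assumes each package's import name matches the pip name used above.

```python
import importlib.util

# Import names of the packages installed above (assumed to match their pip names).
REQUIRED = ["torch", "accelerate", "transformers", "dialog"]

def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

An empty result only shows that the packages can be located; it does not verify versions or GPU support.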

Load the model

Use the transformers library to load the pipeline.

MODEL_ID = "google/gemma-4-E2B-it" # @param ["google/gemma-4-E2B-it","google/gemma-4-E4B-it", "google/gemma-4-31B-it", "google/gemma-4-26B-A4B-it"]

from transformers import pipeline

txt_pipe = pipeline(
    task="text-generation",
    model=MODEL_ID,
    device_map="auto",
    dtype="auto"
)

Generate text

Once the Gemma model is loaded and configured in a pipeline object, you can send prompts to it. The following example code shows a basic request using the text_inputs parameter:

output = txt_pipe(text_inputs="<|turn>user\nRoses are..<turn|>\n<|turn>model\n")
print(output[0]['generated_text'])

Both `max_new_tokens` (=256) and `max_length`(=20) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)
<|turn>user
Roses are..<turn|>
<|turn>model
Here are a few ways to complete the phrase "Roses are...":

**Classic/Poetic:**

* **Roses are red.** (The most famous completion, though it usually goes "Roses are red, Violets are blue.")
* **Roses are beautiful.**
* **Roses are fragrant.**

**Simple/Direct:**

* **Roses are lovely.**
* **Roses are soft.**

**If you want a specific tone, let me know! 😊**
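
Note that the printed result above echoes the prompt before the model's answer. A minimal sketch of separating the two is shown here; the helper name `extract_reply` is hypothetical, and the completion text is made up for illustration rather than taken from a real model run.

```python
def extract_reply(generated_text, prompt):
    """Return only the newly generated part when the output echoes the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):]
    return generated_text

prompt = "<|turn>user\nRoses are..<turn|>\n<|turn>model\n"

# Stand-in for output[0]['generated_text']: the prompt plus a made-up completion.
full_text = prompt + "Roses are red."

print(extract_reply(full_text, prompt))
```

The later examples in this document get the same effect by passing return_full_text=False to the pipeline, which is usually the simpler choice.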

Use the dialog library

import dialog
from transformers import GenerationConfig
config = GenerationConfig.from_pretrained(MODEL_ID)
config.max_new_tokens = 512

conv = dialog.Conversation(
    dialog.User("Roses are...")
)
output = txt_pipe(text_inputs=conv.as_text(), return_full_text=False, generation_config=config)
conv += dialog.Model(output[0]['generated_text'])

print(conv.as_text())
conv.show()

<|turn>user
Roses are...<turn|>
<|turn>model
Here are a few ways to complete the phrase "Roses are...":

**Focusing on their beauty:**

* **Roses are beautiful.**
* **Roses are gorgeous.**

**Focusing on their scent:**

* **Roses are fragrant.**
* **Roses are sweet-smelling.**

**Focusing on their symbolism (if you want a deeper meaning):**

* **Roses are love.**
* **Roses are romantic.**

**Focusing on a general observation:**

* **Roses are lovely.**
* **Roses are wonderful.**

**Which completion do you like best, or were you thinking of a specific meaning?**
<dialog._src.widget.Conversation object at 0x7f1bb1a5d8b0>

Use a prompt template

When generating content with more complex prompts, use a prompt template to structure your request. A prompt template lets you specify input from particular roles, such as user or model, and it is the required format for managing multi-turn chat interactions with Gemma models. The following example shows how to build a prompt template for Gemma:

from transformers import GenerationConfig
config = GenerationConfig.from_pretrained(MODEL_ID)
config.max_new_tokens = 512

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Write a short poem about the Kraken."},
        ]
    }
]

output = txt_pipe(messages, return_full_text=False, generation_config=config)
print(output[0]['generated_text'])

From sunless depths, a shadow stirs,
Where ocean's crushing silence blurs.
A titan sleeps in inky night,
With tentacles of dreadful might.

A hundred arms, a crushing hold,
A legend whispered, ages old.
The deep's dark king, a monstrous grace,
The Kraken claims its watery space.
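
The message structure used above repeats the same nesting for every turn, so a small builder can reduce typing errors. This is a sketch only: `text_message` is a hypothetical helper, not part of transformers, and it covers text-only content.

```python
def text_message(role, text):
    """Build one chat message in the nested role/content structure shown above."""
    return {
        "role": role,
        "content": [
            {"type": "text", "text": text},
        ],
    }

messages = [text_message("user", "Write a short poem about the Kraken.")]
print(messages)
```

The resulting list has the same shape as the hand-written messages list and could be passed to the pipeline in its place.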

Multi-Turn Conversation

In a multi-turn setup, the conversation history is kept as a sequence of alternating user and model roles. This cumulative list serves as the model's memory, so that each new output builds on the preceding dialog.

import dialog
from transformers import GenerationConfig
config = GenerationConfig.from_pretrained(MODEL_ID)
config.max_new_tokens = 512

# User turn #1
conv = dialog.Conversation(
    dialog.User("Write a short poem about the Kraken.")
)

# Model response #1
output = txt_pipe(text_inputs=conv.as_text(), return_full_text=False, generation_config=config)
conv += dialog.Model(output[0]['generated_text'])

# User turn #2
conv += dialog.User("Now with the Siren.")

# Model response #2
output = txt_pipe(text_inputs=conv.as_text(), return_full_text=False, generation_config=config)
conv += dialog.Model(output[0]['generated_text'])

print(conv.as_text())
conv.show()

<|turn>user
Write a short poem about the Kraken.<turn|>
<|turn>model
In depths where sunlight fades,
A monstrous shadow plays.
The Kraken wakes, with churning tide,
A living horror, bold and wide.<turn|>
<|turn>user
Now with the Siren.<turn|>
<|turn>model
Where coral gardens sleep,
And ocean secrets keep,
The Siren calls, with liquid grace,
A haunting melody in place.
<dialog._src.widget.Conversation object at 0x7f1bac3733b0>
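
The same cumulative-history pattern can be sketched without any library. In the sketch below, `chat_turn` and `fake_generate` are hypothetical names; `fake_generate` is a stand-in so the example runs without a model, and in practice that callable would wrap a call to the pipeline.

```python
def chat_turn(history, user_text, generate):
    """Append a user turn, generate from the full history, then append the reply."""
    history.append({"role": "user", "text": user_text})
    reply = generate(history)  # the generator always sees every earlier turn
    history.append({"role": "model", "text": reply})
    return reply

def fake_generate(history):
    """Stand-in generator: numbers its replies instead of calling a model."""
    return f"[reply #{(len(history) + 1) // 2}]"

history = []
chat_turn(history, "Write a short poem about the Kraken.", fake_generate)
chat_turn(history, "Now with the Siren.", fake_generate)

for turn in history:
    print(f"{turn['role']}: {turn['text']}")
```

Because the whole list is passed on every call, the second request ("Now with the Siren.") can be interpreted in light of the first one.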

Here is the conversation exported as text.

Note: Setting training=True assumes that the conversation is the complete example, so the exported text always ends with <turn|>.

chat_history = conv.as_text(training=True)
print(chat_history)
print("-"*80)

# display as Conversation widget
chat_history

<|turn>user
Write a short poem about the Kraken.<turn|>
<|turn>model
In depths where sunlight fades,
A monstrous shadow plays.
The Kraken wakes, with churning tide,
A living horror, bold and wide.<turn|>
<|turn>user
Now with the Siren.<turn|>
<|turn>model
Where coral gardens sleep,
And ocean secrets keep,
The Siren calls, with liquid grace,
A haunting melody in place.<turn|>
--------------------------------------------------------------------------------
<dialog._src.widget.ConversationStr object at 0x7f1bb07fa1b0>
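
To make the difference between the two exports concrete, the sketch below imitates only the behavior visible in the printed outputs of this document. It is not the dialog library's implementation, and the function name `render` is hypothetical.

```python
def render(turns, training=False):
    """Render (role, text) pairs with the turn markers seen in the outputs above."""
    chunks = []
    last = len(turns) - 1
    for i, (role, text) in enumerate(turns):
        # Outside training mode, a final model turn is left without a closing marker.
        open_ended = (not training) and i == last and role == "model"
        chunks.append(f"<|turn>{role}\n{text}" + ("" if open_ended else "<turn|>"))
    rendered = "\n".join(chunks)
    # Outside training mode, a pending user turn is followed by an open model turn.
    if not training and turns and turns[-1][0] != "model":
        rendered += "\n<|turn>model\n"
    return rendered

turns = [("user", "Now with the Siren."), ("model", "Where coral gardens sleep,")]
print(render(turns))
print("-" * 80)
print(render(turns, training=True))
```

The only difference between the two printed versions is the final <turn|>, which marks the last model turn as finished.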

System instructions

Use the system role to provide system-level instructions.

import dialog
from transformers import GenerationConfig
config = GenerationConfig.from_pretrained(MODEL_ID)
config.max_new_tokens = 512

conv = dialog.Conversation(
    dialog.System("Speak like a pirate."),
    dialog.User("Why is the sky blue?")
)

output = txt_pipe(text_inputs=conv.as_text(), return_full_text=False, generation_config=config)
conv += dialog.Model(output[0]['generated_text'])

print(conv.as_text())
conv.show()

<|turn>system
Speak like a pirate.<turn|>
<|turn>user
Why is the sky blue?<turn|>
<|turn>model
Ahoy there! Why is the sky blue, ye ask? It be down to the way the sun's light dances through the air!

See, the sunlight we get from the sun ain't just one color; it's a whole spectrum of colors, like a treasure chest filled with all the hues of the rainbow!

Now, the Earth is surrounded by the air, and that air is full of tiny, invisible bits of gas. When the sunlight hits these gas molecules, something magical happens. The colors in that sunlight get scattered all around in every direction!

The blue light, and other colors, get scattered more easily by these air molecules than the other colors. So, when you look up at the sky, your eyes catch all that scattered blue light coming from every direction, and **that's what makes the sky appear blue to us!**

It's a grand display of physics and light, savvy? Now, hoist the colors and enjoy the view!
<dialog._src.widget.Conversation object at 0x7f1bac370110>
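
A system instruction can also be expressed in the list-of-messages format used in the prompt template section. The sketch below assumes that the pipeline accepts a "system" role in that format, which this document does not demonstrate; `with_system` is a hypothetical helper.

```python
def with_system(instruction, messages):
    """Return a new message list that starts with exactly one system message."""
    system = {"role": "system", "content": [{"type": "text", "text": instruction}]}
    # Drop any earlier system message so the instruction is not duplicated.
    rest = [m for m in messages if m["role"] != "system"]
    return [system] + rest

user_messages = [
    {"role": "user", "content": [{"type": "text", "text": "Why is the sky blue?"}]},
]
messages = with_system("Speak like a pirate.", user_messages)

for message in messages:
    print(message["role"], "->", message["content"][0]["text"])
```

Placing the system message first mirrors the order of the rendered conversation above, where the system turn precedes the user turn.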

Summary and next steps

In this guide, you learned how to perform basic text inference with Gemma 4 using the Hugging Face transformers library. The guide covered:

Setting up the environment and installing dependencies.
Loading the model with the pipeline abstraction.
Basic text generation.
Using the dialog library for conversation tracking.
Implementing multi-turn conversations and applying system instructions.
