Inferensi Teks Dasar Gemma

Lihat di ai.google.dev Jalankan di Google Colab Jalankan di Kaggle Buka di Vertex AI Lihat sumber di GitHub

Gemma adalah sekumpulan model terbuka yang ringan dan canggih, dibangun dari riset dan teknologi yang digunakan untuk membuat model Gemini. Gemma 4 dirancang untuk menjadi rangkaian model dengan bobot terbuka paling efisien di dunia.

Dokumen ini memberikan panduan untuk melakukan inferensi teks dasar dengan Gemma 4 menggunakan library transformers Hugging Face. Panduan ini mencakup penyiapan lingkungan, pemuatan model, dan berbagai skenario pembuatan teks, termasuk perintah sekali putaran, percakapan multi-putaran terstruktur, dan penerapan petunjuk sistem.

Notebook ini akan berjalan di GPU T4.

Menginstal paket Python

Instal library Hugging Face yang diperlukan untuk menjalankan model Gemma dan membuat permintaan.

# Install PyTorch & other libraries
pip install torch accelerate

# Install the transformers library
pip install transformers

Dialog adalah library untuk memanipulasi dan menampilkan percakapan.

pip install dialog

Memuat Model

Menggunakan library transformers untuk memuat pipeline

MODEL_ID = "google/gemma-4-E2B-it" # @param ["google/gemma-4-E2B-it","google/gemma-4-E4B-it", "google/gemma-4-31B-it", "google/gemma-4-26B-A4B-it"]

from transformers import pipeline

txt_pipe = pipeline(
    task="text-generation",
    model=MODEL_ID,
    device_map="auto",
    dtype="auto"
)
Loading weights:   0%|          | 0/2011 [00:00<?, ?it/s]

Menjalankan pembuatan teks

Setelah model Gemma dimuat dan dikonfigurasi dalam objek pipeline, Anda dapat mengirimkan perintah ke model. Contoh kode berikut menunjukkan permintaan dasar menggunakan parameter text_inputs:

output = txt_pipe(text_inputs="<|turn>user\nRoses are..<turn|>\n<|turn>model\n")
print(output[0]['generated_text'])
Both `max_new_tokens` (=256) and `max_length`(=20) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)
<|turn>user
Roses are..<turn|>
<|turn>model
Here are a few ways to complete the phrase "Roses are...":

**Classic/Poetic:**

* **Roses are red.** (The most famous completion, though it usually goes "Roses are red, Violets are blue.")
* **Roses are beautiful.**
* **Roses are fragrant.**

**Simple/Direct:**

* **Roses are lovely.**
* **Roses are soft.**

**If you want a specific tone, let me know! 😊**

Menggunakan library Dialog

import dialog
from transformers import GenerationConfig
config = GenerationConfig.from_pretrained(MODEL_ID)
config.max_new_tokens = 512

conv = dialog.Conversation(
    dialog.User("Roses are...")
)
output = txt_pipe(text_inputs=conv.as_text(), return_full_text=False, generation_config=config)
conv += dialog.Model(output[0]['generated_text'])

print(conv.as_text())
conv.show()
<|turn>user
Roses are...<turn|>
<|turn>model
Here are a few ways to complete the phrase "Roses are...":

**Focusing on their beauty:**

* **Roses are beautiful.**
* **Roses are gorgeous.**

**Focusing on their scent:**

* **Roses are fragrant.**
* **Roses are sweet-smelling.**

**Focusing on their symbolism (if you want a deeper meaning):**

* **Roses are love.**
* **Roses are romantic.**

**Focusing on a general observation:**

* **Roses are lovely.**
* **Roses are wonderful.**

**Which completion do you like best, or were you thinking of a specific meaning?**
<dialog._src.widget.Conversation object at 0x7f1bb1a5d8b0>

Menggunakan template perintah

Saat membuat konten dengan perintah yang lebih kompleks, gunakan template perintah untuk menyusun permintaan Anda. Template perintah memungkinkan Anda menentukan input dari peran tertentu, seperti user atau model, dan merupakan format yang diperlukan untuk mengelola interaksi chat multi-giliran dengan model Gemma. Contoh kode berikut menunjukkan cara membuat template perintah untuk Gemma:

from transformers import GenerationConfig
config = GenerationConfig.from_pretrained(MODEL_ID)
config.max_new_tokens = 512

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Write a short poem about the Kraken."},
        ]
    }
]

output = txt_pipe(messages, return_full_text=False, generation_config=config)
print(output[0]['generated_text'])
From sunless depths, a shadow stirs,
Where ocean's crushing silence blurs.
A titan sleeps in inky night,
With tentacles of dreadful might.

A hundred arms, a crushing hold,
A legend whispered, ages old.
The deep's dark king, a monstrous grace,
The Kraken claims its watery space.

Percakapan multi-giliran

Dalam penyiapan multi-turn, histori percakapan dipertahankan sebagai urutan peran user dan model yang bergantian. Daftar kumulatif ini berfungsi sebagai memori model, memastikan bahwa setiap output baru dipengaruhi oleh dialog sebelumnya.

import dialog
from transformers import GenerationConfig
config = GenerationConfig.from_pretrained(MODEL_ID)
config.max_new_tokens = 512

# User turn #1
conv = dialog.Conversation(
    dialog.User("Write a short poem about the Kraken.")
)

# Model response #1
output = txt_pipe(text_inputs=conv.as_text(), return_full_text=False, generation_config=config)
conv += dialog.Model(output[0]['generated_text'])

# User turn #2
conv += dialog.User("Now with the Siren.")

# Model response #2
output = txt_pipe(text_inputs=conv.as_text(), return_full_text=False, generation_config=config)
conv += dialog.Model(output[0]['generated_text'])

print(conv.as_text())
conv.show()
<|turn>user
Write a short poem about the Kraken.<turn|>
<|turn>model
In depths where sunlight fades,
A monstrous shadow plays.
The Kraken wakes, with churning tide,
A living horror, bold and wide.<turn|>
<|turn>user
Now with the Siren.<turn|>
<|turn>model
Where coral gardens sleep,
And ocean secrets keep,
The Siren calls, with liquid grace,
A haunting melody in place.
<dialog._src.widget.Conversation object at 0x7f1bac3733b0>

Berikut percakapan yang diekspor sebagai teks.

chat_history = conv.as_text(training=True)
print(chat_history)
print("-"*80)

# display as Conversation widget
chat_history
<|turn>user
Write a short poem about the Kraken.<turn|>
<|turn>model
In depths where sunlight fades,
A monstrous shadow plays.
The Kraken wakes, with churning tide,
A living horror, bold and wide.<turn|>
<|turn>user
Now with the Siren.<turn|>
<|turn>model
Where coral gardens sleep,
And ocean secrets keep,
The Siren calls, with liquid grace,
A haunting melody in place.<turn|>
--------------------------------------------------------------------------------
<dialog._src.widget.ConversationStr object at 0x7f1bb07fa1b0>

Petunjuk sistem

Gunakan peran system untuk memberikan petunjuk tingkat sistem.

import dialog
from transformers import GenerationConfig
config = GenerationConfig.from_pretrained(MODEL_ID)
config.max_new_tokens = 512

conv = dialog.Conversation(
    dialog.System("Speak like a pirate."),
    dialog.User("Why is the sky blue?")
)

output = txt_pipe(text_inputs=conv.as_text(), return_full_text=False, generation_config=config)
conv += dialog.Model(output[0]['generated_text'])

print(conv.as_text())
conv.show()
<|turn>system
Speak like a pirate.<turn|>
<|turn>user
Why is the sky blue?<turn|>
<|turn>model
Ahoy there! Why is the sky blue, ye ask? It be down to the way the sun's light dances through the air!

See, the sunlight we get from the sun ain't just one color; it's a whole spectrum of colors, like a treasure chest filled with all the hues of the rainbow!

Now, the Earth is surrounded by the air, and that air is full of tiny, invisible bits of gas. When the sunlight hits these gas molecules, something magical happens. The colors in that sunlight get scattered all around in every direction!

The blue light, and other colors, get scattered more easily by these air molecules than the other colors. So, when you look up at the sky, your eyes catch all that scattered blue light coming from every direction, and **that's what makes the sky appear blue to us!**

It's a grand display of physics and light, savvy? Now, hoist the colors and enjoy the view!
<dialog._src.widget.Conversation object at 0x7f1bac370110>

Ringkasan dan langkah berikutnya

Dalam panduan ini, Anda telah mempelajari cara melakukan inferensi teks dasar dengan Gemma 4 menggunakan library transformers Hugging Face. Anda telah mempelajari:

  • Menyiapkan lingkungan dan menginstal dependensi.
  • Memuat model menggunakan abstraksi pipeline.
  • Menjalankan pembuatan teks dasar.
  • Menggunakan library dialog untuk tracking percakapan.
  • Menerapkan percakapan bolak-balik dan menerapkan petunjuk sistem.

Langkah Berikutnya