| | در گوگل کولب اجرا کنید | | | مشاهده منبع در گیتهاب |
Gemma خانوادهای از مدلهای سبک و پیشرفتهی روباز است که از همان تحقیقات و فناوری مورد استفاده در ساخت مدلهای Gemini ساخته شدهاند. Gemma 4 به گونهای طراحی شده است که کارآمدترین خانوادهی مدلهای روباز در جهان باشد.
این سند نحوه استفاده از قابلیتهای تفکر Gemma 4 را برای تولید فرآیندهای استدلال قبل از ارائه پاسخ نهایی نشان میدهد. شما یاد خواهید گرفت که چگونه با استفاده از کتابخانه Hugging Face transformers ، حالت تفکر را برای وظایف فقط متنی و چندوجهی (تصویر-متن) فعال کنید و چگونه خروجی را تجزیه کنید تا تفکر را از پاسخ جدا کنید.
این نوتبوک از پردازنده گرافیکی T4 بهره میبرد.
نصب بستههای پایتون
کتابخانههای Hugging Face مورد نیاز برای اجرای مدل Gemma و ارسال درخواستها را نصب کنید.
# Install PyTorch & other librariespip install torch accelerate# Install the transformers librarypip install -U transformers
مدل بار
از کتابخانههای transformers برای ایجاد یک نمونه از یک processor و model با استفاده از کلاسهای AutoProcessor و AutoModelForImageTextToText همانطور که در مثال کد زیر نشان داده شده است، استفاده کنید:
MODEL_ID = "google/gemma-4-E2B-it" # @param ["google/gemma-4-E2B-it","google/gemma-4-E4B-it", "google/gemma-4-31B-it", "google/gemma-4-26B-A4B-it"]
from transformers import AutoProcessor, AutoModelForMultimodalLM
model = AutoModelForMultimodalLM.from_pretrained(MODEL_ID, dtype="auto", device_map="auto")
processor = AutoProcessor.from_pretrained(MODEL_ID)
model.safetensors: 0%| | 0.00/10.2G [00:00<?, ?B/s] Loading weights: 0%| | 0/1951 [00:00<?, ?it/s] generation_config.json: 0%| | 0.00/208 [00:00<?, ?B/s] processor_config.json: 0.00B [00:00, ?B/s] chat_template.jinja: 0.00B [00:00, ?B/s] tokenizer_config.json: 0.00B [00:00, ?B/s] tokenizer.json: 0%| | 0.00/32.2M [00:00<?, ?B/s]
استنتاج تک متنی با تفکر
برای تولید پاسخ با استفاده از قابلیتهای تفکر مدل، enable_thinking=True را به آن ارسال کنید، پردازنده توکنهای تفکر صحیح را در اعلان وارد میکند و به مدل دستور میدهد قبل از پاسخ دادن فکر کند.
| اندازه مدل | حالت تفکر | ساختار الگو / خروجی |
|---|---|---|
| E2B/E4B | خاموش | <|turn>user\n[Prompt]<turn|>\n<|turn>model |
| E2B/E4B | روشن | <|turn>system\n<|think|><turn|>\n<|turn>user\n[Prompt]<turn|>\n<|turn>model |
| ۲۶ب/۳۱ب | خاموش | ⚠️ <|turn>user\n[Prompt]<turn|>\n<|turn>model\n<|channel>thought\n<channel|> |
| ۲۶ب/۳۱ب | روشن | <|turn>system\n<|think|><turn|>\n<|turn>user\n[Prompt]<turn|>\n<|turn>model |
from transformers import TextStreamer
message = [
{
"role": "user", "content": "What is the water formula?"
}
]
text = processor.apply_chat_template(message, tokenize=False, add_generation_prompt=True, enable_thinking=True)
inputs = processor(text=text, return_tensors="pt").to(model.device)
input_len = inputs["input_ids"].shape[-1]
streamer = TextStreamer(processor)
outputs = model.generate(**inputs, streamer=streamer, max_new_tokens=1024)
<bos><|turn>system
<|think|>
<turn|>
<|turn>user
What is the water formula?<turn|>
<|turn>model
<|channel>thought
Thinking Process:
1. **Analyze the Request:** The user is asking "What is the water formula?"
2. **Interpret Ambiguity:** The term "water formula" is highly ambiguous. It could mean several things:
* **Chemical Formula:** The molecular formula of water (\\(\text{H}_2\text{O}\\)).
* **Hydration Formula/Concept:** How water interacts with solutes or biological systems.
* **Water Cycle Formula:** A diagram or mathematical representation of evaporation, condensation, precipitation.
* **A Specific Context (e.g., cooking, chemistry lab, a specific theory):** The user might be referring to something specific that I don't have context for.
3. **Determine the Most Likely Interpretation (The Default Answer):** In a general knowledge context, when someone asks for a "formula" related to water, they usually mean the chemical structure.
4. **Formulate the Chemical Answer:**
* Chemical Name: Water
* Chemical Formula: \\(\text{H}_2\text{O}\\)
* Structure: Two hydrogen atoms bonded to one oxygen atom.
5. **Address Other Possible Interpretations (To ensure a complete answer):** Since the term is vague, I should acknowledge the different contexts.
6. **Draft the Response:**
* Start with the most direct, scientific answer (the chemical formula).
* Explain what the formula means.
* Briefly mention other related concepts (like the water cycle) if necessary, but focus on the primary meaning.
* Ask for clarification if the user meant something else.
7. **Review against Identity Constraints:** (I am Gemma 4, LLM, open weights, no tools needed). The answer is factual and does not violate any constraints.
8. **Final Output Generation.** (Self-Correction during drafting: Ensure the chemical notation is clear.)<channel|>The term "water formula" can refer to a few different things, depending on the context. Here are the most common interpretations:
### 1. Chemical Formula (The most likely answer)
If you are referring to the **chemical structure** of water (\\(\text{H}_2\text{O}\\)), the formula is:
$$\text{H}_2\text{O}$$
**What this means:**
* **H** stands for Hydrogen.
* **O** stands for Oxygen.
* The formula indicates that one molecule of water is composed of **two hydrogen atoms** chemically bonded to **one oxygen atom**.
***
### 2. The Water Cycle Formula (Scientific Process)
If you are referring to the **water cycle**, which describes the continuous movement of water on, above, and below the surface of the Earth, there isn't a single mathematical formula, but rather a series of physical processes:
* **Evaporation:** Liquid water turns into water vapor (gas) due to heat.
* **Condensation:** Water vapor cools and turns back into liquid water, forming clouds.
* **Precipitation:** Water falls back to Earth in the form of rain, snow, sleet, or hail.
* **Collection/Runoff:** Water gathers in rivers, lakes, oceans, or soaks into the ground.
***
### 3. Hydration Formula (Biology/Chemistry)
In a biological or chemical context, the "formula" might refer to how water interacts with other substances, such as:
* **Solubility:** The ability of water to dissolve other substances (often described by "like dissolves like").
* **Hydration Shells:** The arrangement of water molecules surrounding an ion or molecule.
**Could you please provide a little more context?** Knowing whether you are asking about chemistry, biology, or a specific concept will help me give you the exact formula you are looking for!<turn|>
پس از تولید متن، پاسخ شامل بلوکهای استدلال و پاسخ نهایی خواهد بود که توسط توکنهای ویژه محدود شدهاند. میتوانید از ابزار parse_response برای استخراج آسان آنها در یک دیکشنری حاوی thinking و answer استفاده کنید.
response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)
result = processor.parse_response(response)
for key, value in result.items():
if key == "role":
print(f"Role: {value}")
elif key == "thinking":
print(f"\n=== Thoughts ===\n{value}")
elif key == "content":
print(f"\n=== Answer ===\n{value}")
elif key == "tool_calls":
print(f"\n=== Tool Calls ===\n{value}")
else:
print(f"\n{key}: {value}...\n")
Role: assistant
=== Thoughts ===
Thinking Process:
1. **Analyze the Request:** The user is asking "What is the water formula?"
2. **Interpret Ambiguity:** The term "water formula" is highly ambiguous. It could mean several things:
* **Chemical Formula:** The molecular formula of water (\\(\text{H}_2\text{O}\\)).
* **Hydration Formula/Concept:** How water interacts with solutes or biological systems.
* **Water Cycle Formula:** A diagram or mathematical representation of evaporation, condensation, precipitation.
* **A Specific Context (e.g., cooking, chemistry lab, a specific theory):** The user might be referring to something specific that I don't have context for.
3. **Determine the Most Likely Interpretation (The Default Answer):** In a general knowledge context, when someone asks for a "formula" related to water, they usually mean the chemical structure.
4. **Formulate the Chemical Answer:**
* Chemical Name: Water
* Chemical Formula: \\(\text{H}_2\text{O}\\)
* Structure: Two hydrogen atoms bonded to one oxygen atom.
5. **Address Other Possible Interpretations (To ensure a complete answer):** Since the term is vague, I should acknowledge the different contexts.
6. **Draft the Response:**
* Start with the most direct, scientific answer (the chemical formula).
* Explain what the formula means.
* Briefly mention other related concepts (like the water cycle) if necessary, but focus on the primary meaning.
* Ask for clarification if the user meant something else.
7. **Review against Identity Constraints:** (I am Gemma 4, LLM, open weights, no tools needed). The answer is factual and does not violate any constraints.
8. **Final Output Generation.** (Self-Correction during drafting: Ensure the chemical notation is clear.)
=== Answer ===
The term "water formula" can refer to a few different things, depending on the context. Here are the most common interpretations:
### 1. Chemical Formula (The most likely answer)
If you are referring to the **chemical structure** of water (\\(\text{H}_2\text{O}\\)), the formula is:
$$\text{H}_2\text{O}$$
**What this means:**
* **H** stands for Hydrogen.
* **O** stands for Oxygen.
* The formula indicates that one molecule of water is composed of **two hydrogen atoms** chemically bonded to **one oxygen atom**.
***
### 2. The Water Cycle Formula (Scientific Process)
If you are referring to the **water cycle**, which describes the continuous movement of water on, above, and below the surface of the Earth, there isn't a single mathematical formula, but rather a series of physical processes:
* **Evaporation:** Liquid water turns into water vapor (gas) due to heat.
* **Condensation:** Water vapor cools and turns back into liquid water, forming clouds.
* **Precipitation:** Water falls back to Earth in the form of rain, snow, sleet, or hail.
* **Collection/Runoff:** Water gathers in rivers, lakes, oceans, or soaks into the ground.
***
### 3. Hydration Formula (Biology/Chemistry)
In a biological or chemical context, the "formula" might refer to how water interacts with other substances, such as:
* **Solubility:** The ability of water to dissolve other substances (often described by "like dissolves like").
* **Hydration Shells:** The arrangement of water molecules surrounding an ion or molecule.
**Could you please provide a little more context?** Knowing whether you are asking about chemistry, biology, or a specific concept will help me give you the exact formula you are looking for!
مثال چند نوبتی با حذف افکار
مدیریت صحیح افکار تولید شده توسط مدل برای حفظ عملکرد در مکالمات چند نوبتی بسیار مهم است.
- مکالمات چند نوبتی استاندارد: شما باید افکار تولید شده مدل را از نوبت قبلی قبل از ارسال تاریخچه مکالمه به مدل برای نوبت بعدی حذف کنید (از بین ببرید). اگر میخواهید حالت تفکر را در اواسط مکالمه غیرفعال کنید، میتوانید هنگام حذف افکار قبلی، توکن
<|think|>را حذف کنید. - فراخوانی تابع (استثنا): اگر یک نوبت مدل شامل فراخوانی تابع یا ابزار باشد، نباید بین فراخوانیهای تابع، ایدهها حذف شوند.
- نگهداری تاریخچه مکالمات: خروجی مدل تاریخچه فقط باید شامل پاسخ نهایی باشد. اطمینان حاصل کنید که هیچ فکر تولید شده از نوبتهای قبلی قبل از شروع نوبت بعدی کاربر در پنجره زمینه باقی نمانده باشد.
# Append the clean response to the message history
message.append({"role": "assistant", "content": result["content"]})
# ==========================================
# TURN 2
# ==========================================
print("\n--- Turn 2 ---")
# Add the next user query to the history
message.append({
"role": "user",
"content": "What is its boiling point in Celsius?"
})
text = processor.apply_chat_template(message, tokenize=False, add_generation_prompt=True, enable_thinking=True)
inputs = processor(text=text, return_tensors="pt").to(model.device)
input_len = inputs["input_ids"].shape[-1]
streamer = TextStreamer(processor)
outputs = model.generate(**inputs, streamer=streamer, max_new_tokens=1024)
--- Turn 2 ---
<bos><|turn>system
<|think|>
<turn|>
<|turn>user
What is the water formula?<turn|>
<|turn>model
The term "water formula" can refer to a few different things, depending on the context. Here are the most common interpretations:
### 1. Chemical Formula (The most likely answer)
If you are referring to the **chemical structure** of water (\\(\text{H}_2\text{O}\\)), the formula is:
$$\text{H}_2\text{O}$$
**What this means:**
* **H** stands for Hydrogen.
* **O** stands for Oxygen.
* The formula indicates that one molecule of water is composed of **two hydrogen atoms** chemically bonded to **one oxygen atom**.
***
### 2. The Water Cycle Formula (Scientific Process)
If you are referring to the **water cycle**, which describes the continuous movement of water on, above, and below the surface of the Earth, there isn't a single mathematical formula, but rather a series of physical processes:
* **Evaporation:** Liquid water turns into water vapor (gas) due to heat.
* **Condensation:** Water vapor cools and turns back into liquid water, forming clouds.
* **Precipitation:** Water falls back to Earth in the form of rain, snow, sleet, or hail.
* **Collection/Runoff:** Water gathers in rivers, lakes, oceans, or soaks into the ground.
***
### 3. Hydration Formula (Biology/Chemistry)
In a biological or chemical context, the "formula" might refer to how water interacts with other substances, such as:
* **Solubility:** The ability of water to dissolve other substances (often described by "like dissolves like").
* **Hydration Shells:** The arrangement of water molecules surrounding an ion or molecule.
**Could you please provide a little more context?** Knowing whether you are asking about chemistry, biology, or a specific concept will help me give you the exact formula you are looking for!<turn|>
<|turn>user
What is its boiling point in Celsius?<turn|>
<|turn>model
<|channel>thought
Thinking Process:
1. **Identify the core question:** The user is asking for the boiling point of water in Celsius.
2. **Recall/Verify the standard scientific fact:** The standard boiling point of pure water at standard atmospheric pressure is a well-known physical constant.
3. **State the answer clearly:** The boiling point of water is \\(100^\circ\text{C}\\).
4. **Add necessary context/caveats (Crucial step for accuracy):** Boiling point is dependent on pressure. I must mention this to provide a complete and scientifically accurate answer.
* What happens at different altitudes (lower pressure)? (Boiling point drops).
* What happens under high pressure? (Boiling point increases).
5. **Final Review:** The answer is direct, accurate, and provides essential context. (Self-correction: Ensure the units are explicitly stated as Celsius.)<channel|>The boiling point of pure water at standard atmospheric pressure (sea level) is:
$$\mathbf{100^\circ\text{C} }$$ (Degrees Celsius)
**Important Note:**
The boiling point of water is dependent on **pressure**.
* **Higher Pressure:** If the pressure above the water increases (e.g., in a pressure cooker), the boiling point will be **higher** than \\(100^\circ\text{C}\\).
* **Lower Pressure:** If the pressure decreases (e.g., at high altitudes), the boiling point will be **lower** than \\(100^\circ\text{C}\\).<turn|>
response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)
result = processor.parse_response(response)
for key, value in result.items():
if key == "role":
print(f"Role: {value}")
elif key == "thinking":
print(f"\n=== Thoughts ===\n{value}")
elif key == "content":
print(f"\n=== Answer ===\n{value}")
elif key == "tool_calls":
print(f"\n=== Tool Calls ===\n{value}")
else:
print(f"\n{key}: {value}...\n")
Role: assistant
=== Thoughts ===
Thinking Process:
1. **Identify the core question:** The user is asking for the boiling point of water in Celsius.
2. **Recall/Verify the standard scientific fact:** The standard boiling point of pure water at standard atmospheric pressure is a well-known physical constant.
3. **State the answer clearly:** The boiling point of water is \\(100^\circ\text{C}\\).
4. **Add necessary context/caveats (Crucial step for accuracy):** Boiling point is dependent on pressure. I must mention this to provide a complete and scientifically accurate answer.
* What happens at different altitudes (lower pressure)? (Boiling point drops).
* What happens under high pressure? (Boiling point increases).
5. **Final Review:** The answer is direct, accurate, and provides essential context. (Self-correction: Ensure the units are explicitly stated as Celsius.)
=== Answer ===
The boiling point of pure water at standard atmospheric pressure (sea level) is:
$$\mathbf{100^\circ\text{C} }$$ (Degrees Celsius)
**Important Note:**
The boiling point of water is dependent on **pressure**.
* **Higher Pressure:** If the pressure above the water increases (e.g., in a pressure cooker), the boiling point will be **higher** than \\(100^\circ\text{C}\\).
* **Lower Pressure:** If the pressure decreases (e.g., at high altitudes), the boiling point will be **lower** than \\(100^\circ\text{C}\\).
استنتاج تک تصویری
روش استفاده از مدل تفکر با دادههای بصری بسیار مشابه است. میتوانید یک تصویر را به عنوان بخشی از آرایه messages ارائه دهید. فقط مطمئن شوید که تصویر را به همراه متن قالببندی شده به پردازنده ارسال میکنید و مدل قبل از پاسخ دادن، ورودی بصری را استدلال میکند.
from PIL import Image
import matplotlib.pyplot as plt
prompt = "What is shown in this image?"
image_url = "https://raw.githubusercontent.com/google-gemma/cookbook/refs/heads/main/apps/sample-data/GoldenGate.png"
# download image
!wget -q {image_url} -O image.png
image = Image.open("image.png")
# Display all images
print("=== Downloaded image ===")
fig, ax = plt.subplots(1, 1, figsize=(5, 5))
ax.imshow(image)
ax.set_title("Image 1")
ax.axis("off")
plt.tight_layout()
plt.show()
message = [
{
"role": "user", "content": [
{"type": "image"},
{"type": "text", "text": prompt}
]
}
]
text = processor.apply_chat_template(message, tokenize=False, add_generation_prompt=True, enable_thinking=True)
inputs = processor(text=text, images=image, return_tensors="pt").to(model.device)
input_len = inputs["input_ids"].shape[-1]
outputs = model.generate(**inputs, max_new_tokens=1024)
response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)
result = processor.parse_response(response)
for key, value in result.items():
if key == "role":
print(f"Role: {value}")
elif key == "thinking":
print(f"\n=== Thoughts ===\n{value}")
elif key == "content":
print(f"\n=== Answer ===\n{value}")
elif key == "tool_calls":
print(f"\n=== Tool Calls ===\n{value}")
else:
print(f"\n{key}: {value}...\n")
=== Downloaded image ===

Role: assistant
=== Thoughts ===
Here's a thinking process to arrive at the suggested answer:
1. **Analyze the Image:**
* **Dominant Structure:** The most prominent feature is a massive suspension bridge with tall red towers (pylons). This is immediately recognizable as the Golden Gate Bridge.
* **Foreground/Midground:** There is water (looks like the San Francisco Bay/Pacific Ocean). There is a breakwater/rocky shoreline in the immediate foreground. A small rock formation is visible in the water.
* **Background/Landmass:** Hills/mountains are visible in the distance, behind the bridge structure.
* **Other Elements:** There is a low-lying building/structure on the left (part of the bridge infrastructure or a related facility). The sky is clear and blue.
2. **Identify Key Subject:** The central subject is the Golden Gate Bridge.
3. **Determine Setting/Location (Contextualization):** Since it's the Golden Gate Bridge, the location is San Francisco, California.
4. **Synthesize the Description (Drafting the Answer):** Start with the main subject and then add details about the setting.
* *Initial thought:* It's a picture of the Golden Gate Bridge over the water.
* *Refinement (Adding detail):* It shows the iconic red suspension bridge spanning a body of water, with land/hills in the background.
5. **Review the Provided Text (Self-Correction/Verification):** The prompt includes some garbled text ("ołat" at the end), but the visual evidence is overwhelming. The task is simply to describe the image.
6. **Final Output Generation:** (This leads to the structured, informative response.) (The final response should clearly identify the landmark.)
=== Answer ===
The image shows the **Golden Gate Bridge** in San Francisco, California.
Key elements visible in the picture are:
* **The Golden Gate Bridge:** The iconic red suspension bridge dominates the frame, spanning a large body of water.
* **Water:** The foreground features dark blue water, likely the Pacific Ocean or the San Francisco Bay.
* **Landmass/Hills:** Hills and mountains are visible in the background behind the bridge.
* **Foreground Detail:** There is a rocky shoreline and a small rock formation in the water in the immediate foreground.
In summary, it is a scenic view of the famous Golden Gate Bridge.
خلاصه و مراحل بعدی
در این راهنما، شما یاد گرفتید که چگونه از قابلیتهای تفکر مدلهای Gemma 4 برای تولید فرآیندهای استدلال قبل از پاسخهای نهایی استفاده کنید. شما موارد زیر را پوشش دادید:
- فعال کردن حالت تفکر با استفاده از
enable_thinking=Trueدرapply_chat_template. - استفاده از
TextStreamerبرای مشاهده فرآیند تفکر در زمان واقعی. - تجزیه خروجی ترکیبی به بلوکهای
thinkingوanswerجداگانه با استفاده ازparse_response. - به کارگیری قابلیتهای تفکر در وظایف چندوجهی (تصویر + متن).
مراحل بعدی
قابلیتهای بیشتر Gemma 4 را بررسی کنید:
در گوگل کولب اجرا کنید
مشاهده منبع در گیتهاب