حالت تفکر در جما

مشاهده در ai.google.dev در گوگل کولب اجرا کنید دویدن در کاگل باز کردن در Vertex AI مشاهده منبع در گیت‌هاب

Gemma خانواده‌ای از مدل‌های سبک و پیشرفته‌ی روباز است که از همان تحقیقات و فناوری مورد استفاده در ساخت مدل‌های Gemini ساخته شده‌اند. Gemma 4 به گونه‌ای طراحی شده است که کارآمدترین خانواده‌ی مدل‌های روباز در جهان باشد.

این سند نحوه استفاده از قابلیت‌های تفکر Gemma 4 را برای تولید فرآیندهای استدلال قبل از ارائه پاسخ نهایی نشان می‌دهد. شما یاد خواهید گرفت که چگونه با استفاده از کتابخانه Hugging Face transformers ، حالت تفکر را برای وظایف فقط متنی و چندوجهی (تصویر-متن) فعال کنید و چگونه خروجی را تجزیه کنید تا تفکر را از پاسخ جدا کنید.

این نوت‌بوک از پردازنده گرافیکی T4 بهره می‌برد.

نصب بسته‌های پایتون

کتابخانه‌های Hugging Face مورد نیاز برای اجرای مدل Gemma و ارسال درخواست‌ها را نصب کنید.

# Install PyTorch & other libraries
pip install torch accelerate

# Install the transformers library
pip install -U transformers

مدل بار

از کتابخانه‌های transformers برای ایجاد یک نمونه از یک processor و model با استفاده از کلاس‌های AutoProcessor و AutoModelForImageTextToText همانطور که در مثال کد زیر نشان داده شده است، استفاده کنید:

MODEL_ID = "google/gemma-4-E2B-it" # @param ["google/gemma-4-E2B-it","google/gemma-4-E4B-it", "google/gemma-4-31B-it", "google/gemma-4-26B-A4B-it"]

from transformers import AutoProcessor, AutoModelForMultimodalLM

model = AutoModelForMultimodalLM.from_pretrained(MODEL_ID, dtype="auto", device_map="auto")
processor = AutoProcessor.from_pretrained(MODEL_ID)
model.safetensors:   0%|          | 0.00/10.2G [00:00<?, ?B/s]
Loading weights:   0%|          | 0/1951 [00:00<?, ?it/s]
generation_config.json:   0%|          | 0.00/208 [00:00<?, ?B/s]
processor_config.json: 0.00B [00:00, ?B/s]
chat_template.jinja: 0.00B [00:00, ?B/s]
tokenizer_config.json: 0.00B [00:00, ?B/s]
tokenizer.json:   0%|          | 0.00/32.2M [00:00<?, ?B/s]

استنتاج تک متنی با تفکر

برای تولید پاسخ با استفاده از قابلیت‌های تفکر مدل، enable_thinking=True را به آن ارسال کنید، پردازنده توکن‌های تفکر صحیح را در اعلان وارد می‌کند و به مدل دستور می‌دهد قبل از پاسخ دادن فکر کند.

اندازه مدل حالت تفکر ساختار الگو / خروجی
E2B/E4B خاموش <|turn>user\n[Prompt]<turn|>\n<|turn>model
E2B/E4B روشن <|turn>system\n<|think|><turn|>\n<|turn>user\n[Prompt]<turn|>\n<|turn>model
۲۶ب/۳۱ب خاموش ⚠️ <|turn>user\n[Prompt]<turn|>\n<|turn>model\n<|channel>thought\n<channel|>
۲۶ب/۳۱ب روشن <|turn>system\n<|think|><turn|>\n<|turn>user\n[Prompt]<turn|>\n<|turn>model
from transformers import TextStreamer

message = [
    {
        "role": "user", "content": "What is the water formula?"
    }
]

text = processor.apply_chat_template(message, tokenize=False, add_generation_prompt=True, enable_thinking=True)
inputs = processor(text=text, return_tensors="pt").to(model.device)
input_len = inputs["input_ids"].shape[-1]

streamer = TextStreamer(processor)
outputs = model.generate(**inputs, streamer=streamer, max_new_tokens=1024)
<bos><|turn>system
<|think|>
<turn|>
<|turn>user
What is the water formula?<turn|>
<|turn>model
<|channel>thought
Thinking Process:

1.  **Analyze the Request:** The user is asking "What is the water formula?"

2.  **Interpret Ambiguity:** The term "water formula" is highly ambiguous. It could mean several things:
    *   **Chemical Formula:** The molecular formula of water (\\(\text{H}_2\text{O}\\)).
    *   **Hydration Formula/Concept:** How water interacts with solutes or biological systems.
    *   **Water Cycle Formula:** A diagram or mathematical representation of evaporation, condensation, precipitation.
    *   **A Specific Context (e.g., cooking, chemistry lab, a specific theory):** The user might be referring to something specific that I don't have context for.

3.  **Determine the Most Likely Interpretation (The Default Answer):** In a general knowledge context, when someone asks for a "formula" related to water, they usually mean the chemical structure.

4.  **Formulate the Chemical Answer:**
    *   Chemical Name: Water
    *   Chemical Formula: \\(\text{H}_2\text{O}\\)
    *   Structure: Two hydrogen atoms bonded to one oxygen atom.

5.  **Address Other Possible Interpretations (To ensure a complete answer):** Since the term is vague, I should acknowledge the different contexts.

6.  **Draft the Response:**
    *   Start with the most direct, scientific answer (the chemical formula).
    *   Explain what the formula means.
    *   Briefly mention other related concepts (like the water cycle) if necessary, but focus on the primary meaning.
    *   Ask for clarification if the user meant something else.

7.  **Review against Identity Constraints:** (I am Gemma 4, LLM, open weights, no tools needed). The answer is factual and does not violate any constraints.

8.  **Final Output Generation.** (Self-Correction during drafting: Ensure the chemical notation is clear.)<channel|>The term "water formula" can refer to a few different things, depending on the context. Here are the most common interpretations:

### 1. Chemical Formula (The most likely answer)

If you are referring to the **chemical structure** of water (\\(\text{H}_2\text{O}\\)), the formula is:


$$\text{H}_2\text{O}$$


**What this means:**

*   **H** stands for Hydrogen.
*   **O** stands for Oxygen.
*   The formula indicates that one molecule of water is composed of **two hydrogen atoms** chemically bonded to **one oxygen atom**.

***

### 2. The Water Cycle Formula (Scientific Process)

If you are referring to the **water cycle**, which describes the continuous movement of water on, above, and below the surface of the Earth, there isn't a single mathematical formula, but rather a series of physical processes:

*   **Evaporation:** Liquid water turns into water vapor (gas) due to heat.
*   **Condensation:** Water vapor cools and turns back into liquid water, forming clouds.
*   **Precipitation:** Water falls back to Earth in the form of rain, snow, sleet, or hail.
*   **Collection/Runoff:** Water gathers in rivers, lakes, oceans, or soaks into the ground.

***

### 3. Hydration Formula (Biology/Chemistry)

In a biological or chemical context, the "formula" might refer to how water interacts with other substances, such as:

*   **Solubility:** The ability of water to dissolve other substances (often described by "like dissolves like").
*   **Hydration Shells:** The arrangement of water molecules surrounding an ion or molecule.

**Could you please provide a little more context?** Knowing whether you are asking about chemistry, biology, or a specific concept will help me give you the exact formula you are looking for!<turn|>

پس از تولید متن، پاسخ شامل بلوک‌های استدلال و پاسخ نهایی خواهد بود که توسط توکن‌های ویژه محدود شده‌اند. می‌توانید از ابزار parse_response برای استخراج آسان آنها در یک دیکشنری حاوی thinking و answer استفاده کنید.

response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)
result = processor.parse_response(response)

for key, value in result.items():
  if key == "role":
    print(f"Role: {value}")
  elif key == "thinking":
    print(f"\n=== Thoughts ===\n{value}")
  elif key == "content":
    print(f"\n=== Answer ===\n{value}")
  elif key == "tool_calls":
    print(f"\n=== Tool Calls ===\n{value}")
  else:
    print(f"\n{key}: {value}...\n")
Role: assistant

=== Thoughts ===
Thinking Process:

1.  **Analyze the Request:** The user is asking "What is the water formula?"

2.  **Interpret Ambiguity:** The term "water formula" is highly ambiguous. It could mean several things:
    *   **Chemical Formula:** The molecular formula of water (\\(\text{H}_2\text{O}\\)).
    *   **Hydration Formula/Concept:** How water interacts with solutes or biological systems.
    *   **Water Cycle Formula:** A diagram or mathematical representation of evaporation, condensation, precipitation.
    *   **A Specific Context (e.g., cooking, chemistry lab, a specific theory):** The user might be referring to something specific that I don't have context for.

3.  **Determine the Most Likely Interpretation (The Default Answer):** In a general knowledge context, when someone asks for a "formula" related to water, they usually mean the chemical structure.

4.  **Formulate the Chemical Answer:**
    *   Chemical Name: Water
    *   Chemical Formula: \\(\text{H}_2\text{O}\\)
    *   Structure: Two hydrogen atoms bonded to one oxygen atom.

5.  **Address Other Possible Interpretations (To ensure a complete answer):** Since the term is vague, I should acknowledge the different contexts.

6.  **Draft the Response:**
    *   Start with the most direct, scientific answer (the chemical formula).
    *   Explain what the formula means.
    *   Briefly mention other related concepts (like the water cycle) if necessary, but focus on the primary meaning.
    *   Ask for clarification if the user meant something else.

7.  **Review against Identity Constraints:** (I am Gemma 4, LLM, open weights, no tools needed). The answer is factual and does not violate any constraints.

8.  **Final Output Generation.** (Self-Correction during drafting: Ensure the chemical notation is clear.)

=== Answer ===
The term "water formula" can refer to a few different things, depending on the context. Here are the most common interpretations:

### 1. Chemical Formula (The most likely answer)

If you are referring to the **chemical structure** of water (\\(\text{H}_2\text{O}\\)), the formula is:


$$\text{H}_2\text{O}$$


**What this means:**

*   **H** stands for Hydrogen.
*   **O** stands for Oxygen.
*   The formula indicates that one molecule of water is composed of **two hydrogen atoms** chemically bonded to **one oxygen atom**.

***

### 2. The Water Cycle Formula (Scientific Process)

If you are referring to the **water cycle**, which describes the continuous movement of water on, above, and below the surface of the Earth, there isn't a single mathematical formula, but rather a series of physical processes:

*   **Evaporation:** Liquid water turns into water vapor (gas) due to heat.
*   **Condensation:** Water vapor cools and turns back into liquid water, forming clouds.
*   **Precipitation:** Water falls back to Earth in the form of rain, snow, sleet, or hail.
*   **Collection/Runoff:** Water gathers in rivers, lakes, oceans, or soaks into the ground.

***

### 3. Hydration Formula (Biology/Chemistry)

In a biological or chemical context, the "formula" might refer to how water interacts with other substances, such as:

*   **Solubility:** The ability of water to dissolve other substances (often described by "like dissolves like").
*   **Hydration Shells:** The arrangement of water molecules surrounding an ion or molecule.

**Could you please provide a little more context?** Knowing whether you are asking about chemistry, biology, or a specific concept will help me give you the exact formula you are looking for!

مثال چند نوبتی با حذف افکار

مدیریت صحیح افکار تولید شده توسط مدل برای حفظ عملکرد در مکالمات چند نوبتی بسیار مهم است.

  • مکالمات چند نوبتی استاندارد: شما باید افکار تولید شده مدل را از نوبت قبلی قبل از ارسال تاریخچه مکالمه به مدل برای نوبت بعدی حذف کنید (از بین ببرید). اگر می‌خواهید حالت تفکر را در اواسط مکالمه غیرفعال کنید، می‌توانید هنگام حذف افکار قبلی، توکن <|think|> را حذف کنید.
  • فراخوانی تابع (استثنا): اگر یک نوبت مدل شامل فراخوانی تابع یا ابزار باشد، نباید بین فراخوانی‌های تابع، ایده‌ها حذف شوند.
  • نگهداری تاریخچه مکالمات: خروجی مدل تاریخچه فقط باید شامل پاسخ نهایی باشد. اطمینان حاصل کنید که هیچ فکر تولید شده از نوبت‌های قبلی قبل از شروع نوبت بعدی کاربر در پنجره زمینه باقی نمانده باشد.
# Append the clean response to the message history
message.append({"role": "assistant", "content": result["content"]})

# ==========================================
# TURN 2
# ==========================================
print("\n--- Turn 2 ---")
# Add the next user query to the history
message.append({
    "role": "user",
    "content": "What is its boiling point in Celsius?"
})

text = processor.apply_chat_template(message, tokenize=False, add_generation_prompt=True, enable_thinking=True)
inputs = processor(text=text, return_tensors="pt").to(model.device)
input_len = inputs["input_ids"].shape[-1]

streamer = TextStreamer(processor)
outputs = model.generate(**inputs, streamer=streamer, max_new_tokens=1024)
--- Turn 2 ---
<bos><|turn>system
<|think|>
<turn|>
<|turn>user
What is the water formula?<turn|>
<|turn>model
The term "water formula" can refer to a few different things, depending on the context. Here are the most common interpretations:

### 1. Chemical Formula (The most likely answer)

If you are referring to the **chemical structure** of water (\\(\text{H}_2\text{O}\\)), the formula is:


$$\text{H}_2\text{O}$$


**What this means:**

*   **H** stands for Hydrogen.
*   **O** stands for Oxygen.
*   The formula indicates that one molecule of water is composed of **two hydrogen atoms** chemically bonded to **one oxygen atom**.

***

### 2. The Water Cycle Formula (Scientific Process)

If you are referring to the **water cycle**, which describes the continuous movement of water on, above, and below the surface of the Earth, there isn't a single mathematical formula, but rather a series of physical processes:

*   **Evaporation:** Liquid water turns into water vapor (gas) due to heat.
*   **Condensation:** Water vapor cools and turns back into liquid water, forming clouds.
*   **Precipitation:** Water falls back to Earth in the form of rain, snow, sleet, or hail.
*   **Collection/Runoff:** Water gathers in rivers, lakes, oceans, or soaks into the ground.

***

### 3. Hydration Formula (Biology/Chemistry)

In a biological or chemical context, the "formula" might refer to how water interacts with other substances, such as:

*   **Solubility:** The ability of water to dissolve other substances (often described by "like dissolves like").
*   **Hydration Shells:** The arrangement of water molecules surrounding an ion or molecule.

**Could you please provide a little more context?** Knowing whether you are asking about chemistry, biology, or a specific concept will help me give you the exact formula you are looking for!<turn|>
<|turn>user
What is its boiling point in Celsius?<turn|>
<|turn>model
<|channel>thought
Thinking Process:

1.  **Identify the core question:** The user is asking for the boiling point of water in Celsius.
2.  **Recall/Verify the standard scientific fact:** The standard boiling point of pure water at standard atmospheric pressure is a well-known physical constant.
3.  **State the answer clearly:** The boiling point of water is \\(100^\circ\text{C}\\).
4.  **Add necessary context/caveats (Crucial step for accuracy):** Boiling point is dependent on pressure. I must mention this to provide a complete and scientifically accurate answer.
    *   What happens at different altitudes (lower pressure)? (Boiling point drops).
    *   What happens under high pressure? (Boiling point increases).
5.  **Final Review:** The answer is direct, accurate, and provides essential context. (Self-correction: Ensure the units are explicitly stated as Celsius.)<channel|>The boiling point of pure water at standard atmospheric pressure (sea level) is:

$$\mathbf{100^\circ\text{C} }$$ (Degrees Celsius)

**Important Note:**

The boiling point of water is dependent on **pressure**.

*   **Higher Pressure:** If the pressure above the water increases (e.g., in a pressure cooker), the boiling point will be **higher** than \\(100^\circ\text{C}\\).
*   **Lower Pressure:** If the pressure decreases (e.g., at high altitudes), the boiling point will be **lower** than \\(100^\circ\text{C}\\).<turn|>
response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)
result = processor.parse_response(response)

for key, value in result.items():
  if key == "role":
    print(f"Role: {value}")
  elif key == "thinking":
    print(f"\n=== Thoughts ===\n{value}")
  elif key == "content":
    print(f"\n=== Answer ===\n{value}")
  elif key == "tool_calls":
    print(f"\n=== Tool Calls ===\n{value}")
  else:
    print(f"\n{key}: {value}...\n")
Role: assistant

=== Thoughts ===
Thinking Process:

1.  **Identify the core question:** The user is asking for the boiling point of water in Celsius.
2.  **Recall/Verify the standard scientific fact:** The standard boiling point of pure water at standard atmospheric pressure is a well-known physical constant.
3.  **State the answer clearly:** The boiling point of water is \\(100^\circ\text{C}\\).
4.  **Add necessary context/caveats (Crucial step for accuracy):** Boiling point is dependent on pressure. I must mention this to provide a complete and scientifically accurate answer.
    *   What happens at different altitudes (lower pressure)? (Boiling point drops).
    *   What happens under high pressure? (Boiling point increases).
5.  **Final Review:** The answer is direct, accurate, and provides essential context. (Self-correction: Ensure the units are explicitly stated as Celsius.)

=== Answer ===
The boiling point of pure water at standard atmospheric pressure (sea level) is:

$$\mathbf{100^\circ\text{C} }$$ (Degrees Celsius)

**Important Note:**

The boiling point of water is dependent on **pressure**.

*   **Higher Pressure:** If the pressure above the water increases (e.g., in a pressure cooker), the boiling point will be **higher** than \\(100^\circ\text{C}\\).
*   **Lower Pressure:** If the pressure decreases (e.g., at high altitudes), the boiling point will be **lower** than \\(100^\circ\text{C}\\).

استنتاج تک تصویری

روش استفاده از مدل تفکر با داده‌های بصری بسیار مشابه است. می‌توانید یک تصویر را به عنوان بخشی از آرایه messages ارائه دهید. فقط مطمئن شوید که تصویر را به همراه متن قالب‌بندی شده به پردازنده ارسال می‌کنید و مدل قبل از پاسخ دادن، ورودی بصری را استدلال می‌کند.

from PIL import Image
import matplotlib.pyplot as plt

prompt = "What is shown in this image?"
image_url = "https://raw.githubusercontent.com/google-gemma/cookbook/refs/heads/main/apps/sample-data/GoldenGate.png"

# download image
!wget -q {image_url} -O image.png
image = Image.open("image.png")

# Display all images
print("=== Downloaded image ===")
fig, ax = plt.subplots(1, 1, figsize=(5, 5))
ax.imshow(image)
ax.set_title("Image 1")
ax.axis("off")
plt.tight_layout()
plt.show()

message = [
    {
        "role": "user", "content": [
          {"type": "image"},
          {"type": "text", "text": prompt}
        ]
    }
]

text = processor.apply_chat_template(message, tokenize=False, add_generation_prompt=True, enable_thinking=True)
inputs = processor(text=text, images=image, return_tensors="pt").to(model.device)
input_len = inputs["input_ids"].shape[-1]

outputs = model.generate(**inputs, max_new_tokens=1024)
response = processor.decode(outputs[0][input_len:], skip_special_tokens=False)

result = processor.parse_response(response)

for key, value in result.items():
  if key == "role":
    print(f"Role: {value}")
  elif key == "thinking":
    print(f"\n=== Thoughts ===\n{value}")
  elif key == "content":
    print(f"\n=== Answer ===\n{value}")
  elif key == "tool_calls":
    print(f"\n=== Tool Calls ===\n{value}")
  else:
    print(f"\n{key}: {value}...\n")
=== Downloaded image ===

png

Role: assistant

=== Thoughts ===
Here's a thinking process to arrive at the suggested answer:

1.  **Analyze the Image:**
    *   **Dominant Structure:** The most prominent feature is a massive suspension bridge with tall red towers (pylons). This is immediately recognizable as the Golden Gate Bridge.
    *   **Foreground/Midground:** There is water (looks like the San Francisco Bay/Pacific Ocean). There is a breakwater/rocky shoreline in the immediate foreground. A small rock formation is visible in the water.
    *   **Background/Landmass:** Hills/mountains are visible in the distance, behind the bridge structure.
    *   **Other Elements:** There is a low-lying building/structure on the left (part of the bridge infrastructure or a related facility). The sky is clear and blue.

2.  **Identify Key Subject:** The central subject is the Golden Gate Bridge.

3.  **Determine Setting/Location (Contextualization):** Since it's the Golden Gate Bridge, the location is San Francisco, California.

4.  **Synthesize the Description (Drafting the Answer):** Start with the main subject and then add details about the setting.

    *   *Initial thought:* It's a picture of the Golden Gate Bridge over the water.
    *   *Refinement (Adding detail):* It shows the iconic red suspension bridge spanning a body of water, with land/hills in the background.

5.  **Review the Provided Text (Self-Correction/Verification):** The prompt includes some garbled text ("ołat" at the end), but the visual evidence is overwhelming. The task is simply to describe the image.

6.  **Final Output Generation:** (This leads to the structured, informative response.) (The final response should clearly identify the landmark.)

=== Answer ===
The image shows the **Golden Gate Bridge** in San Francisco, California.

Key elements visible in the picture are:

* **The Golden Gate Bridge:** The iconic red suspension bridge dominates the frame, spanning a large body of water.
* **Water:** The foreground features dark blue water, likely the Pacific Ocean or the San Francisco Bay.
* **Landmass/Hills:** Hills and mountains are visible in the background behind the bridge.
* **Foreground Detail:** There is a rocky shoreline and a small rock formation in the water in the immediate foreground.

In summary, it is a scenic view of the famous Golden Gate Bridge.

خلاصه و مراحل بعدی

در این راهنما، شما یاد گرفتید که چگونه از قابلیت‌های تفکر مدل‌های Gemma 4 برای تولید فرآیندهای استدلال قبل از پاسخ‌های نهایی استفاده کنید. شما موارد زیر را پوشش دادید:

  • فعال کردن حالت تفکر با استفاده از enable_thinking=True در apply_chat_template .
  • استفاده از TextStreamer برای مشاهده فرآیند تفکر در زمان واقعی.
  • تجزیه خروجی ترکیبی به بلوک‌های thinking و answer جداگانه با استفاده از parse_response .
  • به کارگیری قابلیت‌های تفکر در وظایف چندوجهی (تصویر + متن).

مراحل بعدی

قابلیت‌های بیشتر Gemma 4 را بررسی کنید: