|
|
Jalankan di Google Colab
|
|
|
Lihat sumber di GitHub
|
Saat menggunakan model kecerdasan buatan (AI) generatif seperti Gemma, Anda mungkin ingin menggunakan model tersebut untuk mengoperasikan antarmuka pemrograman guna menyelesaikan tugas atau menjawab pertanyaan. Memberi petunjuk pada model dengan menentukan antarmuka pemrograman, lalu membuat permintaan yang menggunakan antarmuka tersebut disebut panggilan fungsi.
Panduan ini menunjukkan proses penggunaan Gemma 4 dalam ekosistem Hugging Face.
Notebook ini akan berjalan di GPU T4.
Menginstal paket Python
Instal library Hugging Face yang diperlukan untuk menjalankan model Gemma dan membuat permintaan.
# Install PyTorch & other librariespip install torch accelerate# Install the transformers librarypip install transformers
Memuat Model
Gunakan library transformers untuk membuat instance processor dan model menggunakan class AutoProcessor dan AutoModelForImageTextToText seperti yang ditunjukkan dalam contoh kode berikut:
MODEL_ID = "google/gemma-4-E2B-it" # @param ["google/gemma-4-E2B-it","google/gemma-4-E4B-it", "google/gemma-4-31B-it", "google/gemma-4-26B-A4B-it"]
from transformers import AutoProcessor, AutoModelForMultimodalLM
model = AutoModelForMultimodalLM.from_pretrained(MODEL_ID, dtype="auto", device_map="auto")
processor = AutoProcessor.from_pretrained(MODEL_ID)
Loading weights: 0%| | 0/2011 [00:00<?, ?it/s]
Alat Lulus Ujian
Anda dapat meneruskan alat ke model menggunakan fungsi apply_chat_template() melalui argumen tools. Ada dua metode untuk menentukan alat ini:
- Skema JSON: Anda dapat membuat kamus JSON secara manual yang menentukan nama, deskripsi, dan parameter fungsi (termasuk jenis dan kolom wajib diisi).
- Fungsi Python Mentah: Anda dapat meneruskan fungsi Python sebenarnya. Sistem secara otomatis membuat skema JSON yang diperlukan dengan mengurai petunjuk jenis, argumen, dan docstring fungsi. Untuk hasil terbaik, docstring harus mematuhi Panduan Gaya Python Google.
Berikut adalah contoh dengan skema JSON.
from transformers import TextStreamer
weather_function_schema = {
"type": "function",
"function": {
"name": "get_current_temperature",
"description": "Gets the current temperature for a given location.",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city name, e.g. San Francisco",
},
},
"required": ["location"],
},
}
}
message = [
{
"role": "system", "content": "You are a helpful assistant."
},
{
"role": "user", "content": "What's the temperature in London?"
}
]
text = processor.apply_chat_template(message, tools=[weather_function_schema], tokenize=False, add_generation_prompt=True)
inputs = processor(text=text, return_tensors="pt").to(model.device)
streamer = TextStreamer(processor)
outputs = model.generate(**inputs, streamer=streamer, max_new_tokens=64)
<bos><|turn>system
You are a helpful assistant.<|tool>declaration:get_current_temperature{description:<|"|>Gets the current temperature for a given location.<|"|>,parameters:{properties:{location:{description:<|"|>The city name, e.g. San Francisco<|"|>,type:<|"|>STRING<|"|>} },required:[<|"|>location<|"|>],type:<|"|>OBJECT<|"|>} }<tool|><turn|>
<|turn>user
What's the temperature in London?<turn|>
<|turn>model
<|tool_call>call:get_current_temperature{location:<|"|>London<|"|>}<tool_call|><|tool_response>
Dan contoh yang sama dengan fungsi Python mentah.
from transformers.utils import get_json_schema
def get_current_temperature(location: str):
"""
Gets the current temperature for a given location.
Args:
location: The city name, e.g. San Francisco
"""
return "15°C"
message = [
{
"role": "user", "content": "What's the temperature in London?"
}
]
text = processor.apply_chat_template(message, tools=[get_json_schema(get_current_temperature)], tokenize=False, add_generation_prompt=True)
inputs = processor(text=text, return_tensors="pt").to(model.device)
streamer = TextStreamer(processor)
outputs = model.generate(**inputs, streamer=streamer, max_new_tokens=256)
<bos><|turn>system
<|tool>declaration:get_current_temperature{description:<|"|>Gets the current temperature for a given location.<|"|>,parameters:{properties:{location:{description:<|"|>The city name, e.g. San Francisco<|"|>,type:<|"|>STRING<|"|>} },required:[<|"|>location<|"|>],type:<|"|>OBJECT<|"|>} }<tool|><turn|>
<|turn>user
What's the temperature in London?<turn|>
<|turn>model
<|tool_call>call:get_current_temperature{location:<|"|>London<|"|>}<tool_call|><|tool_response>
Urutan lengkap panggilan fungsi
Bagian ini menunjukkan siklus tiga tahap untuk menghubungkan model ke alat eksternal: Giliran Model untuk membuat objek panggilan fungsi, Giliran Developer untuk mengurai dan mengeksekusi kode (seperti API cuaca), dan Respons Akhir saat model menggunakan output alat untuk menjawab pengguna.
Giliran Model
Berikut perintah pengguna "Hey, what's the weather in Tokyo right now?", dan alat [get_current_weather]. Gemma menghasilkan objek panggilan fungsi sebagai berikut.
# Define a function that our model can use.
def get_current_weather(location: str, unit: str = "celsius"):
"""
Gets the current weather in a given location.
Args:
location: The city and state, e.g. "San Francisco, CA" or "Tokyo, JP"
unit: The unit to return the temperature in. (choices: ["celsius", "fahrenheit"])
Returns:
temperature: The current temperature in the given location
weather: The current weather in the given location
"""
return {"temperature": 15, "weather": "sunny"}
prompt = "Hey, what's the weather in Tokyo right now?"
tools = [get_current_weather]
message = [
{
"role": "system", "content": "You are a helpful assistant."
},
{
"role": "user", "content": prompt
},
]
text = processor.apply_chat_template(message, tools=tools, tokenize=False, add_generation_prompt=True)
inputs = processor(text=text, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
generated_tokens = out[0][len(inputs["input_ids"][0]):]
output = processor.decode(generated_tokens, skip_special_tokens=False)
print(f"Prompt: {prompt}")
print(f"Tools: {tools}")
print(f"Output: {output}")
Prompt: Hey, what's the weather in Tokyo right now?
Tools: [<function get_current_weather at 0x7cef824ece00>]
Output: <|tool_call>call:get_current_weather{location:<|"|>Tokyo, JP<|"|>}<tool_call|><|tool_response>
Developer's Turn
Aplikasi Anda harus mem-parsing respons model untuk mengekstrak nama dan argumen fungsi, serta menambahkan tool_calls dan tool_responses dengan peran assistant.
import re
import json
def extract_tool_calls(text):
def cast(v):
try: return int(v)
except:
try: return float(v)
except: return {'true': True, 'false': False}.get(v.lower(), v.strip("'\""))
return [{
"name": name,
"arguments": {
k: cast((v1 or v2).strip())
for k, v1, v2 in re.findall(r'(\w+):(?:<\|"\|>(.*?)<\|"\|>|([^,}]*))', args)
}
} for name, args in re.findall(r"<\|tool_call>call:(\w+)\{(.*?)\}<tool_call\|>", text, re.DOTALL)]
calls = extract_tool_calls(output)
if calls:
# Call the function and get the result
#####################################
# WARNING: This is a demonstration. #
#####################################
# Using globals() to call functions dynamically can be dangerous in
# production. In a real application, you should implement a secure way to
# map function names to actual function calls, such as a predefined
# dictionary of allowed tools and their implementations.
results = [
{"name": c['name'], "response": globals()[c['name']](**c['arguments'])}
for c in calls
]
message.append({
"role": "assistant",
"tool_calls": [
{"function": call} for call in calls
],
"tool_responses": results
})
print(json.dumps(message[-1], indent=2))
{
"role": "assistant",
"tool_calls": [
{
"function": {
"name": "get_current_weather",
"arguments": {
"location": "Tokyo, JP"
}
}
}
],
"tool_responses": [
{
"name": "get_current_weather",
"response": {
"temperature": 15,
"weather": "sunny"
}
}
]
}
"tool_responses": [
{
"name": function_name,
"response": function_response
}
]
Jika ada beberapa permintaan independen:
"tool_responses": [
{
"name": function_name_1,
"response": function_response_1
},
{
"name": function_name_2,
"response": function_response_2
}
]
Respons Akhir
Terakhir, Gemma membaca respons alat dan membalas pengguna.
text = processor.apply_chat_template(message, tools=tools, tokenize=False, add_generation_prompt=True)
inputs = processor(text=text, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
generated_tokens = out[0][len(inputs["input_ids"][0]):]
output = processor.decode(generated_tokens, skip_special_tokens=True)
print(f"Output: {output}")
message[-1]["content"] = output
Output: The current weather in Tokyo is 15 degrees and sunny.
Anda dapat melihat histori percakapan lengkap di bawah.
# full history
print(json.dumps(message, indent=2))
print("-"*80)
output = processor.decode(out[0], skip_special_tokens=False)
print(f"Output: {output}")
[
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Hey, what's the weather in Tokyo right now?"
},
{
"role": "assistant",
"tool_calls": [
{
"function": {
"name": "get_current_weather",
"arguments": {
"location": "Tokyo, JP"
}
}
}
],
"tool_responses": [
{
"name": "get_current_weather",
"response": {
"temperature": 15,
"weather": "sunny"
}
}
],
"content": "The current weather in Tokyo is 15 degrees and sunny."
}
]
--------------------------------------------------------------------------------
Output: <bos><|turn>system
You are a helpful assistant.<|tool>declaration:get_current_weather{description:<|"|>Gets the current weather in a given location.<|"|>,parameters:{properties:{location:{description:<|"|>The city and state, e.g. "San Francisco, CA" or "Tokyo, JP"<|"|>,type:<|"|>STRING<|"|>},unit:{description:<|"|>The unit to return the temperature in.<|"|>,enum:[<|"|>celsius<|"|>,<|"|>fahrenheit<|"|>],type:<|"|>STRING<|"|>} },required:[<|"|>location<|"|>],type:<|"|>OBJECT<|"|>} }<tool|><turn|>
<|turn>user
Hey, what's the weather in Tokyo right now?<turn|>
<|turn>model
<|tool_call>call:get_current_weather{location:<|"|>Tokyo, JP<|"|>}<tool_call|><|tool_response>response:get_current_weather{temperature:15,weather:<|"|>sunny<|"|>}<tool_response|>The current weather in Tokyo is 15 degrees and sunny.<turn|>
Panggilan fungsi dengan Penalaran
Dengan memanfaatkan proses penalaran internal, model secara signifikan meningkatkan akurasi pemanggilan fungsinya. Hal ini memungkinkan pengambilan keputusan yang lebih tepat mengenai kapan harus memicu alat dan cara menentukan parameternya.
prompt = "Hey, I'm in Seoul. Is it good for running now?"
message = [
{
"role": "system", "content": "You are a helpful assistant."
},
{
"role": "user", "content": prompt
},
]
text = processor.apply_chat_template(message, tools=tools, tokenize=False, add_generation_prompt=True, enable_thinking=True)
inputs = processor(text=text, return_tensors="pt").to(model.device)
input_len = inputs["input_ids"].shape[-1]
out = model.generate(**inputs, max_new_tokens=1024)
output = processor.decode(out[0][input_len:], skip_special_tokens=False)
result = processor.parse_response(output)
for key, value in result.items():
if key == "role":
print(f"Role: {value}")
elif key == "thinking":
print(f"\n=== Thoughts ===\n{value}")
elif key == "content":
print(f"\n=== Answer ===\n{value}")
elif key == "tool_calls":
print(f"\n=== Tool Calls ===\n{value}")
else:
print(f"\n{key}: {value}...\n")
Role: assistant
=== Thoughts ===
1. **Analyze the Request:** The user is asking if it's "good for running now" in "Seoul".
2. **Identify Necessary Information:** To determine if it's good for running, I need current weather information (temperature, precipitation, etc.) for Seoul.
3. **Examine Available Tools:** The available tool is `get_current_weather(location, unit)`.
4. **Determine Tool Arguments:**
* `location`: The user specified "Seoul".
* `unit`: The user did not specify a unit (Celsius or Fahrenheit).
5. **Formulate the Tool Call:** I need to call `get_current_weather` with the location. Since the user didn't specify a unit, I can either omit it (if the tool defaults are acceptable) or choose a common one. However, the tool definition requires `location` but `unit` is optional.
6. **Construct the Response Strategy:**
* Call the tool to get the weather data for Seoul.
* Once the data is received, I can advise the user on whether it's suitable for running.
7. **Generate Tool Call:**
```json
{
"toolSpec": {
"name": "get_current_weather",
"args": {
"location": "Seoul"
}
}
}
```
(Self-correction: The `unit` parameter is optional in the definition, so just providing the location is sufficient to proceed.)
8. **Final Output Generation:** Present the tool call to the user/system.
=== Tool Calls ===
[{'type': 'function', 'function': {'name': 'get_current_weather', 'arguments': {'location': 'Seoul'} } }]
Memproses panggilan alat dan mendapatkan jawaban akhir.
calls = extract_tool_calls(output)
if calls:
# Call the function and get the result
#####################################
# WARNING: This is a demonstration. #
#####################################
# Using globals() to call functions dynamically can be dangerous in
# production. In a real application, you should implement a secure way to
# map function names to actual function calls, such as a predefined
# dictionary of allowed tools and their implementations.
results = [
{"name": c['name'], "response": globals()[c['name']](**c['arguments'])}
for c in calls
]
message.append({
"role": "assistant",
"tool_calls": [
{"function": call} for call in calls
],
"tool_responses": results
})
text = processor.apply_chat_template(message, tools=tools, tokenize=False, add_generation_prompt=True)
inputs = processor(text=text, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
generated_tokens = out[0][len(inputs["input_ids"][0]):]
output = processor.decode(generated_tokens, skip_special_tokens=True)
print(f"Output: {output}")
message[-1]["content"] = output
print("-"*80)
print("Full History")
print("-"*80)
print(json.dumps(message, indent=2))
Output: The current weather in Seoul is 15 degrees Celsius and sunny. That sounds like great weather for a run!
--------------------------------------------------------------------------------
Full History
--------------------------------------------------------------------------------
[
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Hey, I'm in Seoul. Is it good for running now?"
},
{
"role": "assistant",
"tool_calls": [
{
"function": {
"name": "get_current_weather",
"arguments": {
"location": "Seoul"
}
}
}
],
"tool_responses": [
{
"name": "get_current_weather",
"response": {
"temperature": 15,
"weather": "sunny"
}
}
],
"content": "The current weather in Seoul is 15 degrees Celsius and sunny. That sounds like great weather for a run!"
}
]
Peringatan Penting: Skema Otomatis vs. Manual
Saat mengandalkan konversi otomatis dari fungsi Python ke skema JSON, output yang dihasilkan mungkin tidak selalu memenuhi ekspektasi tertentu terkait parameter kompleks.
Jika fungsi menggunakan objek kustom (seperti class Config) sebagai argumen, pengonversi otomatis dapat mendeskripsikannya hanya sebagai "objek" generik tanpa menjelaskan properti internalnya.
Dalam kasus ini, lebih baik tentukan skema JSON secara manual untuk memastikan properti bertingkat (seperti tema atau font_size dalam objek config) ditentukan secara eksplisit untuk model.
import json
from transformers.utils import get_json_schema
class Config:
def __init__(self):
self.theme = "light"
self.font_size = 14
def update_config(config: Config):
"""
Updates the configuration of the system.
Args:
config: A Config object
Returns:
True if the configuration was successfully updated, False otherwise.
"""
update_config_schema = {
"type": "function",
"function": {
"name": "update_config",
"description": "Updates the configuration of the system.",
"parameters": {
"type": "object",
"properties": {
"config": {
"type": "object",
"description": "A Config object",
"properties": {"theme": {"type": "string"}, "font_size": {"type": "number"} },
},
},
"required": ["config"],
},
},
}
print(f"--- [Automatic] ---")
print(json.dumps(get_json_schema(update_config), indent=2))
print(f"\n--- [Manual Schemas] ---")
print(json.dumps(update_config_schema, indent=2))
--- [Automatic] ---
{
"type": "function",
"function": {
"name": "update_config",
"description": "Updates the configuration of the system.",
"parameters": {
"type": "object",
"properties": {
"config": {
"type": "object",
"description": "A Config object"
}
},
"required": [
"config"
]
}
}
}
--- [Manual Schemas] ---
{
"type": "function",
"function": {
"name": "update_config",
"description": "Updates the configuration of the system.",
"parameters": {
"type": "object",
"properties": {
"config": {
"type": "object",
"description": "A Config object",
"properties": {
"theme": {
"type": "string"
},
"font_size": {
"type": "number"
}
}
}
},
"required": [
"config"
]
}
}
}
Ringkasan dan langkah berikutnya
Anda telah menetapkan cara membangun aplikasi yang dapat memanggil fungsi dengan Gemma 4. Alur kerja ditetapkan melalui siklus empat tahap:
- Tentukan Alat: Buat fungsi yang dapat digunakan model Anda, dengan menentukan argumen dan deskripsi (misalnya, fungsi pencarian cuaca).
- Giliran Model: Model menerima perintah pengguna dan daftar alat yang tersedia, lalu menampilkan objek panggilan fungsi terstruktur, bukan teks biasa.
- Giliran Developer: Developer mengurai output ini menggunakan ekspresi reguler untuk mengekstrak nama dan argumen fungsi, menjalankan kode Python yang sebenarnya, dan menambahkan hasilnya ke histori chat menggunakan peran alat tertentu.
- Respons Akhir: Model memproses hasil eksekusi alat untuk menghasilkan jawaban akhir dalam bahasa alami bagi pengguna.
Lihat dokumentasi berikut untuk membaca lebih lanjut.
Jalankan di Google Colab
Lihat sumber di GitHub