使用 Gemma 4 进行函数调用

在 ai.google.dev 上查看 在 Google Colab 中运行 在 Kaggle 中运行 在 Vertex AI 中打开 在 GitHub 上查看源代码

使用 Gemma 等生成式人工智能 (AI) 模型时,您可能希望使用该模型来操作编程接口,以便完成任务或回答问题。通过定义编程接口来指示模型,然后发出使用该接口的请求,这种做法称为函数调用

本指南介绍了在 Hugging Face 生态系统中如何使用 Gemma 4。

此笔记本将在 T4 GPU 上运行。

安装 Python 软件包

安装运行 Gemma 模型和发出请求所需的 Hugging Face 库。

# Install PyTorch & other libraries
pip install torch accelerate

# Install the transformers library
pip install transformers

加载模型

使用 transformers 库,通过 AutoProcessorAutoModelForImageTextToText 类创建 processormodel 的实例,如以下代码示例所示:

MODEL_ID = "google/gemma-4-E2B-it" # @param ["google/gemma-4-E2B-it","google/gemma-4-E4B-it", "google/gemma-4-31B-it", "google/gemma-4-26B-A4B-it"]

from transformers import AutoProcessor, AutoModelForMultimodalLM

model = AutoModelForMultimodalLM.from_pretrained(MODEL_ID, dtype="auto", device_map="auto")
processor = AutoProcessor.from_pretrained(MODEL_ID)
Loading weights:   0%|          | 0/2011 [00:00<?, ?it/s]

传递工具

您可以使用 apply_chat_template() 函数通过 tools 实参将工具传递给模型。您可以通过两种方法定义这些工具:

  • JSON 架构:您可以手动构建一个 JSON 字典,用于定义函数名称、说明和参数(包括类型和必需字段)。
  • 原始 Python 函数:您可以传递实际的 Python 函数。系统会通过解析函数的类型提示、实参和文档字符串来自动生成所需的 JSON 架构。为获得最佳效果,文档字符串应遵循 Google Python 样式指南

以下是包含 JSON 架构的示例。

from transformers import TextStreamer

weather_function_schema = {
    "type": "function",
    "function": {
        "name": "get_current_temperature",
        "description": "Gets the current temperature for a given location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city name, e.g. San Francisco",
                },
            },
            "required": ["location"],
        },
    }
}

message = [
    {
        "role": "system", "content": "You are a helpful assistant."
    },
    {
        "role": "user", "content": "What's the temperature in London?"
    }
]

text = processor.apply_chat_template(message, tools=[weather_function_schema], tokenize=False, add_generation_prompt=True)
inputs = processor(text=text, return_tensors="pt").to(model.device)
streamer = TextStreamer(processor)
outputs = model.generate(**inputs, streamer=streamer, max_new_tokens=64)
<bos><|turn>system
You are a helpful assistant.<|tool>declaration:get_current_temperature{description:<|"|>Gets the current temperature for a given location.<|"|>,parameters:{properties:{location:{description:<|"|>The city name, e.g. San Francisco<|"|>,type:<|"|>STRING<|"|>} },required:[<|"|>location<|"|>],type:<|"|>OBJECT<|"|>} }<tool|><turn|>
<|turn>user
What's the temperature in London?<turn|>
<|turn>model
<|tool_call>call:get_current_temperature{location:<|"|>London<|"|>}<tool_call|><|tool_response>

下面是使用原始 Python 函数的相同示例。

from transformers.utils import get_json_schema

def get_current_temperature(location: str):
    """
    Gets the current temperature for a given location.

    Args:
        location: The city name, e.g. San Francisco
    """
    return "15°C"

message = [
    {
        "role": "user", "content": "What's the temperature in London?"
    }
]

text = processor.apply_chat_template(message, tools=[get_json_schema(get_current_temperature)], tokenize=False, add_generation_prompt=True)
inputs = processor(text=text, return_tensors="pt").to(model.device)
streamer = TextStreamer(processor)
outputs = model.generate(**inputs, streamer=streamer, max_new_tokens=256)
<bos><|turn>system
<|tool>declaration:get_current_temperature{description:<|"|>Gets the current temperature for a given location.<|"|>,parameters:{properties:{location:{description:<|"|>The city name, e.g. San Francisco<|"|>,type:<|"|>STRING<|"|>} },required:[<|"|>location<|"|>],type:<|"|>OBJECT<|"|>} }<tool|><turn|>
<|turn>user
What's the temperature in London?<turn|>
<|turn>model
<|tool_call>call:get_current_temperature{location:<|"|>London<|"|>}<tool_call|><|tool_response>

完整的函数调用序列

本部分演示了将模型连接到外部工具的三阶段周期:模型轮次(生成函数调用对象)、开发者轮次(解析和执行代码,例如天气 API)以及最终回答(模型使用工具的输出回答用户)。

模型的回合

以下是用户提示 "Hey, what's the weather in Tokyo right now?" 和工具 [get_current_weather]。Gemma 生成的函数调用对象如下所示。

# Define a function that our model can use.
def get_current_weather(location: str, unit: str = "celsius"):
    """
    Gets the current weather in a given location.

    Args:
        location: The city and state, e.g. "San Francisco, CA" or "Tokyo, JP"
        unit: The unit to return the temperature in. (choices: ["celsius", "fahrenheit"])

    Returns:
        temperature: The current temperature in the given location
        weather: The current weather in the given location
    """
    return {"temperature": 15, "weather": "sunny"}

prompt = "Hey, what's the weather in Tokyo right now?"
tools = [get_current_weather]

message = [
    {
        "role": "system", "content": "You are a helpful assistant."
    },
    {
        "role": "user", "content": prompt
    },
]

text = processor.apply_chat_template(message, tools=tools, tokenize=False, add_generation_prompt=True)
inputs = processor(text=text, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
generated_tokens = out[0][len(inputs["input_ids"][0]):]
output = processor.decode(generated_tokens, skip_special_tokens=False)

print(f"Prompt: {prompt}")
print(f"Tools: {tools}")
print(f"Output: {output}")
Prompt: Hey, what's the weather in Tokyo right now?
Tools: [<function get_current_weather at 0x7cef824ece00>]
Output: <|tool_call>call:get_current_weather{location:<|"|>Tokyo, JP<|"|>}<tool_call|><|tool_response>

开发者回合

您的应用应解析模型的响应以提取函数名称和实参,并附加具有 assistant 角色的 tool_callstool_responses

import re
import json

def extract_tool_calls(text):
    def cast(v):
        try: return int(v)
        except:
            try: return float(v)
            except: return {'true': True, 'false': False}.get(v.lower(), v.strip("'\""))

    return [{
        "name": name,
        "arguments": {
            k: cast((v1 or v2).strip())
            for k, v1, v2 in re.findall(r'(\w+):(?:<\|"\|>(.*?)<\|"\|>|([^,}]*))', args)
        }
    } for name, args in re.findall(r"<\|tool_call>call:(\w+)\{(.*?)\}<tool_call\|>", text, re.DOTALL)]

calls = extract_tool_calls(output)
if calls:
    # Call the function and get the result
    #####################################
    # WARNING: This is a demonstration. #
    #####################################
    # Using globals() to call functions dynamically can be dangerous in
    # production. In a real application, you should implement a secure way to
    # map function names to actual function calls, such as a predefined
    # dictionary of allowed tools and their implementations.
    results = [
        {"name": c['name'], "response": globals()[c['name']](**c['arguments'])}
        for c in calls
    ]

    message.append({
        "role": "assistant",
        "tool_calls": [
            {"function": call} for call in calls
        ],
        "tool_responses": results
    })
    print(json.dumps(message[-1], indent=2))
{
  "role": "assistant",
  "tool_calls": [
    {
      "function": {
        "name": "get_current_weather",
        "arguments": {
          "location": "Tokyo, JP"
        }
      }
    }
  ],
  "tool_responses": [
    {
      "name": "get_current_weather",
      "response": {
        "temperature": 15,
        "weather": "sunny"
      }
    }
  ]
}
"tool_responses": [
  {
    "name": function_name,
    "response": function_response
  }
]

如果存在多个独立请求:

"tool_responses": [
  {
    "name": function_name_1,
    "response": function_response_1
  },
  {
    "name": function_name_2,
    "response": function_response_2
  }
]

最终回答

最后,Gemma 读取工具响应并回复用户。

text = processor.apply_chat_template(message, tools=tools, tokenize=False, add_generation_prompt=True)
inputs = processor(text=text, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
generated_tokens = out[0][len(inputs["input_ids"][0]):]
output = processor.decode(generated_tokens, skip_special_tokens=True)
print(f"Output: {output}")
message[-1]["content"] = output
Output: The current weather in Tokyo is 15 degrees and sunny.

您可以在下方查看完整的聊天记录。

# full history
print(json.dumps(message, indent=2))

print("-"*80)
output = processor.decode(out[0], skip_special_tokens=False)
print(f"Output: {output}")
[
  {
    "role": "system",
    "content": "You are a helpful assistant."
  },
  {
    "role": "user",
    "content": "Hey, what's the weather in Tokyo right now?"
  },
  {
    "role": "assistant",
    "tool_calls": [
      {
        "function": {
          "name": "get_current_weather",
          "arguments": {
            "location": "Tokyo, JP"
          }
        }
      }
    ],
    "tool_responses": [
      {
        "name": "get_current_weather",
        "response": {
          "temperature": 15,
          "weather": "sunny"
        }
      }
    ],
    "content": "The current weather in Tokyo is 15 degrees and sunny."
  }
]
--------------------------------------------------------------------------------
Output: <bos><|turn>system
You are a helpful assistant.<|tool>declaration:get_current_weather{description:<|"|>Gets the current weather in a given location.<|"|>,parameters:{properties:{location:{description:<|"|>The city and state, e.g. "San Francisco, CA" or "Tokyo, JP"<|"|>,type:<|"|>STRING<|"|>},unit:{description:<|"|>The unit to return the temperature in.<|"|>,enum:[<|"|>celsius<|"|>,<|"|>fahrenheit<|"|>],type:<|"|>STRING<|"|>} },required:[<|"|>location<|"|>],type:<|"|>OBJECT<|"|>} }<tool|><turn|>
<|turn>user
Hey, what's the weather in Tokyo right now?<turn|>
<|turn>model
<|tool_call>call:get_current_weather{location:<|"|>Tokyo, JP<|"|>}<tool_call|><|tool_response>response:get_current_weather{temperature:15,weather:<|"|>sunny<|"|>}<tool_response|>The current weather in Tokyo is 15 degrees and sunny.<turn|>

包含思路的函数调用

通过利用内部推理流程,模型可显著提高函数调用准确率。这样一来,您就可以更精确地决定何时触发工具以及如何定义其参数。

prompt = "Hey, I'm in Seoul. Is it good for running now?"
message = [
    {
        "role": "system", "content": "You are a helpful assistant."
    },
    {
        "role": "user", "content": prompt
    },
]

text = processor.apply_chat_template(message, tools=tools, tokenize=False, add_generation_prompt=True, enable_thinking=True)
inputs = processor(text=text, return_tensors="pt").to(model.device)
input_len = inputs["input_ids"].shape[-1]

out = model.generate(**inputs, max_new_tokens=1024)
output = processor.decode(out[0][input_len:], skip_special_tokens=False)
result = processor.parse_response(output)

for key, value in result.items():
  if key == "role":
    print(f"Role: {value}")
  elif key == "thinking":
    print(f"\n=== Thoughts ===\n{value}")
  elif key == "content":
    print(f"\n=== Answer ===\n{value}")
  elif key == "tool_calls":
    print(f"\n=== Tool Calls ===\n{value}")
  else:
    print(f"\n{key}: {value}...\n")
Role: assistant

=== Thoughts ===

1.  **Analyze the Request:** The user is asking if it's "good for running now" in "Seoul".

2.  **Identify Necessary Information:** To determine if it's good for running, I need current weather information (temperature, precipitation, etc.) for Seoul.

3.  **Examine Available Tools:** The available tool is `get_current_weather(location, unit)`.

4.  **Determine Tool Arguments:**
    *   `location`: The user specified "Seoul".
    *   `unit`: The user did not specify a unit (Celsius or Fahrenheit).

5.  **Formulate the Tool Call:** I need to call `get_current_weather` with the location. Since the user didn't specify a unit, I can either omit it (if the tool defaults are acceptable) or choose a common one. However, the tool definition requires `location` but `unit` is optional.

6.  **Construct the Response Strategy:**
    *   Call the tool to get the weather data for Seoul.
    *   Once the data is received, I can advise the user on whether it's suitable for running.

7.  **Generate Tool Call:**

    ```json
    {
      "toolSpec": {
        "name": "get_current_weather",
        "args": {
          "location": "Seoul"
        }
      }
    }
    ```
    (Self-correction: The `unit` parameter is optional in the definition, so just providing the location is sufficient to proceed.)

8.  **Final Output Generation:** Present the tool call to the user/system.

=== Tool Calls ===
[{'type': 'function', 'function': {'name': 'get_current_weather', 'arguments': {'location': 'Seoul'} } }]

处理工具调用并获取最终答案。

calls = extract_tool_calls(output)
if calls:
    # Call the function and get the result
    #####################################
    # WARNING: This is a demonstration. #
    #####################################
    # Using globals() to call functions dynamically can be dangerous in
    # production. In a real application, you should implement a secure way to
    # map function names to actual function calls, such as a predefined
    # dictionary of allowed tools and their implementations.
    results = [
        {"name": c['name'], "response": globals()[c['name']](**c['arguments'])}
        for c in calls
    ]

    message.append({
        "role": "assistant",
        "tool_calls": [
            {"function": call} for call in calls
        ],
        "tool_responses": results
    })

text = processor.apply_chat_template(message, tools=tools, tokenize=False, add_generation_prompt=True)
inputs = processor(text=text, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
generated_tokens = out[0][len(inputs["input_ids"][0]):]
output = processor.decode(generated_tokens, skip_special_tokens=True)
print(f"Output: {output}")
message[-1]["content"] = output

print("-"*80)
print("Full History")
print("-"*80)
print(json.dumps(message, indent=2))
Output: The current weather in Seoul is 15 degrees Celsius and sunny. That sounds like great weather for a run!
--------------------------------------------------------------------------------
Full History
--------------------------------------------------------------------------------
[
  {
    "role": "system",
    "content": "You are a helpful assistant."
  },
  {
    "role": "user",
    "content": "Hey, I'm in Seoul. Is it good for running now?"
  },
  {
    "role": "assistant",
    "tool_calls": [
      {
        "function": {
          "name": "get_current_weather",
          "arguments": {
            "location": "Seoul"
          }
        }
      }
    ],
    "tool_responses": [
      {
        "name": "get_current_weather",
        "response": {
          "temperature": 15,
          "weather": "sunny"
        }
      }
    ],
    "content": "The current weather in Seoul is 15 degrees Celsius and sunny. That sounds like great weather for a run!"
  }
]

重要注意事项:自动架构与手动架构

如果依赖于从 Python 函数到 JSON 架构的自动转换,生成的输出可能并不总是能满足有关复杂参数的特定预期。

如果函数使用自定义对象(例如 Config 类)作为实参,自动转换器可能会简单地将其描述为泛型“对象”,而不会详细说明其内部属性。

在这些情况下,最好手动定义 JSON 架构,以确保为模型明确定义嵌套属性(例如配置对象中的主题或 font_size)。

import json
from transformers.utils import get_json_schema

class Config:
    def __init__(self):
        self.theme = "light"
        self.font_size = 14

def update_config(config: Config):
    """
    Updates the configuration of the system.

    Args:
        config: A Config object

    Returns:
        True if the configuration was successfully updated, False otherwise.
    """

update_config_schema = {
    "type": "function",
    "function": {
        "name": "update_config",
        "description": "Updates the configuration of the system.",
        "parameters": {
            "type": "object",
            "properties": {
                "config": {
                    "type": "object",
                    "description": "A Config object",
                    "properties": {"theme": {"type": "string"}, "font_size": {"type": "number"} },
                    },
                },
            "required": ["config"],
            },
        },
    }

print(f"--- [Automatic] ---")
print(json.dumps(get_json_schema(update_config), indent=2))

print(f"\n--- [Manual Schemas] ---")
print(json.dumps(update_config_schema, indent=2))
--- [Automatic] ---
{
  "type": "function",
  "function": {
    "name": "update_config",
    "description": "Updates the configuration of the system.",
    "parameters": {
      "type": "object",
      "properties": {
        "config": {
          "type": "object",
          "description": "A Config object"
        }
      },
      "required": [
        "config"
      ]
    }
  }
}

--- [Manual Schemas] ---
{
  "type": "function",
  "function": {
    "name": "update_config",
    "description": "Updates the configuration of the system.",
    "parameters": {
      "type": "object",
      "properties": {
        "config": {
          "type": "object",
          "description": "A Config object",
          "properties": {
            "theme": {
              "type": "string"
            },
            "font_size": {
              "type": "number"
            }
          }
        }
      },
      "required": [
        "config"
      ]
    }
  }
}

总结与后续步骤

您已了解如何构建可使用 Gemma 4 调用函数的应用。工作流程通过一个四阶段周期来建立:

  1. 定义工具:创建模型可以使用的函数,指定实参和说明(例如,天气查询函数)。
  2. 模型轮次:模型接收用户的提示和可用工具列表,并返回结构化的函数调用对象,而不是纯文本。
  3. 开发者回合:开发者使用正则表达式解析此输出,以提取函数名称和实参,执行实际的 Python 代码,并使用特定的工具角色将结果附加到聊天记录中。
  4. 最终回答:模型处理工具的执行结果,为用户生成最终的自然语言回答。

如需进一步了解,请参阅以下文档。