Google AI Edge Portal 隆重推出：大規模基準測試 Edge AI。申請在非公開預先發布版期間要求存取權。

本頁面由 Cloud Translation API 翻譯而成。

Android 適用的 AI Edge 函式呼叫指南

AI Edge 函式呼叫 SDK (FC SDK) 是可讓開發人員使用裝置端 LLM 的函式呼叫功能的程式庫。函式呼叫可讓您將模型連結至外部工具和 API，讓模型可透過必要參數呼叫特定函式，執行實際動作。

使用 FC SDK 的 LLM 不僅會產生文字，還能產生結構化呼叫，以便執行動作的函式，例如搜尋最新資訊、設定鬧鐘或預訂。

本指南將逐步引導您完成基本操作，將 LLM Inference API 與 FC SDK 新增至 Android 應用程式。本指南的重點在於將函式呼叫功能新增至裝置端 LLM。如要進一步瞭解如何使用 LLM Inference API，請參閱 Android 版 LLM Inference 指南。

快速入門導覽課程

請按照下列步驟在 Android 應用程式中使用 FC SDK。本快速入門課程會使用 LLM Inference API 搭配 Hammer 2.1 (1.5B)。LLM Inference API 是針對高階 Android 裝置 (例如 Pixel 8 和 Samsung S23 以上機型) 進行最佳化，不支援裝置模擬器。

新增依附元件

FC SDK 使用 com.google.ai.edge.localagents:localagents-fc 程式庫，而 LLM Inference API 則使用 com.google.mediapipe:tasks-genai 程式庫。將這兩個依附元件新增至 Android 應用程式的 build.gradle 檔案：

dependencies {
    implementation 'com.google.mediapipe:tasks-genai:0.10.24'
    implementation 'com.google.ai.edge.localagents:localagents-fc:0.1.0'
}

如果是搭載 Android 12 (API 31) 以上版本的裝置，請新增原生 OpenCL 程式庫依附元件。詳情請參閱 uses-native-library 標記的說明文件。

將下列 uses-native-library 標記新增至 AndroidManifest.xml 檔案：

<uses-native-library android:name="libOpenCL.so" android:required="false"/>
<uses-native-library android:name="libOpenCL-car.so" android:required="false"/>
<uses-native-library android:name="libOpenCL-pixel.so" android:required="false"/>

下載模型

從 Hugging Face 下載 Hammer 1B 的 8 位元量化格式。如要進一步瞭解可用模型，請參閱模型說明文件。

將 hammer2.1_1.5b_q8_ekv4096.task 資料夾的內容推送至 Android 裝置。

$ adb shell rm -r /data/local/tmp/llm/ # Remove any previously loaded models
$ adb shell mkdir -p /data/local/tmp/llm/
$ adb push hammer2.1_1.5b_q8_ekv4096.task /data/local/tmp/llm/hammer2.1_1.5b_q8_ekv4096.task

宣告函式定義

定義要提供給模型的函式。為說明這個程序，本快速入門會將兩個函式做為靜態方法，以便傳回硬式編碼的回應。更實用的實作方式會定義呼叫 REST API 或從資料庫擷取資訊的函式。

以下定義 getWeather 和 getTime 函式：

class ToolsForLlm {
    public static String getWeather(String location) {
        return "Cloudy, 56°F";
    }

    public static String getTime(String timezone) {
        return "7:00 PM " + timezone;
    }

    private ToolsForLlm() {}
}

使用 FunctionDeclaration 說明每個函式，為每個函式提供名稱和說明，並指定類型。這會告知模型函式的功能，以及何時要呼叫函式。

var getWeather = FunctionDeclaration.newBuilder()
    .setName("getWeather")
    .setDescription("Returns the weather conditions at a location.")
    .setParameters(
        Schema.newBuilder()
            .setType(Type.OBJECT)
            .putProperties(
                "location",
                Schema.newBuilder()
                    .setType(Type.STRING)
                    .setDescription("The location for the weather report.")
                    .build())
            .build())
    .build();
var getTime = FunctionDeclaration.newBuilder()
    .setName("getTime")
    .setDescription("Returns the current time in the given timezone.")

    .setParameters(
        Schema.newBuilder()
            .setType(Type.OBJECT)
            .putProperties(
                "timezone",
                Schema.newBuilder()
                    .setType(Type.STRING)
                    .setDescription("The timezone to get the time from.")
                    .build())
            .build())
    .build();

將函式宣告新增至 Tool 物件：

var tool = Tool.newBuilder()
    .addFunctionDeclarations(getWeather)
    .addFunctionDeclarations(getTime)
    .build();

建立推論後端

使用 LLM Inference API 建立推論後端，並將模型的格式化工具物件傳遞給該後端。FC SDK 格式化工具 (ModelFormatter) 同時具備格式化工具和剖析器的功能。由於本快速入門導覽課程使用 Gemma-3 1B，我們會使用 GemmaFormatter：

var llmInferenceOptions = LlmInferenceOptions.builder()
    .setModelPath(modelFile.getAbsolutePath())
    .build();
var llmInference = LlmInference.createFromOptions(context, llmInferenceOptions);
var llmInferenceBackend = new llmInferenceBackend(llmInference, new GemmaFormatter());

詳情請參閱 LLM 推論設定選項。

將模型例項化

使用 GenerativeModel 物件連結推論後端、系統提示和工具。我們已經有推論後端和工具，因此只需建立系統提示：

var systemInstruction = Content.newBuilder()
      .setRole("system")
      .addParts(Part.newBuilder().setText("You are a helpful assistant."))
      .build();

使用 GenerativeModel 建立模型的例項：

var generativeModel = new GenerativeModel(
    llmInferenceBackend,
    systemInstruction,
    List.of(tool),
)

開始即時通訊工作階段

為了簡化操作，本快速入門會啟動單一即時通訊工作階段。您也可以建立多個獨立的工作階段。

使用 GenerativeModel 的新例項，開始即時通訊工作階段：

var chat = generativeModel.startChat();

使用 sendMessage 方法，透過對話工作階段向模型傳送提示：

var response = chat.sendMessage("How's the weather in San Francisco?");

剖析模型回應

將提示傳遞給模型後，應用程式必須檢查回應，判斷是否要呼叫函式或輸出自然語言文字。

// Extract the model's message from the response.
var message = response.getCandidates(0).getContent().getParts(0);

// If the message contains a function call, execute the function.
if (message.hasFunctionCall()) {
  var functionCall = message.getFunctionCall();
  var args = functionCall.getArgs().getFieldsMap();
  var result = null;

  // Call the appropriate function.
  switch (functionCall.getName()) {
    case "getWeather":
      result = ToolsForLlm.getWeather(args.get("location").getStringValue());
      break;
    case "getTime":
      result = ToolsForLlm.getWeather(args.get("timezone").getStringValue());
      break;
    default:
      throw new Exception("Function does not exist:" + functionCall.getName());
  }
  // Return the result of the function call to the model.
  var functionResponse =
      FunctionResponse.newBuilder()
          .setName(functionCall.getName())
          .setResponse(
              Struct.newBuilder()
                  .putFields("result", Value.newBuilder().setStringValue(result).build()))
          .build();
  var functionResponseContent = Content.newBuilder()
        .setRole("user")
        .addParts(Part.newBuilder().setFunctionResponse(functionResponse))
        .build();
  var response = chat.sendMessage(functionResponseContent);
} else if (message.hasText()) {
  Log.i(message.getText());
}

範例程式碼是過度簡化的實作方式。如要進一步瞭解應用程式如何檢查模型回應，請參閱「格式設定和剖析」。

運作方式

本節將深入探討 Android 適用的 Function Calling SDK 的核心概念和元件。

模型

函式呼叫 SDK 需要包含格式化工具和剖析器的模型。FC SDK 包含下列模型的內建格式化工具和剖析器：

Gemma：使用 GemmaFormatter。
Llama：使用 LlamaFormatter。
鎚子：使用 HammerFormatter。

如要搭配 FC SDK 使用其他模型，您必須自行開發與 LLM 推論 API 相容的格式化工具和剖析器。

格式設定和剖析

函式呼叫支援功能的關鍵部分是提示格式設定和模型輸出的剖析。雖然這兩個步驟各自獨立，但 FC SDK 會透過 ModelFormatter 介面處理格式設定和剖析作業。

格式化器負責將結構化函式宣告轉換為文字、格式化函式回應，以及插入符記，用於指出對話輪次的開始和結束時間，以及這些輪次的角色 (例如「使用者」和「模型」)。

解析器負責偵測模型回應是否包含函式呼叫。如果剖析器偵測到函式呼叫，就會將其剖析為結構化資料類型。否則，系統會將文字視為自然語言回應。

受限解碼

受限解碼是一種技術，可引導大型語言模型的輸出產生作業，確保輸出內容遵循預先定義的結構化格式，例如 JSON 物件或 Python 函式呼叫。透過強制執行這些限制，模型會按照預先定義的函式及其對應的參數類型，調整輸出格式。

如要啟用受限解碼功能，請在 ConstraintOptions 物件中定義限制條件，並叫用 ChatSession 例項的 enableConstraint 方法。啟用這項限制後，回應就會受到限制，只包含與 GenerativeModel 相關聯的工具。

以下範例說明如何設定受限解碼功能，以限制對工具呼叫的回應。它會將工具呼叫限制為以前置字串 ```tool_code\n 開頭，並以後置字串 \n``` 結尾。

ConstraintOptions constraintOptions = ConstraintOptions.newBuilder()
  .setToolCallOnly( ConstraintOptions.ToolCallOnly.newBuilder()
  .setConstraintPrefix("```tool_code\n")
  .setConstraintSuffix("\n```"))
  .build();
chatSession.enableConstraint(constraintOptions);

如要在同一個工作階段內停用有效限制條件，請使用 disableConstraint 方法：

chatSession.disableConstraint();