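Get started with LiteRT-LM on Android

The LiteRT-LM Kotlin API targets Android and the JVM (Linux, macOS, Windows), and provides features such as GPU and NPU acceleration, multimodality, and tool use.

Introduction

The following is a sample terminal chat application built with the Kotlin API: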

import com.google.ai.edge.litertlm.*

suspend fun main() {
  Engine.setNativeMinLogSeverity(LogSeverity.ERROR) // Hide log for TUI app

  val engineConfig = EngineConfig(modelPath = "/path/to/model.litertlm")
  Engine(engineConfig).use { engine ->
    engine.initialize()

    engine.createConversation().use { conversation ->
      while (true) {
        print("\n>>> ")
        conversation.sendMessageAsync(readln()).collect { print(it) }
      }
    }
  }
}

Animation: demo of the Kotlin sample code

To try the sample above, clone the repository and run it with example/Main.kt:

bazel run -c opt //kotlin/java/com/google/ai/edge/litertlm/example:main -- <abs_model_path>

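To see the available .litertlm models, visit the HuggingFace LiteRT Community. The animation above uses Gemma3-1B-IT.

For an Android sample, see the Google AI Edge Gallery app.

Getting started with Gradle

Although LiteRT-LM is developed with Bazel, Maven packages are provided for Gradle/Maven users.

1. Add the Gradle dependency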

dependencies {
    // For Android
    implementation("com.google.ai.edge.litertlm:litertlm-android:latest.release")

    // For JVM (Linux, MacOS, Windows)
    implementation("com.google.ai.edge.litertlm:litertlm-jvm:latest.release")
}

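You can find the available versions on Google Maven under litertlm-android and litertlm-jvm.

Use latest.release to get the latest version.

2. Initialize the engine

Engine is the entry point to the API. Initialize it with the model path and configuration. Remember to close the engine to release resources.

Note: The engine.initialize() method can take a significant amount of time (for example, up to 10 seconds) to load the model. It is strongly recommended to call it on a background thread or in a coroutine to avoid blocking the UI thread.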

import com.google.ai.edge.litertlm.Backend
import com.google.ai.edge.litertlm.Engine
import com.google.ai.edge.litertlm.EngineConfig

val engineConfig = EngineConfig(
    modelPath = "/path/to/your/model.litertlm", // Replace with your model path
    backend = Backend.GPU(), // Or Backend.NPU(nativeLibraryDir = "...")
    // Optional: Pick a writable dir. This can improve 2nd load time.
    // cacheDir = "/tmp/" or context.cacheDir.path (for Android)
)

val engine = Engine(engineConfig)
engine.initialize()
// ... Use the engine to create a conversation ...

// Close the engine when done
engine.close()
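
The note above recommends keeping engine.initialize() off the UI thread. A minimal sketch of that pattern follows, using only the JDK's CompletableFuture so that it runs without the LiteRT-LM dependency. The helper name initializeInBackground and the sleeping stand-in are hypothetical; in a real app the lambda would construct and initialize the Engine, and on Android a coroutine on a background dispatcher is the more common equivalent.

```kotlin
import java.util.concurrent.CompletableFuture

// Hypothetical helper: runs a slow, blocking initializer on a background
// thread (the common ForkJoinPool) and returns a future for the result.
fun <T> initializeInBackground(createAndInit: () -> T): CompletableFuture<T> =
    CompletableFuture.supplyAsync { createAndInit() }

fun main() {
    val engineFuture = initializeInBackground {
        // In a real app (assumption): Engine(engineConfig).also { it.initialize() }
        Thread.sleep(200) // Stand-in for the slow model load
        "engine-ready"
    }
    println("Caller thread stays responsive while the model loads")
    println(engineFuture.get()) // Blocks here only for demonstration
}
```

In UI code you would typically attach a continuation (for example with thenAccept) rather than call get(), since get() blocks the calling thread.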

On Android, to use the GPU backend, the app must explicitly request the native libraries it depends on by adding the following entries to AndroidManifest.xml inside the <application> tag:

  <application>
    <uses-native-library android:name="libvndksupport.so" android:required="false"/>
    <uses-native-library android:name="libOpenCL.so" android:required="false"/>
  </application>

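To use the NPU backend, you may need to specify the directory that contains the NPU libraries. On Android, if the libraries are bundled with the app, set it to context.applicationInfo.nativeLibraryDir. For more details on the NPU native libraries, see LiteRT-LM NPU.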

val engineConfig = EngineConfig(
    modelPath = modelPath,
    backend = Backend.NPU(nativeLibraryDir = context.applicationInfo.nativeLibraryDir)
)

3. Create a conversation

Once the engine is initialized, create a Conversation instance. You can provide a ConversationConfig to customize its behavior.

import com.google.ai.edge.litertlm.Contents
import com.google.ai.edge.litertlm.ConversationConfig
import com.google.ai.edge.litertlm.Message
import com.google.ai.edge.litertlm.SamplerConfig

// Optional: Configure the system instruction, initial messages, sampling
// parameters, etc.
val conversationConfig = ConversationConfig(
    systemInstruction = Contents.of("You are a helpful assistant."),
    initialMessages = listOf(
        Message.user("What is the capital city of the United States?"),
        Message.model("Washington, D.C."),
    ),
    samplerConfig = SamplerConfig(topK = 10, topP = 0.95, temperature = 0.8),
)

val conversation = engine.createConversation(conversationConfig)
// Or with default config:
// val conversation = engine.createConversation()

// ... Use the conversation ...

// Close the conversation when done
conversation.close()

Conversation implements AutoCloseable, so you can use a use block for automatic resource management of one-off or short-lived conversations:

engine.createConversation(conversationConfig).use { conversation ->
    // Interact with the conversation
}

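4. Send messages

You can send messages in three ways:

sendMessage(contents): Message: A synchronous call that blocks until the model returns the complete response. This suits basic request/response interactions.
sendMessageAsync(contents, callback): An asynchronous call for streaming responses. This is a better fit for long-running requests or when you want to display the response as it is generated.
sendMessageAsync(contents): Flow<Message>: An asynchronous call that returns a Kotlin Flow for streaming responses. This is the recommended approach for coroutine users.

Synchronous example: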

import com.google.ai.edge.litertlm.Content
import com.google.ai.edge.litertlm.Message

print(conversation.sendMessage("What is the capital of France?"))

Asynchronous example (with callback):

Use sendMessageAsync to send a message to the model and receive the response through a callback.

import com.google.ai.edge.litertlm.Content
import com.google.ai.edge.litertlm.Message
import com.google.ai.edge.litertlm.MessageCallback
import java.util.concurrent.CountDownLatch
import java.util.concurrent.TimeUnit

val callback = object : MessageCallback {
    override fun onMessage(message: Message) {
        print(message)
    }

    override fun onDone() {
        // Streaming completed
    }

    override fun onError(throwable: Throwable) {
        // Error during streaming
    }
}

conversation.sendMessageAsync("What is the capital of France?", callback)
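
The sample above imports CountDownLatch and TimeUnit, which suggests blocking the caller until streaming finishes (useful in a command-line program or a test). A hedged sketch of that pattern follows. To stay runnable without the LiteRT-LM dependency, it uses a hypothetical StreamCallback interface that mirrors the three MessageCallback methods and a hypothetical fakeSendMessageAsync that streams chunks from a background thread; neither is part of the library.

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.TimeUnit

// Hypothetical stand-in with the same three methods as MessageCallback,
// so the sketch compiles without the LiteRT-LM dependency.
interface StreamCallback {
    fun onMessage(chunk: String)
    fun onDone()
    fun onError(throwable: Throwable)
}

// Hypothetical fake model: streams chunks from a background thread.
fun fakeSendMessageAsync(chunks: List<String>, callback: StreamCallback) {
    Thread {
        try {
            chunks.forEach { callback.onMessage(it) }
            callback.onDone()
        } catch (t: Throwable) {
            callback.onError(t)
        }
    }.start()
}

// Blocks until streaming ends or the timeout expires; returns the full
// reply, or null on error or timeout.
fun collectBlocking(chunks: List<String>, timeoutSeconds: Long = 5): String? {
    val done = CountDownLatch(1)
    val reply = StringBuilder()
    var failed = false
    fakeSendMessageAsync(chunks, object : StreamCallback {
        override fun onMessage(chunk: String) { reply.append(chunk) }
        override fun onDone() = done.countDown()
        override fun onError(throwable: Throwable) {
            failed = true
            done.countDown()
        }
    })
    val finished = done.await(timeoutSeconds, TimeUnit.SECONDS)
    return if (finished && !failed) reply.toString() else null
}

fun main() {
    println(collectBlocking(listOf("Par", "is"))) // prints "Paris"
}
```

With the real API, the same latch would be counted down in onDone and onError of a MessageCallback. Avoid blocking like this on a UI thread.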

Asynchronous example (with Flow):

Use sendMessageAsync (without the callback argument) to send a message to the model and receive the response through a Kotlin Flow.

import com.google.ai.edge.litertlm.Content
import com.google.ai.edge.litertlm.Message
import kotlinx.coroutines.flow.catch
import kotlinx.coroutines.launch

// Within a coroutine scope
conversation.sendMessageAsync("What is the capital of France?")
    .catch { ... } // Error during streaming
    .collect { print(it.toString()) }

5. Multimodality

A Message object can contain different types of Content, including Text, ImageBytes, ImageFile, AudioBytes, and AudioFile.

// Initialize the `visionBackend` and/or the `audioBackend`
val engineConfig = EngineConfig(
    modelPath = "/path/to/your/model.litertlm", // Replace with your model path
    backend = Backend.CPU(), // Or Backend.GPU() or Backend.NPU(...)
    visionBackend = Backend.GPU(), // Or Backend.NPU(...)
    audioBackend = Backend.CPU(), // Or Backend.NPU(...)
)

// Sends a message with multi-modality.
// See the Content class for other variants.
conversation.sendMessage(Contents.of(
    Content.ImageFile("/path/to/image"),
    Content.AudioBytes(audioBytes), // ByteArray of the audio
    Content.Text("Describe this image and audio."),
))

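6. Define and use tools

There are two ways to define tools:

With Kotlin functions (recommended for most cases)
With an OpenAPI specification (full control over the tool specification and execution)

Define tools with Kotlin functions

You can define custom Kotlin functions as tools that the model can call to perform actions or fetch information.

Create a class that implements ToolSet, annotate its methods with @Tool, and annotate their parameters with @ToolParam.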

import com.google.ai.edge.litertlm.Tool
import com.google.ai.edge.litertlm.ToolParam
import com.google.ai.edge.litertlm.ToolSet

class SampleToolSet: ToolSet {
    @Tool(description = "Get the current weather for a city")
    fun getCurrentWeather(
        @ToolParam(description = "The city name, e.g., San Francisco") city: String,
        @ToolParam(description = "Optional country code, e.g., US") country: String? = null,
        @ToolParam(description = "Temperature unit (celsius or fahrenheit). Default: celsius") unit: String = "celsius"
    ): Map<String, Any> {
        // In a real application, you would call a weather API here
        return mapOf("temperature" to 25, "unit" to unit, "condition" to "Sunny")
    }

    @Tool(description = "Get the sum of a list of numbers.")
    fun sum(
        @ToolParam(description = "The numbers, could be floating point.") numbers: List<Double>,
    ): Double {
        return numbers.sum()
    }
}

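Under the hood, the API inspects these annotations and the function signatures to generate an OpenAPI-style schema. This schema describes the tool's functionality, its parameters (including the types and descriptions from @ToolParam), and its return type to the language model.

Parameter types

Parameters annotated with @ToolParam can be of type String, Int, Boolean, Float, Double, or a List of these types (for example, List<String>). Use a nullable type (for example, String?) to indicate a nullable parameter. Set a default value to indicate that a parameter is optional, and mention the default value in the @ToolParam description.

Return types

A tool function can return any Kotlin type. The result is converted to a JSON element before being sent back to the model.

List types are converted to JSON arrays.
Map types are converted to JSON objects.
Primitive types (String, Number, Boolean) are converted to the corresponding JSON primitives.
Other types are converted to strings with the toString() method.

For structured data, returning a Map or a data class that is converted to a JSON object is recommended.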
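
The conversion rules above explicitly state that a Map becomes a JSON object. One conservative way to return structured data is therefore to keep a typed data class in your own code and convert it to a Map at the tool boundary. The sketch below illustrates this; WeatherReport and toToolResult are hypothetical names, not part of the API.

```kotlin
// Hypothetical result type for a weather tool.
data class WeatherReport(val temperature: Int, val unit: String, val condition: String) {
    // Converts to a Map, which the rules above describe as becoming a JSON object.
    fun toToolResult(): Map<String, Any> = mapOf(
        "temperature" to temperature,
        "unit" to unit,
        "condition" to condition,
    )
}

fun main() {
    val result = WeatherReport(25, "celsius", "Sunny").toToolResult()
    println(result) // prints {temperature=25, unit=celsius, condition=Sunny}
}
```

A tool function annotated with @Tool could then return the value of toToolResult() directly.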

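Define tools with an OpenAPI specification

Alternatively, you can define a tool by implementing the OpenApiTool class and providing the tool description as a JSON string that conforms to the OpenAPI specification. This is useful if you already have an OpenAPI schema for the tool or need fine-grained control over the tool definition.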

import com.google.ai.edge.litertlm.OpenApiTool

class SampleOpenApiTool : OpenApiTool {

    override fun getToolDescriptionJsonString(): String {
        return """
        {
          "name": "addition",
          "description": "Add all numbers.",
          "parameters": {
            "type": "object",
            "properties": {
              "numbers": {
                "type": "array",
                "items": {
                  "type": "number"
                },
                "description": "The list of numbers to sum."
              }
            },
            "required": [
              "numbers"
            ]
          }
        }
        """.trimIndent() // Tip: trim to save tokens
    }

    override fun execute(paramsJsonString: String): String {
        // Parse paramsJsonString with your choice of parser/deserializer and
        // execute the tool.

        // Return the result as a JSON string
        return """{"result": 1.4142}"""
    }
}

Register tools

Add the tool instances to ConversationConfig.

val conversation = engine.createConversation(
    ConversationConfig(
        tools = listOf(
            tool(SampleToolSet()),
            tool(SampleOpenApiTool()),
        ),
        // ... other configs
    )
)

// Send messages that might trigger the tool
conversation.sendMessageAsync("What's the weather like in London?", callback)

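The model decides when to call a tool based on the conversation. The tool execution results are automatically sent back to the model to generate the final response.

Manual tool calling

By default, tool calls generated by the model are executed automatically by LiteRT-LM, and the results are automatically sent back to the model to generate the next response.

To execute tools manually and send the results back to the model yourself, set automaticToolCalling to false in ConversationConfig.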

val conversation = engine.createConversation(
    ConversationConfig(
        tools = listOf(
            tool(SampleOpenApiTool()),
        ),
        automaticToolCalling = false,
    )
)

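With automatic tool calling disabled, you must execute the tools manually in your application code and send the results back to the model. When automaticToolCalling is set to false, the execute method of OpenApiTool is not called automatically.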

// Send a message that triggers a tool call.
val responseMessage = conversation.sendMessage("What's the weather like in London?")

// The model returns a Message with `toolCalls` populated.
if (responseMessage.toolCalls.isNotEmpty()) {
    val toolResponses = mutableListOf<Content.ToolResponse>()
    // There can be multiple tool calls in a single response.
    for (toolCall in responseMessage.toolCalls) {
        println("Model wants to call: ${toolCall.name} with arguments: ${toolCall.arguments}")

        // Execute the tool manually with your own logic. `executeTool` is just an example here.
        val toolResponseJson = executeTool(toolCall.name, toolCall.arguments)

        // Collect tool responses.
        toolResponses.add(Content.ToolResponse(toolCall.name, toolResponseJson))
    }

    // Use Message.tool to create the tool response message.
    val toolResponseMessage = Message.tool(Contents.of(toolResponses))

    // Send the tool response message to the model.
    val finalMessage = conversation.sendMessage(toolResponseMessage)
    println("Final answer: ${finalMessage.text}") // e.g., "The weather in London is 25c."
}
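
The executeTool call in the sample above is left to the application. A minimal sketch of such a dispatcher follows, under two loudly stated assumptions: that the tool arguments can be treated as a Map<String, Any?>, and that the tool response is passed back as a JSON string. Check both against the actual types of toolCall.arguments and Content.ToolResponse before relying on this shape.

```kotlin
// Hypothetical dispatcher for manual tool calling. Assumes arguments arrive
// as a Map<String, Any?> and the result is returned as a JSON string.
fun executeTool(name: String, arguments: Map<String, Any?>): String = when (name) {
    "addition" -> {
        // Accept any numeric element; ignore anything that is not a number.
        val numbers = (arguments["numbers"] as? List<*>).orEmpty()
            .filterIsInstance<Number>()
        """{"result": ${numbers.sumOf { it.toDouble() }}}"""
    }
    // Report unknown tools to the model instead of throwing.
    else -> """{"error": "Unknown tool: $name"}"""
}

fun main() {
    println(executeTool("addition", mapOf("numbers" to listOf(1, 2.5)))) // prints {"result": 3.5}
    println(executeTool("weather", emptyMap())) // prints {"error": "Unknown tool: weather"}
}
```

Returning an error object for an unknown tool name, rather than throwing, gives the model a chance to recover in its next reply. Real code should build the JSON with a proper serializer so that names and values are escaped correctly.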

Example

To try out tool use, clone the repository and run it with example/ToolMain.kt:

bazel run -c opt //kotlin/java/com/google/ai/edge/litertlm/example:tool -- <abs_model_path>

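Error handling

API methods can throw LiteRtLmJniException for errors from the native layer, or standard Kotlin exceptions such as IllegalStateException for lifecycle issues. Always wrap API calls in try-catch blocks. The onError callback in MessageCallback also reports errors that occur during asynchronous operations.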
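
The try-catch advice above can be sketched as a small wrapper that turns exceptions into values. CallResult and safeCall are hypothetical names, and the block runs without the LiteRT-LM dependency: it catches IllegalStateException for lifecycle problems and falls back to Exception for everything else, on the assumption that LiteRtLmJniException is an Exception subtype. In real code you would catch LiteRtLmJniException in its own branch.

```kotlin
// Hypothetical result wrapper so callers can handle failures as values.
sealed interface CallResult {
    data class Ok(val text: String) : CallResult
    data class Failed(val reason: String) : CallResult
}

// Wraps a blocking API call such as conversation.sendMessage(...).
fun safeCall(block: () -> String): CallResult =
    try {
        CallResult.Ok(block())
    } catch (e: IllegalStateException) {
        // Lifecycle problems, for example using an engine after close()
        CallResult.Failed("lifecycle: ${e.message}")
    } catch (e: Exception) {
        // Assumption: native-layer errors (LiteRtLmJniException) are Exception
        // subtypes; catch that type explicitly in real code.
        CallResult.Failed("native or other: ${e.message}")
    }

fun main() {
    println(safeCall { "Paris" }) // prints Ok(text=Paris)
    println(safeCall { error("Engine is closed") }) // prints Failed(reason=lifecycle: Engine is closed)
}
```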