Get started with LiteRT-LM on Android

The LiteRT-LM Kotlin API for Android and JVM (Linux, macOS, Windows), with features such as GPU and NPU acceleration, multi-modality, and tool use.

Introduction

Here is a sample terminal chat app built with the Kotlin API:

import com.google.ai.edge.litertlm.*

suspend fun main() {
  Engine.setNativeMinLogSeverity(LogSeverity.ERROR) // Hide log for TUI app

  val engineConfig = EngineConfig(modelPath = "/path/to/model.litertlm")
  Engine(engineConfig).use { engine ->
    engine.initialize()

    engine.createConversation().use { conversation ->
      while (true) {
        print("\n>>> ")
        conversation.sendMessageAsync(readln()).collect { print(it) }
      }
    }
  }
}

Demo of the Kotlin sample code

To try the sample above, clone the repository and run example/Main.kt:

bazel run -c opt //kotlin/java/com/google/ai/edge/litertlm/example:main -- <abs_model_path>

Ready-to-use .litertlm models are available in the HuggingFace LiteRT Community. The animation above uses the Gemma3-1B-IT model.

For an Android sample, see the Google AI Edge Gallery app.

Getting started with Gradle

Although LiteRT-LM is developed with Bazel, we provide Maven packages for Gradle/Maven users.

1. Add the Gradle dependency

dependencies {
    // For Android
    implementation("com.google.ai.edge.litertlm:litertlm-android:latest.release")

    // For JVM (Linux, MacOS, Windows)
    implementation("com.google.ai.edge.litertlm:litertlm-jvm:latest.release")
}

You can find the available versions on Google Maven at litertlm-android and litertlm-jvm.

Use latest.release to get the latest release.

2. Initialize the Engine

Engine is the entry point to the API. Initialize it with the model path and configuration. Remember to close the engine to release its resources.

Note: The engine.initialize() method can take a long time to load the model (for example, up to 10 seconds). We strongly recommend calling it on a background thread or in a coroutine to avoid blocking the UI thread.

import com.google.ai.edge.litertlm.Backend
import com.google.ai.edge.litertlm.Engine
import com.google.ai.edge.litertlm.EngineConfig

val engineConfig = EngineConfig(
    modelPath = "/path/to/your/model.litertlm", // Replace with your model path
    backend = Backend.GPU(), // Or Backend.NPU(nativeLibraryDir = "...")
    // Optional: Pick a writable dir. This can improve 2nd load time.
    // cacheDir = "/tmp/" or context.cacheDir.path (for Android)
)

val engine = Engine(engineConfig)
engine.initialize()
// ... Use the engine to create a conversation ...

// Close the engine when done
engine.close()
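The earlier note recommends running engine.initialize() off the UI thread. The sketch below shows one way to do that with a plain Java executor. SlowResource and initializeInBackground are hypothetical stand-ins invented for this illustration, not LiteRT-LM classes; in a real app the worker would call engine.initialize() instead.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.Executors
import java.util.concurrent.Future

// Hypothetical stand-in for a slow-to-initialize object such as Engine.
// It is NOT part of LiteRT-LM; it only simulates a blocking load.
class SlowResource : AutoCloseable {
    @Volatile
    var ready = false
        private set

    fun initialize() {
        Thread.sleep(200) // Simulate a long model load.
        ready = true
    }

    override fun close() {
        ready = false
    }
}

// Starts initialization on a worker thread and returns immediately,
// so the calling (for example, UI) thread is never blocked by the load.
fun initializeInBackground(resource: SlowResource): Future<SlowResource> {
    val executor = Executors.newSingleThreadExecutor()
    val future = executor.submit(Callable {
        resource.initialize()
        resource
    })
    executor.shutdown() // Already-submitted work still runs to completion.
    return future
}
```

The caller can keep the returned Future and call get() only at the point where the initialized object is actually needed. Coroutine users would more likely wrap the call in a background dispatcher instead.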

On Android, to use the GPU backend, the app must explicitly request the native libraries it depends on by adding the following to AndroidManifest.xml inside the <application> tag:

  <application>
    <uses-native-library android:name="libvndksupport.so" android:required="false"/>
    <uses-native-library android:name="libOpenCL.so" android:required="false"/>
  </application>

To use the NPU backend, you may need to specify the directory that contains the NPU libraries. On Android, if the libraries are bundled with your app, set it to context.applicationInfo.nativeLibraryDir. For more details about the NPU native libraries, see LiteRT-LM NPU.

val engineConfig = EngineConfig(
    modelPath = modelPath,
    backend = Backend.NPU(nativeLibraryDir = context.applicationInfo.nativeLibraryDir)
)

3. Create a Conversation

Once the engine is initialized, create a Conversation instance. You can provide a ConversationConfig to customize its behavior.

import com.google.ai.edge.litertlm.Contents
import com.google.ai.edge.litertlm.ConversationConfig
import com.google.ai.edge.litertlm.Message
import com.google.ai.edge.litertlm.SamplerConfig

// Optional: Configure the system instruction, initial messages, sampling
// parameters, etc.
val conversationConfig = ConversationConfig(
    systemInstruction = Contents.of("You are a helpful assistant."),
    initialMessages = listOf(
        Message.user("What is the capital city of the United States?"),
        Message.model("Washington, D.C."),
    ),
    samplerConfig = SamplerConfig(topK = 10, topP = 0.95, temperature = 0.8),
)

val conversation = engine.createConversation(conversationConfig)
// Or with default config:
// val conversation = engine.createConversation()

// ... Use the conversation ...

// Close the conversation when done
conversation.close()

Conversation implements AutoCloseable, so you can use a use block for automatic resource management with one-shot or short-lived conversations.

engine.createConversation(conversationConfig).use { conversation ->
    // Interact with the conversation
}
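As a small illustration of why the use block is convenient, the sketch below shows that close() runs even when the body throws. TrackedResource and runWithUse are hypothetical names made up for this example; they stand in for any AutoCloseable such as Conversation.

```kotlin
// Hypothetical stand-in for a closeable object such as Conversation.
class TrackedResource : AutoCloseable {
    var closed = false
        private set

    override fun close() {
        closed = true
    }
}

// Runs `block` inside `use` and reports whether it threw an IllegalStateException.
// Either way, `use` has already closed the resource by the time this returns.
fun runWithUse(resource: TrackedResource, block: (TrackedResource) -> Unit): Boolean =
    try {
        resource.use(block)
        false
    } catch (e: IllegalStateException) {
        true
    }
```

With a manual close() call, the same guarantee would need an explicit try/finally around every use site.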

4. Sending messages

There are three ways to send messages:

sendMessage(contents): Message: A synchronous call that blocks until the model returns a complete response. This is simpler for basic request/response interactions.
sendMessageAsync(contents, callback): An asynchronous call for streaming responses. This is better for long-running requests or when you want to display the response as it is being generated.
sendMessageAsync(contents): Flow<Message>: An asynchronous call that returns a Kotlin Flow for streaming responses. This is the recommended approach for coroutine users.

Synchronous example:

import com.google.ai.edge.litertlm.Content
import com.google.ai.edge.litertlm.Message

print(conversation.sendMessage("What is the capital of France?"))

Asynchronous example with a callback:

Use sendMessageAsync to send a message to the model and receive the response through a callback.

import com.google.ai.edge.litertlm.Content
import com.google.ai.edge.litertlm.Message
import com.google.ai.edge.litertlm.MessageCallback
import java.util.concurrent.CountDownLatch
import java.util.concurrent.TimeUnit

val callback = object : MessageCallback {
    override fun onMessage(message: Message) {
        print(message)
    }

    override fun onDone() {
        // Streaming completed
    }

    override fun onError(throwable: Throwable) {
        // Error during streaming
    }
}

conversation.sendMessageAsync("What is the capital of France?", callback)
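The callback sample imports CountDownLatch and TimeUnit without using them. One plausible use, sketched below, is to block a command-line program until streaming finishes. StreamCallback, streamChunks, and collectBlocking are hypothetical stand-ins that only mirror the shape of MessageCallback; they are not LiteRT-LM APIs.

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.TimeUnit
import kotlin.concurrent.thread

// Hypothetical interface mirroring the three methods of MessageCallback.
interface StreamCallback {
    fun onMessage(chunk: String)
    fun onDone()
    fun onError(throwable: Throwable)
}

// Hypothetical producer: emits chunks on a background thread, then signals completion.
fun streamChunks(chunks: List<String>, callback: StreamCallback) {
    thread {
        try {
            chunks.forEach(callback::onMessage)
            callback.onDone()
        } catch (t: Throwable) {
            callback.onError(t)
        }
    }
}

// Blocks the caller until the stream finishes, fails, or the timeout elapses.
// Returns the concatenated text, or null on error or timeout.
fun collectBlocking(chunks: List<String>, timeoutSeconds: Long = 5): String? {
    val latch = CountDownLatch(1)
    val text = StringBuffer() // Thread-safe; written by the producer thread.
    var failed = false
    streamChunks(chunks, object : StreamCallback {
        override fun onMessage(chunk: String) {
            text.append(chunk)
        }

        override fun onDone() = latch.countDown()

        override fun onError(throwable: Throwable) {
            failed = true
            latch.countDown()
        }
    })
    val finished = latch.await(timeoutSeconds, TimeUnit.SECONDS)
    return if (finished && !failed) text.toString() else null
}
```

Counting the latch down in both onDone and onError matters: otherwise a failed stream would leave the caller waiting for the full timeout.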

Asynchronous example with Flow:

Use sendMessageAsync (without the callback argument) to send a message to the model and receive the response through a Kotlin Flow.

import com.google.ai.edge.litertlm.Content
import com.google.ai.edge.litertlm.Message
import kotlinx.coroutines.flow.catch
import kotlinx.coroutines.launch

// Within a coroutine scope
conversation.sendMessageAsync("What is the capital of France?")
    .catch { ... } // Error during streaming
    .collect { print(it.toString()) }

5. Multi-modality

Message objects can contain different types of Content, including Text, ImageBytes, ImageFile, AudioBytes, and AudioFile.

// Initialize the `visionBackend` and/or the `audioBackend`
val engineConfig = EngineConfig(
    modelPath = "/path/to/your/model.litertlm", // Replace with your model path
    backend = Backend.CPU(), // Or Backend.GPU() or Backend.NPU(...)
    visionBackend = Backend.GPU(), // Or Backend.NPU(...)
    audioBackend = Backend.CPU(), // Or Backend.NPU(...)
)

// Sends a message with multi-modality.
// See the Content class for other variants.
conversation.sendMessage(Contents.of(
    Content.ImageFile("/path/to/image"),
    Content.AudioBytes(audioBytes), // ByteArray of the audio
    Content.Text("Describe this image and audio."),
))

6. Defining and using tools

There are two ways to define tools:

With Kotlin functions (recommended for most cases)
With an OpenAPI specification (full control over the tool's specification and execution)

Defining tools with Kotlin functions

You can define custom Kotlin functions as tools that the model can call to perform actions or fetch information.

Create a class that implements ToolSet, and annotate its methods with @Tool and their parameters with @ToolParam.

import com.google.ai.edge.litertlm.Tool
import com.google.ai.edge.litertlm.ToolParam
import com.google.ai.edge.litertlm.ToolSet

class SampleToolSet: ToolSet {
    @Tool(description = "Get the current weather for a city")
    fun getCurrentWeather(
        @ToolParam(description = "The city name, e.g., San Francisco") city: String,
        @ToolParam(description = "Optional country code, e.g., US") country: String? = null,
        @ToolParam(description = "Temperature unit (celsius or fahrenheit). Default: celsius") unit: String = "celsius"
    ): Map<String, Any> {
        // In a real application, you would call a weather API here
        return mapOf("temperature" to 25, "unit" to unit, "condition" to "Sunny")
    }

    @Tool(description = "Get the sum of a list of numbers.")
    fun sum(
        @ToolParam(description = "The numbers, could be floating point.") numbers: List<Double>,
    ): Double {
        return numbers.sum()
    }
}

Under the hood, the API inspects these annotations and the function signatures to generate an OpenAPI-style schema. This schema describes the tool's functionality, its parameters (including their types and the descriptions from @ToolParam), and its return type to the language model.

Parameter types

A parameter annotated with @ToolParam can be of type String, Int, Boolean, Float, Double, or a List of these types (for example, List<String>). Use a nullable type (for example, String?) to indicate a nullable parameter. Set a default value to indicate that the parameter is optional, and mention the default value in the @ToolParam description.

Return type

A tool function can return any Kotlin type. The result is converted to a JSON element before it is sent back to the model.

List types are converted to a JSON array.
Map types are converted to a JSON object.
Primitive types (String, Number, Boolean) are converted to the corresponding JSON primitive.
Other types are converted to a string with the toString() method.

For structured data, returning a Map or a data class that will be converted to a JSON object is recommended.
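To make the conversion rules above concrete, here is a minimal sketch that applies them in plain Kotlin. It only illustrates the rules as stated; it is not the library's actual converter, and toJson and escapeJson are names invented for this example.

```kotlin
// Escapes the characters that must not appear raw inside a JSON string.
fun escapeJson(s: String): String = buildString {
    for (c in s) {
        when (c) {
            '"' -> append("\\\"")
            '\\' -> append("\\\\")
            '\n' -> append("\\n")
            '\r' -> append("\\r")
            '\t' -> append("\\t")
            else -> if (c < ' ') append("\\u%04x".format(c.code)) else append(c)
        }
    }
}

// Illustrative conversion following the rules described in the text:
// List -> array, Map -> object, primitives -> JSON primitives,
// everything else -> its toString() as a JSON string.
fun toJson(value: Any?): String = when (value) {
    null -> "null"
    is String -> "\"${escapeJson(value)}\""
    is Number, is Boolean -> value.toString()
    is List<*> -> value.joinToString(",", "[", "]") { toJson(it) }
    is Map<*, *> -> value.entries.joinToString(",", "{", "}") { e ->
        "\"${escapeJson(e.key.toString())}\":${toJson(e.value)}"
    }
    else -> "\"${escapeJson(value.toString())}\""
}
```

Under these rules the Map returned by getCurrentWeather becomes a JSON object with one field per key, while a value of any other class collapses to a single string. That difference is the reason a Map is the safer choice for structured results.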

Defining tools with an OpenAPI specification

Alternatively, you can define a tool by implementing the OpenApiTool class and providing the tool's description as a JSON string that conforms to the OpenAPI specification. This is useful if you already have an OpenAPI schema for your tool, or if you need fine-grained control over the tool's definition.

import com.google.ai.edge.litertlm.OpenApiTool

class SampleOpenApiTool : OpenApiTool {

    override fun getToolDescriptionJsonString(): String {
        return """
        {
          "name": "addition",
          "description": "Add all numbers.",
          "parameters": {
            "type": "object",
            "properties": {
              "numbers": {
                "type": "array",
                "items": {
                  "type": "number"
                },
                "description": "The list of numbers to sum."
              }
            },
            "required": [
              "numbers"
            ]
          }
        }
        """.trimIndent() // Tip: trim to save tokens
    }

    override fun execute(paramsJsonString: String): String {
        // Parse paramsJsonString with your choice of parser/deserializer and
        // execute the tool.

        // Return the result as a JSON string
        return """{"result": 1.4142}"""
    }
}

Registering tools

Include instances of your tools in ConversationConfig.

val conversation = engine.createConversation(
    ConversationConfig(
        tools = listOf(
            tool(SampleToolSet()),
            tool(SampleOpenApiTool()),
        ),
        // ... other configs
    )
)

// Send messages that might trigger the tool
conversation.sendMessageAsync("What's the weather like in London?", callback)

The model decides when to call a tool based on the conversation. The result of the tool execution is automatically sent back to the model to generate the final response.

Manual tool calling

By default, LiteRT-LM automatically executes the tool calls that the model generates and sends the results back to the model to generate the next response.

If you want to execute tools manually and send the results back to the model yourself, set automaticToolCalling to false in ConversationConfig.

val conversation = engine.createConversation(
    ConversationConfig(
        tools = listOf(
            tool(SampleOpenApiTool()),
        ),
        automaticToolCalling = false,
    )
)

If automatic tool calling is disabled, you must execute the tools yourself and send the results back to the model in your application code. The execute method of OpenApiTool is not called automatically when automaticToolCalling is set to false.

// Send a message that triggers a tool call.
val responseMessage = conversation.sendMessage("What's the weather like in London?")

// The model returns a Message with `toolCalls` populated.
if (responseMessage.toolCalls.isNotEmpty()) {
    val toolResponses = mutableListOf<Content.ToolResponse>()
    // There can be multiple tool calls in a single response.
    for (toolCall in responseMessage.toolCalls) {
        println("Model wants to call: ${toolCall.name} with arguments: ${toolCall.arguments}")

        // Execute the tool manually with your own logic. `executeTool` is just an example here.
        val toolResponseJson = executeTool(toolCall.name, toolCall.arguments)

        // Collect tool responses.
        toolResponses.add(Content.ToolResponse(toolCall.name, toolResponseJson))
    }

    // Use Message.tool to create the tool response message.
    val toolResponseMessage = Message.tool(Contents.of(toolResponses))

    // Send the tool response message to the model.
    val finalMessage = conversation.sendMessage(toolResponseMessage)
    println("Final answer: ${finalMessage.text}") // e.g., "The weather in London is 25c."
}
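The executeTool helper in the snippet above is left to the application. A minimal sketch of such a dispatcher might look like the following. The parameter types are assumptions: the text does not state the type of toolCall.arguments, so a Map<String, Any?> is used here purely for illustration, and the "addition" tool name is borrowed from SampleOpenApiTool.

```kotlin
// Hypothetical dispatcher for manual tool calling.
// ASSUMPTION: arguments arrive as a Map; adapt this to the real type of
// `toolCall.arguments` in your version of the library.
fun executeTool(name: String, arguments: Map<String, Any?>): String = when (name) {
    "addition" -> {
        // Accept any numeric element and ignore anything that is not a number.
        val numbers = (arguments["numbers"] as? List<*>).orEmpty()
            .mapNotNull { (it as? Number)?.toDouble() }
        """{"result": ${numbers.sum()}}"""
    }
    // Report unknown tools back to the model instead of throwing.
    else -> """{"error": "unknown tool: $name"}"""
}
```

Returning an error object for unknown names, rather than throwing, gives the model a chance to recover in its next turn.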

Example

To try tool use, clone the repository and run example/ToolMain.kt:

bazel run -c opt //kotlin/java/com/google/ai/edge/litertlm/example:tool -- <abs_model_path>

Error handling

API methods can throw LiteRtLmJniException for errors from the native layer, or standard Kotlin exceptions such as IllegalStateException for lifecycle issues. Always wrap API calls in try-catch blocks. The onError callback in MessageCallback also reports errors that occur during asynchronous operations.
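One way to keep the try-catch advice above from spreading through application code is a small wrapper that turns exceptions into values. The sketch below uses only standard Kotlin exceptions; CallResult and guarded are hypothetical names, and a real app would presumably add a dedicated catch for LiteRtLmJniException.

```kotlin
// Outcome of a guarded call: either a value or a categorized failure.
sealed class CallResult<out T> {
    data class Success<T>(val value: T) : CallResult<T>()
    data class LifecycleError(val message: String) : CallResult<Nothing>()
    data class OtherError(val cause: Throwable) : CallResult<Nothing>()
}

// Runs `call` and maps exceptions to a result instead of letting them propagate.
fun <T> guarded(call: () -> T): CallResult<T> =
    try {
        CallResult.Success(call())
    } catch (e: IllegalStateException) {
        // Lifecycle problems, for example using an object after it was closed.
        CallResult.LifecycleError(e.message ?: "illegal state")
    } catch (e: Exception) {
        // Everything else; native-layer exceptions would be handled here
        // or in a more specific catch block.
        CallResult.OtherError(e)
    }
```

A call site could then write guarded { conversation.sendMessage(...) } and branch on the result with a when expression.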