LiteRT-LM Python API

The LiteRT-LM Python API for Linux, macOS, and Windows supports features such as multimodality, tool use, and GPU and NPU acceleration.

Introduction

Here is an example terminal chat app built with the Python API:

import litert_lm

litert_lm.set_min_log_severity(litert_lm.LogSeverity.ERROR) # Hide log for TUI app

with litert_lm.Engine("path/to/model.litertlm") as engine:
  with engine.create_conversation() as conversation:
    while True:
      user_input = input("\n>>> ")
      for chunk in conversation.send_message_async(user_input):
        print(chunk["content"][0]["text"], end="", flush=True)

Getting started

LiteRT-LM is available as a Python library. You can install the package from PyPI as follows:

# Using pip
pip install litert-lm-api

# Using uv
uv pip install litert-lm-api

Initialize the engine

The `Engine` is the entry point of the API. It handles model loading and resource management. Using `Engine` as a context manager (with the `with` statement) ensures that resources are released promptly.

Note: Initializing the engine can take several seconds while the model loads.

import litert_lm

# Initialize with the model path and optionally specify the backend.
# backend can be Backend.CPU() (default), Backend.GPU() or Backend.NPU().
with litert_lm.Engine(
    "path/to/your/model.litertlm",
    backend=litert_lm.Backend.GPU(),
    # Optional: Pick a writable dir for caching compiled artifacts.
    # cache_dir="/tmp/litert-lm-cache"
) as engine:
    # ... Use the engine to create a conversation ...
    pass
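The release guarantee described above comes from Python's context-manager protocol rather than anything specific to this library. As a minimal sketch, the stand-in class below (`FakeEngine` is hypothetical and not part of `litert_lm`) shows that `__exit__` runs even when the body of the `with` block raises:

```python
class FakeEngine:
    """Hypothetical stand-in for an engine; NOT part of litert_lm."""

    def __init__(self, model_path: str) -> None:
        self.model_path = model_path
        self.released = False

    def __enter__(self) -> "FakeEngine":
        # A real engine would hold loaded model resources at this point.
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Runs on normal exit and also when the with-block raises.
        self.released = True
        return False  # Do not swallow the exception.


def use_engine_and_fail() -> FakeEngine:
    """Raises inside the with-block and returns the engine for inspection."""
    engine = FakeEngine("path/to/model.litertlm")
    try:
        with engine:
            raise RuntimeError("simulated failure during inference")
    except RuntimeError:
        pass
    return engine
```

This is why wrapping the engine in `with` is preferable to relying on manual cleanup: an error in the middle of a chat session still releases the model.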

Create a conversation

A Conversation manages the state and history of your interaction with the model.

# Optional: Configure system instruction and initial messages
messages = [litert_lm.Message.system("You are a helpful assistant.")]

# Create the conversation
with engine.create_conversation(messages=messages) as conversation:
    # ... Interact with the conversation ...
    pass

Sending messages

You can send messages synchronously or asynchronously (streaming).

The send_message and send_message_async methods accept the following inputs:

A str (automatically wrapped as a user message)
A litert_lm.Contents object (for multimodal input)
A litert_lm.Message object (for the full message structure)
A JSON-like dictionary object as prompt template input

Synchronous example:

# Simple string input
response = conversation.send_message("What is the capital of France?")
print(response["content"][0]["text"])

# Or with a Message object
# response = conversation.send_message(litert_lm.Message.user("What is the capital of France?"))

Asynchronous (streaming) example:

# send_message_async returns an iterator of response chunks
stream = conversation.send_message_async("Tell me a long story.")
for chunk in stream:
    # Chunks are dictionaries containing pieces of the response
    for item in chunk.get("content", []):
        if item.get("type") == "text":
            print(item["text"], end="", flush=True)
print()
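When the full reply is needed as a single string (for logging or post-processing) instead of being printed as it arrives, the chunks can be accumulated. The sketch below assumes only the chunk shape used in the streaming example (a dictionary with a "content" list whose items carry "type" and "text"). The helper name `collect_text` is hypothetical and not part of the library.

```python
from typing import Any, Dict, Iterable


def collect_text(chunks: Iterable[Dict[str, Any]]) -> str:
    """Joins the text pieces from an iterable of response chunks.

    Assumes each chunk looks like {"content": [{"type": "text", "text": ...}]},
    as in the streaming example. Non-text items are skipped.
    """
    parts = []
    for chunk in chunks:
        for item in chunk.get("content", []):
            if item.get("type") == "text":
                parts.append(item["text"])
    return "".join(parts)
```

Under these assumptions, `collect_text(conversation.send_message_async("Tell me a long story."))` would return the whole story once the stream is exhausted.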

🔴 New: Multi-Token Prediction (MTP)

Multi-Token Prediction (MTP) is an optimization that significantly speeds up decoding. We recommend using MTP for all tasks on the GPU backend.

To use MTP, enable speculative decoding when initializing the engine:

import litert_lm

# Enable MTP by setting enable_speculative_decoding=True
with litert_lm.Engine(
    "path/to/your/model.litertlm",
    backend=litert_lm.Backend.GPU(),
    enable_speculative_decoding=True,
) as engine:
    with engine.create_conversation() as conversation:
        response = conversation.send_message("What is the capital of France?")
        print(response["content"][0]["text"])
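Since MTP is enabled through `enable_speculative_decoding`, it may help to see the general draft-and-verify idea behind speculative decoding. The toy sketch below is not LiteRT-LM's implementation: the "models" are plain functions over integer tokens, and all names (`greedy_decode`, `speculative_decode`, `toy_target`, `toy_draft`) are hypothetical. A cheap draft proposes `k` tokens, the target checks them, and the longest matching prefix is kept along with the target's correction at the first mismatch.

```python
from typing import Callable, List, Tuple

NextToken = Callable[[List[int]], int]


def greedy_decode(model: NextToken, prompt: List[int], max_new_tokens: int) -> List[int]:
    """Baseline: one model call per generated token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(model(tokens))
    return tokens


def speculative_decode(target: NextToken, draft: NextToken, prompt: List[int],
                       max_new_tokens: int, k: int = 4) -> Tuple[List[int], int]:
    """Greedy draft-and-verify decoding; returns (tokens, verification rounds)."""
    tokens = list(prompt)
    limit = len(prompt) + max_new_tokens
    rounds = 0
    while len(tokens) < limit:
        # 1. The cheap draft proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(tokens + proposal))
        # 2. The target verifies the proposal. A real system scores all k
        #    positions in one batched pass; here that pass is simulated.
        rounds += 1
        accepted: List[int] = []
        for guess in proposal:
            expected = target(tokens + accepted)
            accepted.append(expected)  # Always keep the target's token.
            if guess != expected:
                break  # First mismatch: discard the rest of the proposal.
        tokens.extend(accepted)
    return tokens[:limit], rounds


def toy_target(context: List[int]) -> int:
    """Toy 'large model': counts upward modulo 10."""
    return (context[-1] + 1) % 10


def toy_draft(context: List[int]) -> int:
    """Toy 'small model': agrees with the target except after token 4."""
    return 0 if context[-1] == 4 else (context[-1] + 1) % 10
```

Because every kept token is the target's own choice, the output matches plain greedy decoding with the target. The gain is that several tokens can be confirmed per target pass whenever the draft guesses well.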

Multimodality

# Initialize with vision and/or audio backends if needed
with litert_lm.Engine(
    "path/to/multimodal_model.litertlm",
    audio_backend=litert_lm.Backend.CPU(),
    vision_backend=litert_lm.Backend.GPU(),
) as engine:
    with engine.create_conversation() as conversation:
        response = conversation.send_message(
            litert_lm.Contents.of(
                "Describe this audio.",
                litert_lm.Content.AudioFile(absolute_path="/path/to/audio.wav"),
            )
        )
        print(response["content"][0]["text"])
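The audio example passes the file through a parameter named `absolute_path`, which suggests that relative or `~`-prefixed paths should be resolved before being handed over. A minimal sketch using only the standard library follows. The helper name `to_absolute_path` is hypothetical, and whether the library also accepts relative paths is not stated in this guide.

```python
from pathlib import Path


def to_absolute_path(path: str) -> str:
    """Expands '~' and resolves a possibly relative path to an absolute one."""
    return str(Path(path).expanduser().resolve())
```

The result could then be passed as `litert_lm.Content.AudioFile(absolute_path=to_absolute_path("audio.wav"))`.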

Defining and using tools

You can define Python functions as tools that the model can call automatically.

def add_numbers(a: float, b: float) -> float:
    """Adds two numbers.

    Args:
        a: The first number.
        b: The second number.
    """
    return a + b

# Register the tool in the conversation
tools = [add_numbers]
with engine.create_conversation(tools=tools) as conversation:
    # The model will call add_numbers automatically if it needs to sum values
    response = conversation.send_message("What is 123 + 456?")
    print(response["content"][0]["text"])

LiteRT-LM uses the function's docstring and type hints to generate the tool schema for the model.
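To make the role of the docstring and type hints concrete, here is a rough sketch of how a schema could be derived from a function with the standard `inspect` and `typing` modules. It is an illustration only: the actual schema format that LiteRT-LM produces is not documented here, and `sketch_tool_schema` and the output layout are assumptions.

```python
import inspect
from typing import Any, Callable, Dict, get_type_hints

# Assumed mapping from Python types to JSON-schema-style type names.
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def sketch_tool_schema(func: Callable[..., Any]) -> Dict[str, Any]:
    """Builds an illustrative schema from a function's signature and docstring."""
    hints = get_type_hints(func)
    doc = inspect.getdoc(func) or ""
    summary = doc.splitlines()[0] if doc else ""
    properties: Dict[str, Any] = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Fall back to "string" when a parameter has no recognized hint.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": summary,
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def add_numbers(a: float, b: float) -> float:
    """Adds two numbers.

    Args:
        a: The first number.
        b: The second number.
    """
    return a + b
```

The practical takeaway is that accurate type hints and a clear first docstring line matter, because they are the only description of the tool the model receives.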