LiteRT-LM Python API

The Python API of LiteRT-LM for Linux and macOS (Windows support is coming soon). Features such as multimodality, tool use, and GPU acceleration are supported.

Introduction

An example terminal chat app built with the Python API:

import litert_lm

litert_lm.set_min_log_severity(litert_lm.LogSeverity.ERROR) # Hide log for TUI app

with litert_lm.Engine("path/to/model.litertlm") as engine:
  with engine.create_conversation() as conversation:
    while True:
      user_input = input("\n>>> ")
      for chunk in conversation.send_message_async(user_input):
        print(chunk["content"][0]["text"], end="", flush=True)

Getting started

LiteRT-LM is available as a Python library. You can install the nightly version from PyPI:

# Using pip
pip install litert-lm-api-nightly

# Using uv
uv pip install litert-lm-api-nightly

Initializing the Engine

The Engine is the entry point to the API. It handles model loading and resource management. Using it as a context manager (in a with statement) ensures that native resources are released promptly.

Note: Initializing the engine can take a few seconds while the model loads.

import litert_lm

# Initialize with the model path and optionally specify the backend.
# backend can be Backend.CPU (default) or Backend.GPU.
with litert_lm.Engine(
    "path/to/your/model.litertlm",
    backend=litert_lm.Backend.GPU,
    # Optional: Pick a writable dir for caching compiled artifacts.
    # cache_dir="/tmp/litert-lm-cache"
) as engine:
    # ... Use the engine to create a conversation ...
    pass
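
The release guarantee of the with statement can be illustrated without loading a model. The sketch below uses a hypothetical stand-in class, FakeEngine, which is not part of LiteRT-LM; it only mimics the context-manager protocol to show that cleanup still runs when the body raises.

```python
class FakeEngine:
    """Hypothetical stand-in for an engine; not part of LiteRT-LM."""

    def __init__(self, model_path: str):
        self.model_path = model_path
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs on normal exit and when the body raises.
        self.closed = True
        return False  # Do not swallow the exception.


def run_and_fail(engine: FakeEngine) -> None:
    with engine:
        raise RuntimeError("inference failed")


engine = FakeEngine("path/to/model.litertlm")
try:
    run_and_fail(engine)
except RuntimeError:
    pass
print(engine.closed)
```

Without the with statement, the same failure would skip any cleanup placed after the failing call unless it were wrapped in try/finally by hand.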

Creating a conversation

A Conversation manages the state and history of your interaction with the model.

# Optional: Configure system instruction and initial messages
messages = [
    {"role": "system", "content": [{"type": "text", "text": "You are a helpful assistant."}]},
]

# Create the conversation
with engine.create_conversation(messages=messages) as conversation:
    # ... Interact with the conversation ...
    pass

Sending messages

You can send messages synchronously or asynchronously (streaming).

Synchronous example:

# Simple string input
response = conversation.send_message("What is the capital of France?")
print(response["content"][0]["text"])

# Or with full message structure
# response = conversation.send_message({"role": "user", "content": "..."})

Asynchronous (streaming) example:

# send_message_async returns an iterator of response chunks
stream = conversation.send_message_async("Tell me a long story.")
for chunk in stream:
    # Chunks are dictionaries containing pieces of the response
    for item in chunk.get("content", []):
        if item.get("type") == "text":
            print(item["text"], end="", flush=True)
print()
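
When the full reply is needed as a single string rather than printed piece by piece, the same chunk-walking logic can be wrapped in a small helper. This is a minimal sketch that assumes chunks have the dictionary shape shown in the streaming example; collect_text is a hypothetical helper name, and the chunks here are hand-written stand-ins rather than output from a real model.

```python
from typing import Iterable


def collect_text(chunks: Iterable[dict]) -> str:
    """Joins the text items of all chunks into one string."""
    parts = []
    for chunk in chunks:
        for item in chunk.get("content", []):
            if item.get("type") == "text":
                parts.append(item["text"])
    return "".join(parts)


# Hand-written chunks standing in for a real stream.
fake_stream = [
    {"content": [{"type": "text", "text": "Once upon "}]},
    {"content": [{"type": "text", "text": "a time."}]},
]
print(collect_text(fake_stream))
```

In an application, the iterator returned by conversation.send_message_async(...) would presumably be passed in place of fake_stream.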

🔴 New: Multi-Token Prediction (MTP)

Multi-token prediction (MTP) is a performance optimization that significantly increases decoding speed. MTP is universally recommended for all tasks on GPU backends.

To use MTP, enable speculative decoding when initializing the engine.

import litert_lm

# Enable MTP by setting enable_speculative_decoding=True
with litert_lm.Engine(
    "path/to/your/model.litertlm",
    backend=litert_lm.Backend.GPU,
    enable_speculative_decoding=True,
) as engine:
    with engine.create_conversation() as conversation:
        response = conversation.send_message("What is the capital of France?")
        print(response["content"][0]["text"])
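
One way to check the effect of MTP for a particular model and prompt is to time the same request with and without enable_speculative_decoding. The helper below is a generic timing sketch, not part of LiteRT-LM; timed is a hypothetical name, and the example call uses the built-in sum only so that the snippet runs on its own.

```python
import time


def timed(fn, *args, **kwargs):
    """Calls fn and returns (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in workload; replace with a call such as conversation.send_message.
result, seconds = timed(sum, [1, 2, 3])
print(f"result={result}, took {seconds:.6f}s")
```

For a fair comparison, time only the message call and keep engine initialization, which loads the model, outside the measured region.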

Multimodality

# Initialize with vision and/or audio backends if needed
with litert_lm.Engine(
    "path/to/multimodal_model.litertlm",
    audio_backend=litert_lm.Backend.CPU,
    vision_backend=litert_lm.Backend.GPU,
) as engine:
    with engine.create_conversation() as conversation:
        user_message = {
            "role": "user",
            "content": [
                {"type": "audio", "path": "/path/to/audio.wav"},
                {"type": "text", "text": "Describe this audio."},
            ],
        }
        response = conversation.send_message(user_message)
        print(response["content"][0]["text"])
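
If several audio prompts are sent, the message dictionary above can be produced by a small builder. This is a sketch that simply reproduces the structure shown in the example; audio_message is a hypothetical helper name, and no validation of the file path is attempted.

```python
def audio_message(audio_path: str, prompt: str) -> dict:
    """Builds a user message with one audio item followed by one text item."""
    return {
        "role": "user",
        "content": [
            {"type": "audio", "path": audio_path},
            {"type": "text", "text": prompt},
        ],
    }


message = audio_message("/path/to/audio.wav", "Describe this audio.")
print(message["content"][1]["text"])
```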

Defining and using tools

You can define Python functions as tools that the model can call automatically.

def add_numbers(a: float, b: float) -> float:
    """Adds two numbers.

    Args:
        a: The first number.
        b: The second number.
    """
    return a + b

# Register the tool in the conversation
tools = [add_numbers]
with engine.create_conversation(tools=tools) as conversation:
    # The model will call add_numbers automatically if it needs to sum values
    response = conversation.send_message("What is 123 + 456?")
    print(response["content"][0]["text"])

LiteRT-LM uses the function's docstring and type hints to generate the tool schema for the model.
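
To give a feel for what can be derived from a docstring and type hints, the sketch below inspects a function with the standard library. It is an illustration only: describe_tool is a hypothetical helper, and the dictionary it returns is an invented format, not the schema LiteRT-LM actually generates.

```python
import inspect
from typing import get_type_hints


def describe_tool(func) -> dict:
    """Builds a simple, hypothetical description of a function."""
    hints = get_type_hints(func)
    doc = inspect.getdoc(func) or ""
    summary = doc.splitlines()[0] if doc else ""
    params = {
        # Use the type's name when a hint exists, e.g. "float".
        name: getattr(hints.get(name), "__name__", str(hints.get(name)))
        for name in inspect.signature(func).parameters
    }
    return {"name": func.__name__, "description": summary, "parameters": params}


def add_numbers(a: float, b: float) -> float:
    """Adds two numbers.

    Args:
        a: The first number.
        b: The second number.
    """
    return a + b


print(describe_tool(add_numbers))
```

The practical takeaway is that precise type hints and a clear first docstring line give the model more to work with when deciding whether and how to call a tool.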