LiteRT-LM Web API

The LiteRT-LM Web API is the JavaScript and TypeScript API for LiteRT-LM in the browser. It is an early preview that supports text input and output, running on WebGPU.
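Because the preview runs on WebGPU, it can help to check for WebGPU before downloading a model. The sketch below uses the standard `navigator.gpu.requestAdapter()` browser call; `hasWebGpu` is a hypothetical helper name and is not part of LiteRT-LM.

```javascript
// Returns true when a WebGPU adapter can be obtained from the given
// navigator-like object. `nav` is a parameter so the check can be exercised
// outside a browser; in a page you would pass the global `navigator`.
async function hasWebGpu(nav) {
  if (!nav || !nav.gpu || typeof nav.gpu.requestAdapter !== 'function') {
    return false; // WebGPU is not exposed in this environment
  }
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter !== null && adapter !== undefined;
  } catch {
    return false; // the adapter request itself failed
  }
}
```

In a page, `await hasWebGpu(navigator)` could gate the call that creates the engine and show a fallback message otherwise.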

Supported models

The LiteRT-LM JS API currently supports a limited set of web-compatible models. We are working to extend support to general .litertlm model files, but for now only the following models are supported:

gemma-4-E2B-it-web.litertlm from litert-community/gemma-4-E2B-it-litert-lm
gemma-4-E4B-it-web.litertlm from litert-community/gemma-4-E4B-it-litert-lm
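The two entries above can be turned into download URLs with the Hugging Face `resolve/main` pattern that the sample on this page uses. This is a small sketch: the short keys (`gemma-4-E2B`, `gemma-4-E4B`) and the names `WEB_MODELS` and `modelUrl` are made up for illustration.

```javascript
// Repository and file names are copied from the supported-models list.
// The map keys are hypothetical short labels, not official identifiers.
const WEB_MODELS = new Map([
  ['gemma-4-E2B', {
    repo: 'litert-community/gemma-4-E2B-it-litert-lm',
    file: 'gemma-4-E2B-it-web.litertlm',
  }],
  ['gemma-4-E4B', {
    repo: 'litert-community/gemma-4-E4B-it-litert-lm',
    file: 'gemma-4-E4B-it-web.litertlm',
  }],
]);

// Builds the URL for a supported model, or throws for an unknown key.
function modelUrl(key) {
  const entry = WEB_MODELS.get(key);
  if (!entry) {
    throw new Error(`Unsupported web model: ${key}`);
  }
  return `https://huggingface.co/${entry.repo}/resolve/main/${entry.file}`;
}
```

The returned string can be passed as the `model` setting when creating the engine.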

Introduction

Here is a sample REPL chat app built with the JavaScript API:

<div id="out" style="white-space: pre-wrap; font-family: monospace;"></div>
<input id="in" onkeydown="if(event.key === 'Enter') repl(this)">

<script type="module">
  import { Engine } from 'https://cdn.jsdelivr.net/npm/@litert-lm/core/+esm';
  const engine = await Engine.create({
    // Load the Gemma 4 E2B model
    model: 'https://huggingface.co/litert-community/gemma-4-E2B-it-litert-lm/resolve/main/gemma-4-E2B-it-web.litertlm'
    // Or use the E4B model by swapping in this line
    // model: 'https://huggingface.co/litert-community/gemma-4-E4B-it-litert-lm/resolve/main/gemma-4-E4B-it-web.litertlm'
  });
  const chat = await engine.createConversation();

  window.repl = async (el) => {
    const text = el.value;
    el.value = ''; // Clear immediately
    out.append(`\n>>> ${text}\nAI: `);

    for await (const chunk of chat.sendMessageStreaming(text)) {
      out.append(chunk.content[0].text);
    }
  };
</script>
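The sample above starts a new `sendMessageStreaming` call on every Enter press, even if the previous reply is still streaming. This page does not say whether overlapping generations on one conversation are allowed, so a cautious option is to run sends one at a time. A minimal sketch follows; `createSerialQueue` is a hypothetical helper, not part of LiteRT-LM.

```javascript
// Returns an `enqueue` function that runs async tasks strictly one after
// another, in the order they were enqueued.
function createSerialQueue() {
  let tail = Promise.resolve();
  return function enqueue(task) {
    // Start this task only after everything queued before it has settled.
    const run = tail.then(() => task());
    // Swallow the error on the internal chain so one failed task does not
    // block later ones; the caller still sees it through `run`.
    tail = run.catch(() => {});
    return run;
  };
}
```

In the REPL, the handler could read and clear the input right away and then wrap the append-and-stream part in `enqueue(async () => { ... })`.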

Getting started

LiteRT-LM is available as an npm package. You can install the latest version from npm or import it directly from a CDN:

# From npm
npm i --save @litert-lm/core

# From a CDN (in your JavaScript file)
import * as litertlm from 'https://cdn.jsdelivr.net/npm/@litert-lm/core/+esm';

Initialize the engine

Engine is the entry point to the API. It handles model loading, session creation, and resource management. Remember to delete the engine once the model is no longer needed so that its resources are released.

Note: Initializing the engine can take several seconds while the model loads.

import {Engine, EngineSettings} from '@litert-lm/core';

const engineSettings = {
  model: 'url/path/to/model.litertlm', // or a ReadableStream, or a Blob

  // You can configure context length and other settings here
  mainExecutorSettings: {
    maxNumTokens: 8192,
  },
} satisfies EngineSettings;

const engine = await Engine.create(engineSettings);

// ... Use the engine to create a conversation ...

// Delete the engine when done.
await engine.delete();
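Because the engine has to be deleted explicitly, one way to avoid leaking it when an error is thrown midway is a try/finally wrapper. This is a sketch under the assumption, taken from the example above, that the engine exposes an awaitable `delete()`; `withEngine` is a hypothetical helper name.

```javascript
// Creates an engine, hands it to `use`, and always deletes it afterwards.
// `createEngine` is any function that returns (a promise of) an object with
// a delete() method, for example () => Engine.create(engineSettings).
async function withEngine(createEngine, use) {
  const engine = await createEngine();
  try {
    return await use(engine);
  } finally {
    // Runs whether `use` returned normally or threw.
    await engine.delete();
  }
}
```

A call might look like `await withEngine(() => Engine.create(engineSettings), async (engine) => { /* create a conversation and use it */ })`.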

Create a conversation

Once the engine is initialized, create a Conversation instance. You can pass a ConversationConfig to customize its behavior.

const conversation = await engine.createConversation({
  preface: {
    messages: [
      {role: 'system', content: 'You are a helpful assistant'}
    ]
  }
});

await conversation.sendMessage({
  role: 'user',
  content: 'Write a poem',
});
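When the system prompt comes from user settings, it can be convenient to build the config object in one place. The sketch below only produces the `{preface: {messages: [...]}}` shape shown above. Passing earlier turns through `priorMessages` is an assumption, since this page only shows a `system` message in the preface; `buildConversationConfig` is a hypothetical helper name.

```javascript
// Builds a config object in the shape used by createConversation above.
// ASSUMPTION: the preface may also carry earlier turns; this page only
// demonstrates a single 'system' message.
function buildConversationConfig(systemPrompt, priorMessages = []) {
  const messages = [];
  if (typeof systemPrompt === 'string' && systemPrompt.trim() !== '') {
    messages.push({role: 'system', content: systemPrompt.trim()});
  }
  for (const m of priorMessages) {
    // Copy each message so later edits by the caller do not leak in.
    messages.push({role: m.role, content: m.content});
  }
  return {preface: {messages}};
}
```

The result can be passed directly: `engine.createConversation(buildConversationConfig('You are a helpful assistant'))`.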

Send messages

You can send messages with or without streaming.

Non-streaming example

// Simple string input
let response = await conversation.sendMessage("What is the capital of France?");
console.log(response.content[0].text);

// Or with full message structure
response = await conversation.sendMessage({role: 'user', content: '...'});
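The example reads only `response.content[0].text`. If a response can hold more than one content item, which is an assumption here, a small helper that joins every text part is a safer way to get the full reply. `messageText` is a hypothetical name; it also accepts the plain-string `content` form used for requests.

```javascript
// Joins the text parts of a message into one string.
// Accepts a plain string `content` (as in the request examples) or an array
// of items that carry a string `text` field (as in the responses).
function messageText(message) {
  const content = message && message.content;
  if (typeof content === 'string') {
    return content;
  }
  if (!Array.isArray(content)) {
    return ''; // nothing readable in this message
  }
  return content
    .filter((item) => item && typeof item.text === 'string')
    .map((item) => item.text)
    .join('');
}
```

With it, the first example becomes `console.log(messageText(response))`.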

Streaming example

// sendMessageStreaming returns a ReadableStream of response chunks
const stream = conversation.sendMessageStreaming('Tell me a long story.');

for await (const chunk of stream) {
  // Chunks are Records containing pieces of the response
  for (const item of chunk.content) {
    if (item.type === 'text') {
      console.log(item.text);
    }
  }
}
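The loop above logs each piece on its own line. To show partial text while also keeping the complete reply, the chunks can be accumulated as they arrive. This sketch assumes the chunk shape from the example (`content` items with `type` and `text`); `collectStreamText` and its `onText` callback are hypothetical.

```javascript
// Consumes a stream of response chunks, calling `onText` for every text
// piece and resolving with the full concatenated text at the end.
async function collectStreamText(stream, onText = () => {}) {
  let full = '';
  for await (const chunk of stream) {
    for (const item of chunk.content ?? []) {
      if (item.type === 'text') {
        full += item.text;
        onText(item.text, full); // piece just received, text so far
      }
    }
  }
  return full;
}
```

For example, `const story = await collectStreamText(stream, (piece) => out.append(piece));` would update the page incrementally and still return the whole story.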

Cancel generation

You can explicitly cancel an in-progress generation by calling cancel() on the Conversation instance:

// Cancel any ongoing generation
conversation.cancel();
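A common use of `cancel()` is a time limit on generation. The sketch below arms a timer that calls `cancel()` and returns a function to disarm it. It assumes that calling `cancel()` is harmless when nothing is generating, which this page does not confirm; `cancelAfter` is a hypothetical helper.

```javascript
// Calls conversation.cancel() after `ms` milliseconds unless the returned
// disarm function is called first.
function cancelAfter(conversation, ms) {
  const timer = setTimeout(() => conversation.cancel(), ms);
  return () => clearTimeout(timer);
}
```

Typical use is `const disarm = cancelAfter(conversation, 30000);` before sending, followed by `disarm()` once the reply has completed.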

If you are streaming a response, leaving the for await...of loop early (for example with break) also cancels the ongoing generation automatically:

for await (const chunk of stream) {
  if (shouldStop()) {
    break; // Cancels the stream and underlying generation
  }
}
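As one concrete stop condition, the loop can end once enough text has arrived. The sketch below relies only on the early-exit behavior described above and on the chunk shape from the streaming example; `streamUpTo` and `maxChars` are hypothetical names.

```javascript
// Reads text from a stream of response chunks and stops once at least
// `maxChars` characters have arrived. Breaking out of for await...of asks
// the underlying iterator to finish, which is what cancels generation.
async function streamUpTo(stream, maxChars) {
  let text = '';
  for await (const chunk of stream) {
    for (const item of chunk.content ?? []) {
      if (item.type === 'text') {
        text += item.text;
      }
    }
    if (text.length >= maxChars) {
      break; // early exit: the stream is closed for us
    }
  }
  return text.slice(0, maxChars); // trim any overshoot from the last chunk
}
```

Note that the budget is counted in characters, not tokens, so it is only a rough limit on output size.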