LiteRT-LM Web API

The LiteRT-LM Web API lets you run LiteRT-LM from JavaScript and TypeScript in the browser. This is a preview release that supports text-in / text-out execution on WebGPU.
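Because this preview runs on WebGPU, it can be useful to check that WebGPU is available before downloading a model. The following is a minimal sketch that assumes only the standard browser `navigator.gpu.requestAdapter()` API; `hasWebGPU` is a hypothetical helper name and is not part of LiteRT-LM.

```javascript
// Hypothetical helper (not part of LiteRT-LM): resolves to true when a WebGPU
// adapter can be obtained from the given navigator-like object.
async function hasWebGPU(nav) {
  // Browsers without WebGPU do not expose `navigator.gpu` at all.
  if (!nav || !nav.gpu) return false;
  try {
    // requestAdapter() resolves to null when no suitable adapter exists.
    const adapter = await nav.gpu.requestAdapter();
    return adapter != null;
  } catch {
    // Treat any failure while probing as "not available".
    return false;
  }
}
```

In a page you might call `await hasWebGPU(navigator)` and show a fallback message instead of creating the engine when it resolves to `false`.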

Supported models

The LiteRT-LM JS API currently supports a limited set of web-compatible models. We are working to extend support to generic .litertlm model files, but for now only the following models are supported:

gemma-4-E2B-it-web.litertlm from litert-community/gemma-4-E2B-it-litert-lm
gemma-4-E4B-it-web.litertlm from litert-community/gemma-4-E4B-it-litert-lm
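If you want to switch between the supported models at runtime, one option is a small lookup that turns a model file name into a download URL. This is only a sketch: `SUPPORTED_MODELS` and `modelUrl` are hypothetical names, and the URL pattern is inferred from the Hugging Face links used in the sample app on this page.

```javascript
// Hypothetical lookup (not part of LiteRT-LM): model file name -> Hugging Face repo.
const SUPPORTED_MODELS = new Map([
  ['gemma-4-E2B-it-web.litertlm', 'litert-community/gemma-4-E2B-it-litert-lm'],
  ['gemma-4-E4B-it-web.litertlm', 'litert-community/gemma-4-E4B-it-litert-lm'],
]);

// Builds the download URL for a supported model file, or throws for anything else.
function modelUrl(fileName) {
  const repo = SUPPORTED_MODELS.get(fileName);
  if (repo === undefined) {
    throw new Error(`Unsupported model file: ${fileName}`);
  }
  // Assumed pattern: <repo>/resolve/main/<file>, as in the links on this page.
  return `https://huggingface.co/${repo}/resolve/main/${fileName}`;
}
```

The returned string can be passed as the `model` setting when creating the engine.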

Introduction

Here is a sample REPL chat app built with the JavaScript API:

<div id="out" style="white-space: pre-wrap; font-family: monospace;"></div>
<input id="in" onkeydown="if(event.key === 'Enter') repl(this)">

<script type="module">
  import { Engine } from 'https://cdn.jsdelivr.net/npm/@litert-lm/core/+esm';
  const engine = await Engine.create({
    // Load the Gemma 4 E2B model
    model: 'https://huggingface.co/litert-community/gemma-4-E2B-it-litert-lm/resolve/main/gemma-4-E2B-it-web.litertlm'
    // Or use the E4B model by swapping in this line
    // model: 'https://huggingface.co/litert-community/gemma-4-E4B-it-litert-lm/resolve/main/gemma-4-E4B-it-web.litertlm'
  });
  const chat = await engine.createConversation();

  window.repl = async (el) => {
    const text = el.value;
    el.value = ''; // Clear immediately
    out.append(`\n>>> ${text}\nAI: `);

    for await (const chunk of chat.sendMessageStreaming(text)) {
      out.append(chunk.content[0].text);
    }
  };
</script>

Getting started

LiteRT-LM is available as an npm package. You can install the latest version from npm or import it directly from a CDN:

# From npm
npm i --save @litert-lm/core

# From a CDN (in your JavaScript file)
import * as litertlm from 'https://cdn.jsdelivr.net/npm/@litert-lm/core/+esm';

Initialize the engine

Engine is the entry point to the API. It handles model loading, session creation, and resource management. Remember to call delete() on the engine to release its resources once the model is no longer needed.

Note: Engine initialization can take several seconds while the model loads.

import {Engine, EngineSettings} from '@litert-lm/core';

const engineSettings = {
  model: 'url/path/to/model.litertlm', // or a ReadableStream, or a Blob

  // You can configure context length and other settings here
  mainExecutorSettings: {
    maxNumTokens: 8192,
  },
} satisfies EngineSettings;

const engine = await Engine.create(engineSettings);

// ... Use the engine to create a conversation ...

// Delete the engine when done.
await engine.delete();
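To make sure `delete()` still runs when the code using the engine throws, the create / use / delete sequence can be wrapped in `try`/`finally`. The helper below is a generic sketch; `withResource` is a hypothetical name, and it assumes only that the created object exposes a `delete()` method, as the engine above does.

```javascript
// Hypothetical helper (not part of LiteRT-LM): creates a resource, passes it to
// `use`, and always calls resource.delete() afterwards, even if `use` throws.
async function withResource(create, use) {
  const resource = await create();
  try {
    return await use(resource);
  } finally {
    // Runs on both success and failure, so resources are always released.
    await resource.delete();
  }
}
```

With the engine, this could be called as `await withResource(() => Engine.create(engineSettings), async (engine) => { /* use the engine */ })`.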

Create a conversation

Once the engine is initialized, create a Conversation instance. You can pass a ConversationConfig to customize its behavior.

const conversation = await engine.createConversation({
  preface: {
    messages: [
      {role: 'system', content: 'You are a helpful assistant'}
    ]
  }
});

await conversation.sendMessage({
  role: 'user',
  content: 'Write a poem',
});

Sending messages

You can send messages with or without streaming.

Non-streaming example

// Simple string input
let response = await conversation.sendMessage("What is the capital of France?");
console.log(response.content[0].text);

// Or with full message structure
response = await conversation.sendMessage({role: 'user', content: '...'});

Streaming example

// sendMessageStreaming returns a ReadableStream of response chunks
const stream = conversation.sendMessageStreaming('Tell me a long story.');

for await (const chunk of stream) {
  // Chunks are Records containing pieces of the response
  for (const item of chunk.content) {
    if (item.type === 'text') {
      console.log(item.text);
    }
  }
}
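When you need both incremental updates and the complete reply, the loop above can be folded into a small helper. This sketch assumes only the chunk shape shown in the streaming example (a `content` array whose text items carry `type: 'text'` and a `text` field); `collectText` is a hypothetical name, not a library function.

```javascript
// Hypothetical helper (not part of LiteRT-LM): consumes a stream of response
// chunks, reports each text piece to `onText`, and resolves to the full text.
async function collectText(stream, onText = () => {}) {
  let full = '';
  for await (const chunk of stream) {
    // Tolerate chunks without a content array.
    for (const item of chunk.content ?? []) {
      if (item.type === 'text') {
        full += item.text;
        onText(item.text);
      }
    }
  }
  return full;
}
```

For example, `const reply = await collectText(conversation.sendMessageStreaming('Tell me a long story.'), (piece) => out.append(piece));` would update a page element named `out` as text arrives and keep the whole reply in `reply`.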

Cancel generation

You can cancel an in-progress generation by explicitly calling cancel() on the Conversation instance:

// Cancel any ongoing generation
conversation.cancel();
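One common use of `cancel()` is a time limit on generation. The sketch below schedules the call with a timer; `cancelAfter` is a hypothetical helper, and it assumes only that the object passed in has a `cancel()` method. How an in-flight `sendMessage` or stream behaves after cancellation is not covered here.

```javascript
// Hypothetical helper (not part of LiteRT-LM): calls conversation.cancel()
// after `ms` milliseconds. Returns a function that disarms the timer.
function cancelAfter(conversation, ms) {
  const timer = setTimeout(() => conversation.cancel(), ms);
  return () => clearTimeout(timer);
}
```

Call the returned function once the response has completed normally so that a late timer does not cancel a later generation.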

If you are streaming the response, exiting the for await...of loop early (for example, with break) also cancels the in-progress generation automatically:

for await (const chunk of stream) {
  if (shouldStop()) {
    break; // Cancels the stream and underlying generation
  }
}
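As a concrete stop condition, the sketch below reads from a stream until a character budget is reached and then breaks out of the loop. In JavaScript, leaving a `for await...of` loop early calls the iterator's `return()` method, which gives the source a chance to clean up. `takeText` is a hypothetical helper, and the chunk shape is assumed to match the streaming example.

```javascript
// Hypothetical helper (not part of LiteRT-LM): reads text from a stream of
// response chunks and stops early once `maxChars` characters are collected.
async function takeText(stream, maxChars) {
  let text = '';
  for await (const chunk of stream) {
    for (const item of chunk.content ?? []) {
      if (item.type === 'text') text += item.text;
    }
    if (text.length >= maxChars) {
      break; // Early exit: the loop closes the underlying iterator.
    }
  }
  // The last chunk may overshoot the budget, so trim to the limit.
  return text.slice(0, maxChars);
}
```

For example, `await takeText(conversation.sendMessageStreaming('Tell me a long story.'), 500)` would stop reading after roughly the first 500 characters.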