LiteRT for Web with LiteRT.js

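LiteRT.js is Google's high-performance WebAI runtime for production web applications. It continues the LiteRT stack, supporting multiple ML frameworks and unifying the core runtime across all platforms.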

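LiteRT.js supports the following core features:

In-browser support for LiteRT models: Run models with best-in-class performance on CPU, accelerated by XNNPack on WebAssembly (Wasm), and on GPU through the WebGPU API.
Multi-framework compatibility: Use your preferred ML framework, such as PyTorch, JAX, or TensorFlow.
Build on existing pipelines: Integrate with existing TensorFlow.js pipelines by passing TensorFlow.js tensors as inputs and receiving them as outputs.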

Installation

Install the @litertjs/core package from npm:

npm install @litertjs/core

The Wasm files are located in node_modules/@litertjs/core/wasm/. For convenience, copy and serve the entire wasm/ folder. Then import the package and load the Wasm files:

import {loadLiteRt} from '@litertjs/core';

// Host LiteRT's Wasm files on your server.
await loadLiteRt(`your/path/to/wasm/`);

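Model conversion

LiteRT.js uses the same .tflite format as Android and iOS, and it supports existing models from Kaggle and Hugging Face. If you have a new PyTorch model, convert it first.

Convert a PyTorch model to LiteRT

To convert a PyTorch model to LiteRT, use the ai-edge-torch converter.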

import ai_edge_torch
import torch
import torchvision

# Load your torch model. We're using resnet for this example.
resnet18 = torchvision.models.resnet18(weights=torchvision.models.ResNet18_Weights.IMAGENET1K_V1)

sample_inputs = (torch.randn(1, 3, 224, 224),)

# Convert the model to LiteRT.
edge_model = ai_edge_torch.convert(resnet18.eval(), sample_inputs)

# Export the model.
edge_model.export('resnet.tflite')

Run the converted model

After converting the model to a .tflite file, you can run it in the browser.

import {loadAndCompile, Tensor} from '@litertjs/core';

// Load the model hosted from your server. This makes an http(s) request.
const model = await loadAndCompile('/path/to/model.tflite', {
    accelerator: 'webgpu', // or 'wasm' for XNNPack CPU inference
});
// The model can also be loaded from a Uint8Array if you want to fetch it yourself.

// Create image input data
const image = new Float32Array(224 * 224 * 3).fill(0);
const inputTensor =
    await new Tensor(image, /* shape */ [1, 3, 224, 224]).moveTo('webgpu');

// Run the model
const outputs = model(inputTensor);
// You can also use model([inputTensor])
// or model({'input_tensor_name': inputTensor})

// Clean up and get outputs
inputTensor.delete();
const outputTensorCpu = await outputs[0].moveTo('wasm');
const outputData = outputTensorCpu.toTypedArray();
outputTensorCpu.delete();
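
The example above feeds the model a zero-filled array and hard-codes the accelerator. As a minimal sketch of the steps an application typically adds around it, the helpers below pick an accelerator based on WebGPU availability, convert canvas pixel data into the planar [1, 3, height, width] layout used above, and read the top class from the output. The names pickAccelerator, rgbaToNchw, and argMax are hypothetical and not part of @litertjs/core, and scaling pixels to [0, 1] is an assumption; use the normalization your model was trained with.

```javascript
// Sketch only: these helpers are illustrative and not part of @litertjs/core.

// Returns 'webgpu' when the browser exposes a usable WebGPU adapter,
// otherwise falls back to 'wasm' (XNNPack CPU inference).
async function pickAccelerator(nav = globalThis.navigator) {
  if (!nav || !nav.gpu) return 'wasm';
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? 'webgpu' : 'wasm';
  } catch {
    return 'wasm';
  }
}

// Converts interleaved RGBA bytes (the layout of
// CanvasRenderingContext2D.getImageData().data) into planar NCHW floats
// for a [1, 3, height, width] input. Scaling to [0, 1] is an assumption.
function rgbaToNchw(rgba, width, height) {
  const plane = width * height;
  const out = new Float32Array(3 * plane);
  for (let i = 0; i < plane; i++) {
    out[i] = rgba[4 * i] / 255;                 // R plane
    out[plane + i] = rgba[4 * i + 1] / 255;     // G plane
    out[2 * plane + i] = rgba[4 * i + 2] / 255; // B plane (alpha is dropped)
  }
  return out;
}

// Returns the index of the largest value, e.g. the top class of a classifier.
function argMax(values) {
  let best = 0;
  for (let i = 1; i < values.length; i++) {
    if (values[i] > values[best]) best = i;
  }
  return best;
}
```

With these in place, the result of pickAccelerator() could be passed as the accelerator option and as the moveTo target of the input tensor, the result of rgbaToNchw() could replace the zero-filled image array, and argMax(outputData) would give the predicted class index.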

Integrate into existing TensorFlow.js pipelines

Consider integrating LiteRT.js into your TensorFlow.js pipelines for the following reasons:

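Best-in-class WebGPU performance: Converted models running on LiteRT.js WebGPU are optimized for browser performance and are especially fast on Chromium-based browsers.
Simpler model conversion path: The LiteRT.js conversion path goes directly from PyTorch to LiteRT. The PyTorch to TensorFlow.js conversion path is considerably more complicated, requiring you to go from PyTorch -> ONNX -> TensorFlow -> TensorFlow.js.
Debugging tools: The LiteRT.js conversion path comes with debugging tools.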

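LiteRT.js is designed to work within TensorFlow.js pipelines and is compatible with TensorFlow.js pre- and post-processing, so the only thing you need to migrate is the model itself.

Follow these steps to integrate LiteRT.js into a TensorFlow.js pipeline: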

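Convert your original TensorFlow, JAX, or PyTorch model to .tflite. For details, see the "Model conversion" section.
Install the @litertjs/core and @litertjs/tfjs-interop npm packages.
Import and use the TensorFlow.js WebGPU backend. This is required for LiteRT.js to interoperate with TensorFlow.js.
Replace the code that loads the TensorFlow.js model with code that loads the LiteRT.js model.
Replace the TensorFlow.js model.predict(inputs) or model.execute(inputs) call with runWithTfjsTensors(liteRtModel, inputs). runWithTfjsTensors takes the same input tensors that TensorFlow.js models use and outputs TensorFlow.js tensors.
Test that the model pipeline outputs the results you expect.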
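
One way to keep the rest of a pipeline untouched while swapping the model call is a thin adapter that exposes predict and execute and forwards to runWithTfjsTensors. The sketch below is illustrative only: createTfjsCompatibleModel is a hypothetical helper, not part of @litertjs/core or @litertjs/tfjs-interop, and it takes runWithTfjsTensors as an argument so the snippet stays self-contained.

```javascript
// Sketch only: `createTfjsCompatibleModel` is a hypothetical helper.
//
// In an application, the two arguments would come from:
//   import {loadAndCompile} from '@litertjs/core';
//   import {runWithTfjsTensors} from '@litertjs/tfjs-interop';
// and the TensorFlow.js WebGPU backend would already be active.
function createTfjsCompatibleModel(liteRtModel, runWithTfjsTensors) {
  // Forward the TensorFlow.js input tensors unchanged; the return value of
  // runWithTfjsTensors is passed straight back to the caller.
  const run = (inputs) => runWithTfjsTensors(liteRtModel, inputs);
  return {
    predict: run,
    // Unlike the TensorFlow.js execute(), this sketch ignores any
    // requested output names.
    execute: run,
  };
}
```

Existing call sites such as model.predict(inputs) can then stay as they are while the object behind model changes, which makes it easier to compare outputs before and after the migration.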

Using LiteRT.js with runWithTfjsTensors may also require the following changes to the model inputs:

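Reorder inputs: Depending on how the converter ordered the inputs and outputs of the model, you may need to change the order in which you pass them.
Transpose inputs: The converter may also have changed the layout of the model's inputs and outputs compared to what TensorFlow.js uses. You may need to transpose your inputs to match the model, and transpose the outputs to match the rest of the pipeline.
Rename inputs: If you use named inputs, the names may also have changed.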

You can learn more about the model's inputs and outputs with model.getInputDetails() and model.getOutputDetails().
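
The three input adjustments described above can be sketched as small, library-independent helpers. All three names are hypothetical and not part of LiteRT.js or TensorFlow.js; the layout strings such as 'NHWC' are just axis labels chosen for illustration, and the actual order, layout, and names should be checked against model.getInputDetails().

```javascript
// Sketch only: hypothetical helpers, not part of LiteRT.js or TensorFlow.js.

// Returns the axis permutation that turns the `from` layout into the `to`
// layout, in the form expected by tf.transpose(tensor, perm).
function layoutPermutation(from, to) {
  if (from.length !== to.length) {
    throw new Error(`Layouts ${from} and ${to} have different ranks`);
  }
  return [...to].map((axis) => {
    const index = from.indexOf(axis);
    if (index === -1) throw new Error(`Axis ${axis} not found in ${from}`);
    return index;
  });
}

// Rebuilds a named-input record using the names the converted model expects.
// `nameMap` maps old (TensorFlow.js) names to new (.tflite) names; names
// missing from the map are kept unchanged.
function renameInputs(inputs, nameMap) {
  const renamed = {};
  for (const [oldName, tensor] of Object.entries(inputs)) {
    renamed[nameMap[oldName] ?? oldName] = tensor;
  }
  return renamed;
}

// Reorders positional inputs; order[i] is the index in `inputs` of the
// value the converted model expects at position i.
function reorderInputs(inputs, order) {
  return order.map((index) => inputs[index]);
}
```

For example, layoutPermutation('NHWC', 'NCHW') yields the permutation for moving a channels-last image tensor to channels-first before the model call, and layoutPermutation('NCHW', 'NHWC') yields the inverse for the outputs.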