LiteRT.js is Google's high-performance Web AI runtime for production web applications. It is a continuation of the LiteRT stack, providing multi-framework support and unifying the core runtime across all platforms.
LiteRT.js supports the following core features:
- In-browser hardware-accelerated inference: Run models with exceptional CPU performance via XNNPack compiled to lightweight WebAssembly (Wasm). For GPUs and dedicated hardware such as NPUs, LiteRT.js natively surfaces both the WebGPU API and the emerging WebNN API, enabling fine-grained, platform-specific optimization.
- Multi-framework compatibility: Convert and run models from your preferred ML framework, whether PyTorch, JAX, or TensorFlow.
- Iterate on existing pipelines: Integrates out of the box with existing TensorFlow.js pipelines by accepting and returning TensorFlow.js tensors at the model boundary.
Installation
Install the @litertjs/core package from npm:
npm install @litertjs/core
The Wasm files are located in node_modules/@litertjs/core/wasm/. For
convenience, copy and serve the entire wasm/ folder. Then, import the package
and load the Wasm files:
import {loadLiteRt} from '@litertjs/core';
// Load the LiteRT.js Wasm files from a CDN.
await loadLiteRt('https://cdn.jsdelivr.net/npm/@litertjs/core/wasm/');
// Alternatively, host them from your server.
// They are located in node_modules/@litertjs/core/wasm/
await loadLiteRt(`your/path/to/wasm/`);
Model conversion
LiteRT.js uses the same .tflite format as the rest of the LiteRT ecosystem, and it supports existing models on Kaggle and Hugging Face. If you have a new PyTorch model, you'll need to convert it.
Convert a PyTorch Model to LiteRT
To convert a PyTorch model to LiteRT, use the litert-torch converter.
import litert_torch
import torch
import torchvision
# Load your torch model. We're using resnet for this example.
resnet18 = torchvision.models.resnet18(weights=torchvision.models.ResNet18_Weights.IMAGENET1K_V1)
sample_inputs = (torch.randn(1, 3, 224, 224),)
# Convert the model to LiteRT.
edge_model = litert_torch.convert(resnet18.eval(), sample_inputs)
# Export the model.
edge_model.export('resnet.tflite')
Run the Converted Model
After converting the model to a .tflite file, you can run it in the browser.
import {loadAndCompile, Tensor} from '@litertjs/core';
// Load the model hosted from your server. This makes an http(s) request.
const model = await loadAndCompile('/path/to/model.tflite', {
  accelerator: 'webgpu',
  // Can select from 'webnn', 'webgpu', and 'wasm'.
  // You can also pass an array of accelerators, e.g. ['webnn', 'wasm'],
  // if you would like to fall back to CPU execution.
  // Note that ONLY CPU fallback is supported for now
  // (i.e. specifying ['webnn', 'webgpu'] will lead to compilation errors).
});
// The model can also be loaded from a Uint8Array if you want to fetch it yourself.
// Create image input data
const image = new Float32Array(224 * 224 * 3).fill(0);
const inputTensor = new Tensor(image, /* shape */ [1, 3, 224, 224]);
// Run the model
const outputs = await model.run(inputTensor);
// You can also use `await model.run([inputTensor]);`
// or `await model.run({'input_tensor_name': inputTensor});`
// Clean up and get outputs
inputTensor.delete();
const output = outputs[0];
const outputData = await output.data();
output.delete();
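As noted in the snippet above, you can also fetch the model yourself and pass the raw bytes to loadAndCompile, for example if you want to cache the file or bundle it with your app. A minimal sketch, assuming the model is served at a placeholder path:
import {loadAndCompile} from '@litertjs/core';
// Fetch the .tflite file yourself and hand the bytes to loadAndCompile.
const response = await fetch('/path/to/model.tflite');
const modelBytes = new Uint8Array(await response.arrayBuffer());
const model = await loadAndCompile(modelBytes, {accelerator: 'webgpu'});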
Integrate into existing TensorFlow.js pipelines
You should consider integrating LiteRT.js into your TensorFlow.js pipelines for the following reasons:
- Exceptional GPU & hardware performance: LiteRT.js models run with WebGPU acceleration for optimized performance across browsers. With WebGPU support today and WebNN on the way, LiteRT.js offers flexible hardware acceleration across a variety of edge devices.
- Easier Model Conversion Path: The LiteRT.js conversion path goes directly from PyTorch to LiteRT. The PyTorch to TensorFlow.js conversion path is significantly more complicated, requiring you to go from PyTorch -> ONNX -> TensorFlow -> TensorFlow.js.
- Debugging tools: The LiteRT.js conversion path comes with debugging tools.
LiteRT.js is designed to function within TensorFlow.js pipelines, and is compatible with TensorFlow.js pre- and post-processing, so the only thing you need to migrate is the model itself.
Integrate LiteRT.js into TensorFlow.js pipelines with the following steps:
- Convert your original TensorFlow, JAX, or PyTorch model to .tflite. For details, see the model conversion section.
- Install the @litertjs/core and @litertjs/tfjs-interop NPM packages.
- Import and use the TensorFlow.js WebGPU backend. This is required for LiteRT.js to interoperate with TensorFlow.js.
- Replace loading the TensorFlow.js model with loading the LiteRT.js model.
- Substitute the TensorFlow.js model.predict(inputs) or model.execute(inputs) with runWithTfjsTensors(liteRtModel, inputs). runWithTfjsTensors takes the same input tensors that TensorFlow.js models use and outputs TensorFlow.js tensors (see the sketch after this list).
- Test that the model pipeline outputs the results you expect.
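Putting those steps together, here is a minimal sketch of what the swap might look like in an existing pipeline. It assumes the TensorFlow.js WebGPU backend package, that runWithTfjsTensors is imported from @litertjs/tfjs-interop and returns an array of TensorFlow.js tensors, and placeholder model paths and shapes; adapt these to your pipeline.
import * as tf from '@tensorflow/tfjs';
import '@tensorflow/tfjs-backend-webgpu';
import {loadAndCompile, loadLiteRt} from '@litertjs/core';
import {runWithTfjsTensors} from '@litertjs/tfjs-interop';
// Use the TensorFlow.js WebGPU backend so tensors can be shared with LiteRT.js.
await tf.setBackend('webgpu');
await loadLiteRt('your/path/to/wasm/');
// Before: const model = await tf.loadGraphModel('/path/to/model.json');
const model = await loadAndCompile('/path/to/model.tflite', {accelerator: 'webgpu'});
// Pre-processing stays in TensorFlow.js (placeholder input shown here).
const input = tf.zeros([1, 3, 224, 224]);
// Before: const outputs = model.predict(input);
const outputs = runWithTfjsTensors(model, input);
// Post-processing stays in TensorFlow.js.
outputs[0].print();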
Using LiteRT.js with runWithTfjsTensors may also require the following changes
to the model inputs:
- Reorder inputs: Depending on how the converter ordered the inputs and outputs of the model, you may need to change their order as you pass them in.
- Transpose inputs: It's also possible that the converter changed the layout of the inputs and outputs of the model compared to what TensorFlow.js uses. You may need to transpose your inputs to match the model and outputs to match the rest of the pipeline.
- Rename inputs: If you're using named inputs, the names may have also changed.
You can get more information about the inputs and outputs of the model with
model.getInputDetails() and model.getOutputDetails().
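Continuing the sketch above, a hedged example of inspecting the converted model and transposing an NHWC TensorFlow.js tensor to the NCHW layout a model converted from PyTorch typically expects (the shapes here are illustrative):
// Inspect the converted model's expected input and output layouts and names.
console.log(model.getInputDetails());
console.log(model.getOutputDetails());
// TensorFlow.js image pipelines usually produce NHWC tensors, while a model
// converted from PyTorch typically expects NCHW, so transpose before running.
const nhwcInput = tf.zeros([1, 224, 224, 3]);
const nchwInput = tf.transpose(nhwcInput, [0, 3, 1, 2]);
const result = runWithTfjsTensors(model, nchwInput);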