A production-ready, open-source inference framework that delivers high-performance, cross-platform LLM deployment on edge devices.
LiteRT-LM powers on-device GenAI in Chrome, Chromebook Plus, and the Pixel Watch.

Why LiteRT-LM?

Deploy LLMs across Android, iOS, Web, and Desktop.
Maximize performance with GPU and NPU acceleration.
Support for popular LLMs as well as multi-modality (Vision, Audio) and Tool Use.

Start building

Python APIs with hardware acceleration on Linux, macOS, Windows, and Raspberry Pi.
Native Android apps and JVM-based desktop tools.
Native iOS and macOS integration with specialized Metal support (Swift APIs coming soon).
Cross-platform C++ APIs.
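As a rough illustration of what the C++ path could look like, here is a minimal pseudocode sketch. All names here (ModelAssets, Engine, Session, GenerateContent, Backend) are assumptions chosen for readability, not a verbatim copy of the library's headers; consult the repository for the actual API.

    // Illustrative pseudocode only — identifiers are assumed, not confirmed.
    // 1. Load a model bundle from disk.
    auto assets  = ModelAssets::Create("model_bundle.litertlm");
    // 2. Build an engine, selecting a hardware backend (e.g. GPU).
    auto engine  = Engine::Create(EngineSettings(assets, Backend::GPU));
    // 3. Open a session and run a generation request.
    auto session = engine->CreateSession();
    auto reply   = session->GenerateContent("Summarize this note in one line.");

The engine/session split shown here reflects a common on-device inference pattern: the engine owns model weights and accelerator state once, while lightweight sessions carry per-conversation context.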

Blogs and Announcements

Deploy language models at scale on wearables and in the browser using LiteRT-LM.
Explore how to fine-tune FunctionGemma and enable function calling capabilities powered by LiteRT-LM Tool Use APIs.
Latest insights on RAG, multimodality, and function calling for edge language models.

Join the Community

Contribute to the open-source project, report issues, and see examples.
Download pre-converted models and join the discussion.