Efficient conversion, runtime, and optimization for on-device machine learning.
LiteRT isn't just new; it's the next generation of the world's most widely deployed machine learning runtime. It powers the apps you use every day, delivering low latency and strong privacy on billions of devices.

Trusted by the most critical Google apps

100K+ applications, billions of global users

LiteRT Highlights

Deploy via LiteRT

Streamline your deep learning workflow from training to on-device deployment.
Use pre-trained .tflite models, or convert PyTorch, JAX, or TensorFlow models to .tflite.
Use the LiteRT optimization toolkit to quantize your models post-training.
Deploy your model with LiteRT and pick the optimal accelerator for your app.
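The quantization step above shrinks a trained model by storing weights as 8-bit integers instead of 32-bit floats. As a minimal sketch of the idea (not a LiteRT API — the helper names here are illustrative), the affine int8 scheme maps a float range onto [-128, 127] with a scale and zero point, and reconstruction error stays bounded by roughly one scale step:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Illustrative affine (asymmetric) post-training quantization of
    float32 weights to int8. Returns the int8 tensor plus the
    (scale, zero_point) pair needed to recover approximate floats."""
    w_min, w_max = float(weights.min()), float(weights.max())
    # Extend the range to include 0.0 so that zero (e.g. padding)
    # is exactly representable after quantization.
    w_min, w_max = min(w_min, 0.0), max(w_max, 0.0)
    scale = (w_max - w_min) / 255.0 or 1.0  # guard all-zero weights
    zero_point = int(round(-128 - w_min / scale))
    q = np.clip(np.round(weights / scale) + zero_point, -128, 127)
    return q.astype(np.int8), scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Map int8 values back to approximate float32 weights."""
    return (q.astype(np.float32) - zero_point) * scale

# Round trip: 4x storage savings, with error bounded by ~one scale step.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, scale, zp = quantize_int8(w)
w_hat = dequantize(q, scale, zp)
```

Real toolkits layer per-channel scales, calibration data, and operator fusion on top of this, but the scale/zero-point arithmetic is the core trade-off: smaller, faster models for a small, bounded loss of precision.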

Choose Your Development Path

Use LiteRT to deploy AI anywhere—from high-performance mobile apps to resource-constrained IoT devices.
Transitioning to LiteRT to leverage enhanced performance and unified APIs across platforms (Android, Desktop, Web).
Bringing a PyTorch model to on-device vision or audio experiences.
Creating sophisticated on-device chatbots using optimized open-weight GenAI models such as Gemma.
Authoring custom models or performing deep hardware-specific CPU/GPU/NPU optimizations for peak performance.

Samples, models, and demos

Complete, end-to-end sample apps.
Pre-trained, out-of-the-box GenAI models.
A gallery that showcases on-device ML/GenAI use cases using LiteRT.

Blogs and Announcements

Stay up to date with the latest announcements, technical deep dives, and performance benchmarks from the LiteRT team.
Google's unified on-device ML framework, evolving from TFLite for high-performance deployment.
Expanding NPU acceleration support to MediaTek chipsets for high-efficiency AI.
Unlocking breakthrough performance for generative AI on Qualcomm Neural Processing Units.
Introducing the CompiledModel API for automated hardware selection and async execution.
Deploy language models on wearables and browser-based platforms using LiteRT-LM.
Latest insights on RAG, multimodality, and function calling for edge language models.

Join the Community

Contribute directly to the project and collaborate with core developers.
Access optimized open-weight models on the Hugging Face Hub.
Ready to take your on-device ML to the next level? Explore the documentation and start building today.