The Android ecosystem encompasses a wide range of devices with diverse neural processing units (NPUs). Leveraging these specialized NPUs can significantly accelerate LiteRT (TFLite) model inference and reduce energy consumption compared to CPU or GPU execution, enhancing the user experience in your applications.
LiteRT delegates let your app use the specific accelerator hardware on each user's device. These delegates are provided by the chip vendors that manufacture the NPUs.
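Regardless of vendor, a delegate plugs into the interpreter through the same LiteRT API. The Kotlin sketch below shows that generic pattern using the stable `Interpreter.Options().addDelegate()` call; `createInterpreter` and the nullable `npuDelegate` parameter are illustrative names, not part of the API.

```kotlin
import org.tensorflow.lite.Delegate
import org.tensorflow.lite.Interpreter
import java.nio.MappedByteBuffer

// Generic pattern: every vendor NPU delegate implements
// org.tensorflow.lite.Delegate, so it plugs into the interpreter the
// same way regardless of which chip vendor provides it.
fun createInterpreter(model: MappedByteBuffer, npuDelegate: Delegate?): Interpreter {
    val options = Interpreter.Options()
    if (npuDelegate != null) {
        // Ops the delegate supports run on the NPU; anything it cannot
        // handle falls back to the interpreter's default CPU kernels.
        options.addDelegate(npuDelegate)
    }
    return Interpreter(model, options)
}
```

Concrete delegate classes usually hold native resources, so close the delegate after you close the interpreter that uses it.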
Qualcomm Neural Network (QNN) Delegate
The Qualcomm Neural Network (QNN) delegate lets your app run LiteRT models on the Qualcomm AI Engine Direct runtime. The delegate is backed by Qualcomm's Neural Network API.
The QNN delegate is available on Maven Central. For more information, see the Qualcomm Neural Network documentation.
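For orientation, a dependency declaration might look like the sketch below. The group, artifact, and version strings are assumptions based on Qualcomm's naming conventions, not verified coordinates; take the exact values from the Qualcomm Neural Network documentation.

```kotlin
// build.gradle.kts -- a sketch only; the coordinates below are assumed
// placeholders. Check Qualcomm's documentation for the real values.
dependencies {
    // Qualcomm AI Engine Direct (QNN) runtime plus its LiteRT delegate.
    implementation("com.qualcomm.qti:qnn-runtime:x.y.z")
    implementation("com.qualcomm.qti:qnn-litert-delegate:x.y.z")
}
```

Once the delegate library is on the classpath, it attaches to the interpreter through the same `addDelegate()` pattern shown above.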
Coming soon
We look forward to supporting delegates from the following vendors in the coming months:
- Google Pixel
- MediaTek
- Samsung System LSI
Stay tuned for updates and further instructions on using these delegates to harness the power of NPUs in your LiteRT models.