The Android ecosystem encompasses a wide range of devices with diverse neural processing units (NPUs). Leveraging these specialized NPUs can significantly accelerate LiteRT (TFLite) model inference and reduce energy consumption compared to CPU or GPU execution, enhancing the user experience in your applications.
LiteRT delegates let your app use the specific accelerator hardware on each user's device. These delegates are provided by the chip vendors that manufacture the NPUs.
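Regardless of vendor, a delegate plugs into the interpreter through the same LiteRT API. The Kotlin sketch below shows that generic pattern using the stable `Interpreter.Options().addDelegate()` call; `createInterpreter` and the nullable `npuDelegate` parameter are illustrative names, not part of the API.

```kotlin
import org.tensorflow.lite.Delegate
import org.tensorflow.lite.Interpreter
import java.nio.MappedByteBuffer

// Generic pattern: every vendor NPU delegate implements
// org.tensorflow.lite.Delegate, so it plugs into the interpreter the
// same way regardless of which chip vendor provides it.
fun createInterpreter(model: MappedByteBuffer, npuDelegate: Delegate?): Interpreter {
    val options = Interpreter.Options()
    if (npuDelegate != null) {
        // Ops the delegate supports run on the NPU; anything it cannot
        // handle falls back to the interpreter's default CPU kernels.
        options.addDelegate(npuDelegate)
    }
    return Interpreter(model, options)
}
```

Concrete delegate classes usually hold native resources, so close the delegate after you close the interpreter that uses it.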
Qualcomm Neural Network (QNN) Delegate
The Qualcomm Neural Network (QNN) delegate lets your app run LiteRT models on the Qualcomm AI Engine Direct runtime. The delegate is backed by Qualcomm's Neural Network API.
The QNN delegate is available on Maven Central. For more information, see the Qualcomm Neural Network documentation.
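For orientation, a dependency declaration might look like the sketch below. The group, artifact, and version strings are assumptions based on Qualcomm's naming conventions, not verified coordinates; take the exact values from the Qualcomm Neural Network documentation.

```kotlin
// build.gradle.kts -- a sketch only; the coordinates below are assumed
// placeholders. Check Qualcomm's documentation for the real values.
dependencies {
    // Qualcomm AI Engine Direct (QNN) runtime plus its LiteRT delegate.
    implementation("com.qualcomm.qti:qnn-runtime:x.y.z")
    implementation("com.qualcomm.qti:qnn-litert-delegate:x.y.z")
}
```

Once the delegate library is on the classpath, it attaches to the interpreter through the same `addDelegate()` pattern shown above.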
Coming soon
We look forward to supporting delegates from the following vendors in the coming months:
- Google Pixel
- MediaTek
- Samsung System LSI
Stay tuned for updates and further instructions on using these delegates to harness the power of NPUs in your LiteRT models.