TensorFlow Lite overview

TensorFlow Lite enables on-device machine learning (ODML) for mobile and embedded devices. You can find ready-to-run TensorFlow Lite models for a wide range of ML/AI tasks, or convert TensorFlow, PyTorch, and JAX models to the TFLite format and run them using the AI Edge conversion and optimization tools.

Key features

  • Optimized for on-device machine learning: TensorFlow Lite addresses five key ODML constraints: latency (there's no round-trip to a server), privacy (no personal data leaves the device), connectivity (internet connectivity is not required), size (reduced model and binary size), and power consumption (efficient inference and no network connections).

  • Multi-platform support: Compatible with Android and iOS devices, embedded Linux, and microcontrollers.

  • Multi-framework model options: AI Edge provides tools to convert models from TensorFlow, PyTorch, and JAX into the FlatBuffers format (.tflite), enabling you to use a wide range of state-of-the-art models on TF Lite. You also have access to model optimization tools that can handle quantization and metadata (a quantization sketch follows this list).

  • Diverse language support: Includes SDKs for Java/Kotlin, Swift, Objective-C, C++, and Python.

  • High performance: Hardware acceleration through specialized delegates, such as the GPU delegate and the iOS Core ML delegate.
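
As a concrete illustration of the optimization tools, the following is a minimal sketch of post-training quantization with the Python TFLiteConverter API; the SavedModel path is a placeholder:

    import tensorflow as tf

    # Load a TensorFlow SavedModel (path is a placeholder).
    converter = tf.lite.TFLiteConverter.from_saved_model("path/to/saved_model")

    # Enable default post-training quantization to shrink the model.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]

    # Convert to the FlatBuffers (.tflite) format and write it to disk.
    tflite_quant_model = converter.convert()
    with open("model_quant.tflite", "wb") as f:
        f.write(tflite_quant_model)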

Development workflow

The TensorFlow Lite development workflow involves identifying an ML/AI problem, choosing a model that solves that problem, and implementing the model on-device. The following steps walk you through the workflow and provide links to further instructions.

1. Identify the most suitable solution to the ML problem

TensorFlow Lite offers a high level of flexibility and customizability for solving machine learning problems, making it a good fit if you require a specific model or a specialized implementation. Users looking for plug-and-play solutions may prefer MediaPipe Tasks, which provides ready-made solutions for common machine learning tasks like object detection, text classification, and LLM inference.

Choose one of the following AI Edge frameworks:

  • TensorFlow Lite: Flexible and customizable runtime that can run a wide range of models. Choose a model for your use case, convert it to the TensorFlow Lite format (if necessary), and run it on-device. If you intend to use TensorFlow Lite, keep reading.
  • MediaPipe Tasks: Plug-and-play solutions with default models that allow for customization. Choose the task that solves your AI/ML problem, and implement it on multiple platforms. If you intend to use MediaPipe Tasks, refer to the MediaPipe Tasks documentation.

2. Choose a model

A TensorFlow Lite model is represented in an efficient portable format known as FlatBuffers, which uses the .tflite file extension.

You can use a TensorFlow Lite model in the following ways:

  • Use an existing TensorFlow Lite model: The simplest approach is to use a TensorFlow Lite model already in the .tflite format. These models do not require any added conversion steps. You can find TensorFlow Lite models on Kaggle Models.

  • Convert a model into a TensorFlow Lite model: You can use the TensorFlow converter, PyTorch converter, or JAX converter to convert models to the FlatBuffers format (.tflite) and run them in TensorFlow Lite. To get started, you can find models to convert on model repositories such as Kaggle Models (a minimal conversion sketch follows this list).
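
As a starting point, here is a minimal sketch of converting an in-memory Keras model with the Python converter API; the tiny model is only a stand-in for your own trained model:

    import tensorflow as tf

    # A stand-in Keras model; substitute your own trained model.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

    # Convert the Keras model to the .tflite FlatBuffers format.
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    tflite_model = converter.convert()
    with open("model.tflite", "wb") as f:
        f.write(tflite_model)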

A TensorFlow Lite model can optionally include metadata that contains human-readable model descriptions and machine-readable data for automatic generation of pre- and post-processing pipelines during on-device inference. Refer to Add metadata for more details.
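
As an illustration, the tflite-support metadata writers can attach metadata to a model; this sketch assumes an image classifier with existing model.tflite and labels.txt files, and the normalization values are placeholders:

    from tflite_support.metadata_writers import image_classifier
    from tflite_support.metadata_writers import writer_utils

    # Create a metadata writer for an image classifier
    # (file names and normalization values are placeholders).
    writer = image_classifier.MetadataWriter.create_for_inference(
        writer_utils.load_file("model.tflite"),
        [127.5],          # input normalization mean (placeholder)
        [127.5],          # input normalization std (placeholder)
        ["labels.txt"],   # label file packed into the model
    )

    # Write out the model with metadata populated.
    writer_utils.save_file(writer.populate(), "model_with_metadata.tflite")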

3. Integrate the model into your app

You can implement your TensorFlow Lite models to run inference completely on-device on web, embedded, and mobile devices. TensorFlow Lite provides APIs for Python, Java and Kotlin for Android, Swift for iOS, and C++ for microcontrollers.
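
For example, running inference in Python with the Interpreter API looks roughly like this; the model path and the random input are placeholders:

    import numpy as np
    import tensorflow as tf

    # Load the model and allocate its tensors.
    interpreter = tf.lite.Interpreter(model_path="model.tflite")
    interpreter.allocate_tensors()

    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    # Feed a dummy input that matches the model's expected shape and dtype.
    input_data = np.array(
        np.random.random_sample(input_details[0]["shape"]),
        dtype=input_details[0]["dtype"])
    interpreter.set_tensor(input_details[0]["index"], input_data)

    # Run inference and read the output tensor.
    interpreter.invoke()
    output = interpreter.get_tensor(output_details[0]["index"])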

Use the following guides to implement a TensorFlow Lite model on your preferred platform:

  • Run on Android: Run models on Android devices using the Java/Kotlin APIs.
  • Run on iOS: Run models on iOS devices using the Swift APIs.
  • Run on Micro: Run models on embedded devices using the C++ APIs.

On Android and iOS devices, you can improve performance using hardware acceleration. On either platform you can use a GPU Delegate, and on iOS you can use the Core ML Delegate. To add support for new hardware accelerators, you can define your own delegate.
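
In Python, for example, a delegate can be attached when the interpreter is created; the shared-library name below is illustrative and varies by platform:

    import tensorflow as tf

    # Load a delegate from a platform-specific shared library
    # (the library name here is illustrative).
    gpu_delegate = tf.lite.experimental.load_delegate(
        "libtensorflowlite_gpu_delegate.so")

    # Pass the delegate so that supported ops run on the accelerator.
    interpreter = tf.lite.Interpreter(
        model_path="model.tflite",
        experimental_delegates=[gpu_delegate],
    )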

You can run inference in the following ways based on the model type:

  • Models without metadata: Use the TensorFlow Lite Interpreter API, which is supported across multiple platforms and languages.
  • Models with metadata: You can use out-of-box APIs from the TensorFlow Lite Task Library or build custom inference pipelines with the TensorFlow Lite Support Library (see the sketch after this list).
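
For a model with metadata, the Task Library's out-of-box APIs keep the pipeline short; this sketch classifies an image, with the model and image paths as placeholders:

    from tflite_support.task import vision

    # Create a classifier from a metadata-populated model (path is a placeholder).
    classifier = vision.ImageClassifier.create_from_file(
        "model_with_metadata.tflite")

    # Load an image and run classification.
    image = vision.TensorImage.create_from_file("image.jpg")
    result = classifier.classify(image)
    print(result)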

Next steps

New users should get started with the TensorFlow Lite quickstart. For specific information, see the following sections:

  • Model conversion
  • Platform guides