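LiteRT in Google Play services Java (and Kotlin) API

In addition to the Native API, LiteRT in Google Play services can be accessed through Java APIs, which can be called from Java or Kotlin code. Specifically, LiteRT in Google Play services is available through the LiteRT Interpreter API.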

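Using the Interpreter API

The LiteRT Interpreter API, provided by the TensorFlow runtime, is a general-purpose interface for building and running machine learning models. Follow the steps below to run inference with the Interpreter API using the LiteRT (TensorFlow Lite) runtime in Google Play services.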

1. Add project dependencies

Add the following dependencies to your app's build.gradle file to access the Play services API for LiteRT:

dependencies {
...
    // LiteRT dependencies for Google Play services
    implementation 'com.google.android.gms:play-services-tflite-java:16.1.0'
    // Optional: include LiteRT Support Library
    implementation 'com.google.android.gms:play-services-tflite-support:16.1.0'
...
}

2. Add initialization of LiteRT

Initialize the LiteRT component of the Google Play services API before using the LiteRT APIs:

Kotlin

val initializeTask: Task<Void> by lazy { TfLite.initialize(this) }

Java

Task<Void> initializeTask = TfLite.initialize(context);

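3. Create an interpreter and set the runtime option

Create an interpreter with InterpreterApi.create() and configure it to use the Google Play services runtime by calling InterpreterApi.Options.setRuntime(), as shown in the following example code: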

Kotlin

import org.tensorflow.lite.InterpreterApi
import org.tensorflow.lite.InterpreterApi.Options.TfLiteRuntime
...
private lateinit var interpreter: InterpreterApi
...
initializeTask.addOnSuccessListener {
  val interpreterOption =
    InterpreterApi.Options().setRuntime(TfLiteRuntime.FROM_SYSTEM_ONLY)
  interpreter = InterpreterApi.create(
    modelBuffer,
    interpreterOption
  )}
  .addOnFailureListener { e ->
    Log.e("Interpreter", "Cannot initialize interpreter", e)
  }

Java

import org.tensorflow.lite.InterpreterApi;
import org.tensorflow.lite.InterpreterApi.Options.TfLiteRuntime;
...
private InterpreterApi interpreter;
...
initializeTask.addOnSuccessListener(a -> {
    interpreter = InterpreterApi.create(modelBuffer,
      new InterpreterApi.Options().setRuntime(TfLiteRuntime.FROM_SYSTEM_ONLY));
  })
  .addOnFailureListener(e -> {
    Log.e("Interpreter", String.format("Cannot initialize interpreter: %s",
          e.getMessage()));
  });

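Prefer the implementation above, because it avoids blocking the Android user interface thread. If you need to manage thread execution more closely, you can add a Tasks.await() call before creating the interpreter: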

Kotlin

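import androidx.lifecycle.lifecycleScope
import kotlinx.coroutines.tasks.await // Task.await() extension from kotlinx-coroutines-play-services
...
lifecycleScope.launchWhenStarted { // uses coroutine
  initializeTask.await()
}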

Java

@BackgroundThread
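InterpreterApi initializeInterpreter() throws ExecutionException, InterruptedException {
    // Blocks the calling thread until initialization completes; do not call on the UI thread.
    Tasks.await(initializeTask);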
    return InterpreterApi.create(...);
}

4. Run inference

Using the interpreter object you created, call the run() method to generate an inference result.

Kotlin

interpreter.run(inputBuffer, outputBuffer)

Java

interpreter.run(inputBuffer, outputBuffer);
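
The snippets above do not show how inputBuffer and outputBuffer are created. The following is a minimal sketch of one way to prepare them in plain Java, assuming a float32 model with a 1x224x224x3 input and a 1x1000 output. The shapes, the InferenceBuffers class, and the allocateFloatInput helper are illustrative assumptions, not part of the LiteRT API; read the real shapes from your own model. ByteBuffer inputs are expected to be direct and in native byte order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class InferenceBuffers {
    private static final int BYTES_PER_FLOAT = 4;

    /** Allocates a direct, native-order buffer sized for a float32 tensor of the given shape. */
    static ByteBuffer allocateFloatInput(int... shape) {
        int elements = 1;
        for (int dim : shape) {
            elements *= dim;
        }
        return ByteBuffer.allocateDirect(elements * BYTES_PER_FLOAT)
                .order(ByteOrder.nativeOrder());
    }

    public static void main(String[] args) {
        // Assumed shapes, for illustration only: a 1x224x224x3 float32 image input
        // and a 1x1000 float32 score output.
        ByteBuffer inputBuffer = allocateFloatInput(1, 224, 224, 3);
        float[][] outputBuffer = new float[1][1000];

        // On Android, with an initialized interpreter, the call would then be:
        // interpreter.run(inputBuffer, outputBuffer);
        System.out.println(inputBuffer.capacity() + " bytes, direct=" + inputBuffer.isDirect()
                + ", output rows=" + outputBuffer.length);
    }
}
```

Fill inputBuffer with your preprocessed data (for example with putFloat) and rewind it before calling run().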

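Hardware acceleration

LiteRT lets you improve model performance by using specialized hardware processors, such as graphics processing units (GPUs). You take advantage of these processors through hardware drivers called delegates.

The GPU delegate is provided through Google Play services and is loaded dynamically, just like the Play services version of the Interpreter API.

Checking device compatibility

Not all devices support GPU hardware acceleration with LiteRT. To reduce errors and potential crashes, use the TfLiteGpu.isGpuDelegateAvailable method to check whether a device is compatible with the GPU delegate.

Use this method to confirm whether a device supports the GPU delegate, and fall back to the CPU when it does not.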

useGpuTask = TfLiteGpu.isGpuDelegateAvailable(context)

Once you have a variable such as useGpuTask, you can use it to decide whether the device should use the GPU delegate.

Kotlin

val interpreterTask = useGpuTask.continueWith { task ->
  val interpreterOptions = InterpreterApi.Options()
      .setRuntime(TfLiteRuntime.FROM_SYSTEM_ONLY)
  if (task.result) {
      interpreterOptions.addDelegateFactory(GpuDelegateFactory())
  }
  InterpreterApi.create(FileUtil.loadMappedFile(context, MODEL_PATH), interpreterOptions)
}

Java

Task<InterpreterApi.Options> interpreterOptionsTask = useGpuTask.continueWith(task -> {
  InterpreterApi.Options options =
      new InterpreterApi.Options().setRuntime(TfLiteRuntime.FROM_SYSTEM_ONLY);
  if (task.getResult()) {
     options.addDelegateFactory(new GpuDelegateFactory());
  }
  return options;
});
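
The fallback decision itself can be isolated and exercised off-device. The sketch below is a plain-Java illustration only: CompletableFuture stands in for the Play services Task type, and the Backend enum and chooseBackend method are hypothetical names that are not part of LiteRT. It also makes one design choice that the snippets above leave open: if the availability check itself fails, it falls back to the CPU rather than propagating the error.

```java
import java.util.concurrent.CompletableFuture;

public class DelegateFallbackSketch {
    /** Hypothetical label for the chosen accelerator; not a LiteRT type. */
    enum Backend { GPU, CPU }

    /**
     * Maps an asynchronous availability result to a backend.
     * A failed check is treated the same as "GPU delegate not available".
     */
    static CompletableFuture<Backend> chooseBackend(CompletableFuture<Boolean> gpuAvailable) {
        return gpuAvailable
                .exceptionally(error -> false)
                .thenApply(available -> available ? Backend.GPU : Backend.CPU);
    }

    public static void main(String[] args) {
        CompletableFuture<Boolean> failedCheck = new CompletableFuture<>();
        failedCheck.completeExceptionally(new IllegalStateException("availability check failed"));

        System.out.println(chooseBackend(CompletableFuture.completedFuture(true)).join());
        System.out.println(chooseBackend(CompletableFuture.completedFuture(false)).join());
        System.out.println(chooseBackend(failedCheck).join());
    }
}
```

In an Android app, the same shape applies to the real Task: add GpuDelegateFactory to the options only when the check succeeded and returned true, and create a CPU-only interpreter otherwise.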

GPU with the Interpreter API

To use the GPU delegate with the Interpreter API:

Update the project dependencies to use the GPU delegate from Play services:

implementation 'com.google.android.gms:play-services-tflite-gpu:16.2.0'

Enable the GPU delegate option in the LiteRT initialization:

Kotlin

TfLite.initialize(context,
  TfLiteInitializationOptions.builder()
    .setEnableGpuDelegateSupport(true)
    .build())

Java

TfLite.initialize(context,
  TfLiteInitializationOptions.builder()
    .setEnableGpuDelegateSupport(true)
    .build());

Enable the GPU delegate in the interpreter options: set the delegate factory to GpuDelegateFactory by calling addDelegateFactory() on InterpreterApi.Options():

Kotlin

val interpreterOption = InterpreterApi.Options()
  .setRuntime(TfLiteRuntime.FROM_SYSTEM_ONLY)
  .addDelegateFactory(GpuDelegateFactory())

Java

InterpreterApi.Options interpreterOption = new InterpreterApi.Options()
  .setRuntime(TfLiteRuntime.FROM_SYSTEM_ONLY)
  .addDelegateFactory(new GpuDelegateFactory());

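Migrating from stand-alone LiteRT

If you plan to migrate your app from stand-alone LiteRT to the Play services API, review the following additional guidance for updating your app project code:

Review the Limitations section to make sure your use case is supported.
Before updating your code, we recommend running performance and accuracy checks on your models, particularly if you are using a version of LiteRT (TF Lite) earlier than 2.1, so that you have a baseline to compare against the new implementation.
Once you have migrated all of your code to the Play services API for LiteRT, remove the existing LiteRT runtime library dependencies (entries with org.tensorflow:tensorflow-lite:*) from your build.gradle file to reduce your app size.
Identify every place in your code that creates a new Interpreter object, and modify each one to use the InterpreterApi.create() call. The new TfLite.initialize is asynchronous, so in most cases it is not a drop-in replacement: you must register a listener that is notified when the call completes. See the code snippets in step 3 above.
Add import org.tensorflow.lite.InterpreterApi; and import org.tensorflow.lite.InterpreterApi.Options.TfLiteRuntime; to every source file that uses the org.tensorflow.lite.Interpreter or org.tensorflow.lite.InterpreterApi classes.
If any of the resulting calls to InterpreterApi.create() has only a single argument, append new InterpreterApi.Options() to the argument list.
Append .setRuntime(TfLiteRuntime.FROM_SYSTEM_ONLY) to the last argument of every call to InterpreterApi.create().
Replace all other occurrences of the org.tensorflow.lite.Interpreter class with org.tensorflow.lite.InterpreterApi.

If you want to use stand-alone LiteRT and the Play services API side by side, you must use LiteRT (TF Lite) version 2.9 or later. LiteRT (TF Lite) versions 2.8 and earlier are not compatible with the Play services API version.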