Introducing LiteRT: Google's high-performance runtime for on-device AI, formerly known as TensorFlow Lite.

LiteRT in Google Play services Java (and Kotlin) API

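In addition to the native API, LiteRT in Google Play services can be accessed through Java APIs, which can be called from Java or Kotlin code. Specifically, LiteRT in Google Play services is available through the LiteRT Interpreter API.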

Using the Interpreter API

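The LiteRT Interpreter API, provided by the TensorFlow runtime, offers a general-purpose interface for building and running machine learning models. Follow the steps below to run inference with the Interpreter API using the LiteRT (TensorFlow Lite) runtime in Google Play services.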

1. Add project dependencies

Add the following dependencies to your app project code to access the Play services API for LiteRT:

dependencies {
...
    // LiteRT dependencies for Google Play services
    implementation 'com.google.android.gms:play-services-tflite-java:16.1.0'
    // Optional: include LiteRT Support Library
    implementation 'com.google.android.gms:play-services-tflite-support:16.1.0'
...
}

2. Add initialization of LiteRT

Initialize the LiteRT component of the Google Play services API before using the LiteRT APIs:

Kotlin

val initializeTask: Task<Void> by lazy { TfLite.initialize(this) }

Java

Task<Void> initializeTask = TfLite.initialize(context);

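3. Create an interpreter and set the runtime option

Create an interpreter using InterpreterApi.create() and configure it to use the Google Play services runtime by calling InterpreterApi.Options.setRuntime(), as shown in the following example code: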

Kotlin

import org.tensorflow.lite.InterpreterApi
import org.tensorflow.lite.InterpreterApi.Options.TfLiteRuntime
...
private lateinit var interpreter: InterpreterApi
...
// Create the interpreter only after Play services has finished initializing LiteRT.
initializeTask
  .addOnSuccessListener {
    val interpreterOption =
      InterpreterApi.Options().setRuntime(TfLiteRuntime.FROM_SYSTEM_ONLY)
    interpreter = InterpreterApi.create(modelBuffer, interpreterOption)
  }
  .addOnFailureListener { e ->
    Log.e("Interpreter", "Cannot initialize interpreter", e)
  }

Java

import org.tensorflow.lite.InterpreterApi;
import org.tensorflow.lite.InterpreterApi.Options.TfLiteRuntime;
...
private InterpreterApi interpreter;
...
// Create the interpreter only after Play services has finished initializing LiteRT.
initializeTask
  .addOnSuccessListener(a -> {
    interpreter = InterpreterApi.create(modelBuffer,
      new InterpreterApi.Options().setRuntime(TfLiteRuntime.FROM_SYSTEM_ONLY));
  })
  .addOnFailureListener(e -> {
    Log.e("Interpreter", String.format("Cannot initialize interpreter: %s",
          e.getMessage()));
  });

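Prefer the implementation above, because it avoids blocking the Android user interface thread. If you need to manage thread execution more closely, you can add a Tasks.await() call to interpreter creation: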

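Kotlin

import androidx.lifecycle.lifecycleScope
import kotlinx.coroutines.tasks.await // from kotlinx-coroutines-play-services
...
lifecycleScope.launchWhenStarted { // uses coroutine
  initializeTask.await()
}

Java

@BackgroundThread
InterpreterApi initializeInterpreter() throws ExecutionException, InterruptedException {
    // Blocks the calling thread until initialization completes; do not call on the UI thread.
    Tasks.await(initializeTask);
    return InterpreterApi.create(...);
}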

4. Run inferences

Call the run() method on the interpreter object you created to generate an inference.

Kotlin

interpreter.run(inputBuffer, outputBuffer)

Java

interpreter.run(inputBuffer, outputBuffer);
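
The run() call takes an input object and an output object. One common choice for both is a direct ByteBuffer in native byte order. The sketch below, written in plain Java with no Android dependencies, shows one way to size and fill such buffers for float tensors. The TensorBuffers class and the shapes used in main (1x224x224x3 in, 1x1000 out) are hypothetical and serve only as an illustration; take the real shapes and data types from your own model.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper, not part of the LiteRT API: prepares buffers for float tensors. */
final class TensorBuffers {
    private TensorBuffers() {}

    /** Number of elements in a tensor of the given shape, for example {1, 224, 224, 3}. */
    static int elementCount(int... shape) {
        int count = 1;
        for (int dim : shape) {
            if (dim <= 0) {
                throw new IllegalArgumentException("Dimensions must be positive: " + dim);
            }
            count = Math.multiplyExact(count, dim); // fails loudly on overflow
        }
        return count;
    }

    /** Allocates a direct, native-ordered buffer sized for a float tensor of the given shape. */
    static ByteBuffer allocateFloatTensor(int... shape) {
        int bytes = Math.multiplyExact(elementCount(shape), Float.BYTES);
        return ByteBuffer.allocateDirect(bytes).order(ByteOrder.nativeOrder());
    }

    /** Copies values into the buffer and rewinds it so it is read from the start. */
    static void fill(ByteBuffer buffer, float[] values) {
        buffer.clear();
        buffer.asFloatBuffer().put(values);
        buffer.rewind();
    }

    public static void main(String[] args) {
        // Assumed shapes for illustration only.
        ByteBuffer inputBuffer = allocateFloatTensor(1, 224, 224, 3);
        ByteBuffer outputBuffer = allocateFloatTensor(1, 1000);
        fill(inputBuffer, new float[elementCount(1, 224, 224, 3)]);
        System.out.println("input bytes: " + inputBuffer.capacity());
        System.out.println("output bytes: " + outputBuffer.capacity());
    }
}
```

With buffers prepared this way, the call in this step stays the same: interpreter.run(inputBuffer, outputBuffer).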

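Hardware acceleration

LiteRT lets you accelerate model performance using specialized hardware processors, such as graphics processing units (GPUs). You can take advantage of these specialized processors through hardware drivers called delegates.

The GPU delegate is provided through Google Play services and is loaded dynamically, just like the Play services version of the Interpreter API.

Checking device compatibility

Not all devices support GPU hardware acceleration with TFLite. To reduce errors and potential crashes, use the TfLiteGpu.isGpuDelegateAvailable method to check whether a device is compatible with the GPU delegate.

Use this method to confirm whether a device supports the GPU, and fall back to the CPU when it does not.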

useGpuTask = TfLiteGpu.isGpuDelegateAvailable(context)

Once you have a variable such as useGpuTask, you can use it to decide whether the device should use the GPU delegate.

Kotlin

val interpreterTask = useGpuTask.continueWith { task ->
  val interpreterOptions = InterpreterApi.Options()
      .setRuntime(TfLiteRuntime.FROM_SYSTEM_ONLY)
  // Add the GPU delegate only when the device supports it; otherwise run on the CPU.
  if (task.result) {
      interpreterOptions.addDelegateFactory(GpuDelegateFactory())
  }
  InterpreterApi.create(FileUtil.loadMappedFile(context, MODEL_PATH), interpreterOptions)
}

Java

Task<InterpreterApi.Options> interpreterOptionsTask = useGpuTask.continueWith(task -> {
  InterpreterApi.Options options =
      new InterpreterApi.Options().setRuntime(TfLiteRuntime.FROM_SYSTEM_ONLY);
  // Add the GPU delegate only when the device supports it; otherwise run on the CPU.
  if (task.getResult()) {
     options.addDelegateFactory(new GpuDelegateFactory());
  }
  return options;
});

GPU with the Interpreter API

To use the GPU delegate with the Interpreter API:

Update the project dependencies to use the GPU delegate from Play services:

implementation 'com.google.android.gms:play-services-tflite-gpu:16.2.0'

Enable the GPU delegate option in the TFLite initialization:

Kotlin

TfLite.initialize(context,
  TfLiteInitializationOptions.builder()
    .setEnableGpuDelegateSupport(true)
    .build())

Java

TfLite.initialize(context,
  TfLiteInitializationOptions.builder()
    .setEnableGpuDelegateSupport(true)
    .build());

Enable the GPU delegate in the interpreter options: set the delegate factory to GpuDelegateFactory by calling addDelegateFactory() on InterpreterApi.Options():

Kotlin

val interpreterOption = InterpreterApi.Options()
  .setRuntime(TfLiteRuntime.FROM_SYSTEM_ONLY)
  .addDelegateFactory(GpuDelegateFactory())

Java

InterpreterApi.Options interpreterOption = new InterpreterApi.Options()
  .setRuntime(TfLiteRuntime.FROM_SYSTEM_ONLY)
  .addDelegateFactory(new GpuDelegateFactory());

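Migrating from stand-alone LiteRT

If you plan to migrate your app from stand-alone LiteRT to the Play services API, review the following additional guidance for updating your app project code: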

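1. Review the Limitations section of this page to confirm that your use case is supported.
2. Before updating your code, run performance and accuracy checks on your models, particularly if you are using a version of LiteRT (TF Lite) earlier than 2.1, so that you have a baseline to compare against the new implementation.
3. Once you have migrated all of your code to the Play services API for LiteRT, remove the existing LiteRT runtime library dependencies (entries with org.tensorflow:tensorflow-lite:*) from your build.gradle file to reduce your app size.
4. Find every occurrence of new Interpreter object creation in your code and change each one to use the InterpreterApi.create() call. The new TfLite.initialize is asynchronous, so in most cases it is not a drop-in replacement: you must register a listener for when the call completes. See the code snippet in step 3.
5. Add import org.tensorflow.lite.InterpreterApi; and import org.tensorflow.lite.InterpreterApi.Options.TfLiteRuntime; to any source file that uses the org.tensorflow.lite.Interpreter or org.tensorflow.lite.InterpreterApi classes.
6. If any of the resulting calls to InterpreterApi.create() have only a single argument, append new InterpreterApi.Options() to the argument list.
7. Append .setRuntime(TfLiteRuntime.FROM_SYSTEM_ONLY) to the last argument of every call to InterpreterApi.create().
8. Replace all other occurrences of the org.tensorflow.lite.Interpreter class with org.tensorflow.lite.InterpreterApi.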
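
The accuracy baseline recommended in the list above can be as simple as saving a model's outputs for a fixed input before migrating, then comparing them with the outputs produced after migrating. The following plain-Java sketch shows one possible comparison. The BaselineCheck class, its method names, and the tolerance value are hypothetical and not part of any LiteRT API; an acceptable tolerance depends on your model.

```java
/** Hypothetical helper, not part of the LiteRT API: compares model outputs against a baseline. */
final class BaselineCheck {
    private BaselineCheck() {}

    /** Largest element-wise absolute difference between two outputs of equal length. */
    static float maxAbsDiff(float[] baseline, float[] candidate) {
        if (baseline.length != candidate.length) {
            throw new IllegalArgumentException(
                "Output lengths differ: " + baseline.length + " vs " + candidate.length);
        }
        float max = 0f;
        for (int i = 0; i < baseline.length; i++) {
            max = Math.max(max, Math.abs(baseline[i] - candidate[i]));
        }
        return max;
    }

    /** True when every element of candidate is within tolerance of the baseline. */
    static boolean matches(float[] baseline, float[] candidate, float tolerance) {
        return maxAbsDiff(baseline, candidate) <= tolerance;
    }

    public static void main(String[] args) {
        // Placeholder values; in practice these come from running the same input
        // through the stand-alone runtime and through the Play services runtime.
        float[] before = {0.5f, 0.25f, 0.25f};
        float[] after = {0.5f, 0.25f, 0.25f};
        System.out.println("outputs match: " + matches(before, after, 1e-5f));
    }
}
```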

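To use stand-alone LiteRT and the Play services API side by side, you must use LiteRT (TF Lite) version 2.9 or later. LiteRT (TF Lite) version 2.8 and earlier are not compatible with the Play services API version.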
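
The version rule above can be expressed as a small check, for example in a build-time or startup sanity test. This is a minimal sketch in plain Java that assumes the version string has the form major.minor[.patch]. The SideBySideVersionCheck class is hypothetical, and how you obtain the version of your stand-alone dependency is left to your build setup.

```java
/** Hypothetical helper, not part of the LiteRT API: checks the side-by-side version rule. */
final class SideBySideVersionCheck {
    // Minimum stand-alone version stated in the text above: 2.9.
    private static final int MIN_MAJOR = 2;
    private static final int MIN_MINOR = 9;

    private SideBySideVersionCheck() {}

    /** Returns true if the given stand-alone version is 2.9 or later. */
    static boolean supportsSideBySide(String version) {
        String[] parts = version.trim().split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Expected major.minor[.patch], got: " + version);
        }
        // Compare numerically so that 2.10 is treated as newer than 2.9.
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        return major > MIN_MAJOR || (major == MIN_MAJOR && minor >= MIN_MINOR);
    }

    public static void main(String[] args) {
        for (String v : new String[] {"2.8.2", "2.9.0", "2.16.1"}) {
            System.out.println(v + " -> " + supportsSideBySide(v));
        }
    }
}
```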