
GPU acceleration delegate with the Interpreter API

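Using graphics processing units (GPUs) to run your machine learning (ML) models can dramatically improve the performance and user experience of your ML-enabled applications. On Android devices, you can enable GPU acceleration using a delegate and one of the following APIs: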

Interpreter API - this guide
Native (C/C++) API - guide

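This page describes how to enable GPU acceleration for LiteRT models in Android apps using the Interpreter API. For more information about using the GPU delegate for LiteRT, including best practices and advanced techniques, see the GPU delegates page.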

Use GPU with LiteRT with Google Play services

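The LiteRT Interpreter API provides a set of general-purpose APIs for building machine learning applications. This section describes how to use the GPU accelerator delegate with these APIs when running LiteRT with Google Play services.

LiteRT with Google Play services is the recommended path for using LiteRT on Android. If your application targets devices that do not run Google Play, see the section on using the GPU with the Interpreter API and standalone LiteRT.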

Add project dependencies (with a .toml version catalog)

Update your project's libs.versions.toml file

[libraries]
...
tflite-gpu = { module = "com.google.ai.edge.litert:litert-gpu", version = "2.X.Y" }
tflite-gpu-api = { module = "com.google.ai.edge.litert:litert-gpu-api", version = "2.X.Y" }
...

Add the project dependencies in the app's build.gradle.kts

dependencies {
  ...
  implementation(libs.tflite.gpu)
  implementation(libs.tflite.gpu.api)
  ...
}

Add project dependencies

To enable access to the GPU delegate, add com.google.android.gms:play-services-tflite-gpu to your app's build.gradle file:

dependencies {
    ...
    implementation 'com.google.android.gms:play-services-tflite-java:16.4.0'
    implementation 'com.google.android.gms:play-services-tflite-gpu:16.4.0'
}

Enable GPU acceleration

Then initialize LiteRT with Google Play services with GPU support enabled:

Kotlin

val useGpuTask = TfLiteGpu.isGpuDelegateAvailable(context)

val interpreterTask = useGpuTask.continueWith { useGpuTask ->
  TfLite.initialize(context,
      TfLiteInitializationOptions.builder()
      .setEnableGpuDelegateSupport(useGpuTask.result)
      .build())
  }

Java

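// Generic type arguments must be reference types, so use Boolean, not boolean.
Task<Boolean> useGpuTask = TfLiteGpu.isGpuDelegateAvailable(context);

// continueWithTask flattens the Task returned by TfLite.initialize,
// and the availability result decides whether GPU support is enabled.
Task<Void> initializeTask = useGpuTask.continueWithTask(task ->
  TfLite.initialize(context,
    TfLiteInitializationOptions.builder()
      .setEnableGpuDelegateSupport(task.getResult())
      .build()));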
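
The initialization above chains two asynchronous steps: the availability check resolves first, and its result feeds the initialization call. As an analogy only, the sketch below reproduces that shape with the standard library's CompletableFuture rather than the Play services Task API. The names isGpuAvailable, initialize, and initRuntime are hypothetical stand-ins that return canned values; they are not LiteRT calls.

```java
import java.util.concurrent.CompletableFuture;

public class GpuInitSketch {
    // Hypothetical stand-in for the asynchronous GPU availability check.
    public static CompletableFuture<Boolean> isGpuAvailable(boolean deviceHasGpu) {
        return CompletableFuture.supplyAsync(() -> deviceHasGpu);
    }

    // Hypothetical stand-in for runtime initialization; the flag plays the
    // role of the "enable GPU delegate support" option.
    public static CompletableFuture<String> initialize(boolean enableGpu) {
        return CompletableFuture.supplyAsync(
            () -> enableGpu ? "initialized with GPU support" : "initialized CPU-only");
    }

    // thenCompose runs the second step once the first completes and flattens
    // the nested future, so callers see a single CompletableFuture<String>.
    public static CompletableFuture<String> initRuntime(boolean deviceHasGpu) {
        return isGpuAvailable(deviceHasGpu).thenCompose(GpuInitSketch::initialize);
    }

    public static void main(String[] args) {
        System.out.println(initRuntime(true).join());  // prints "initialized with GPU support"
        System.out.println(initRuntime(false).join()); // prints "initialized CPU-only"
    }
}
```

The point of the sketch is the data flow: the GPU flag passed to initialization comes from the availability result instead of being hard-coded, so devices without a supported GPU still initialize successfully.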

Finally, you can initialize the interpreter by passing a GpuDelegateFactory through InterpreterApi.Options:

Kotlin

    val options = InterpreterApi.Options()
      .setRuntime(TfLiteRuntime.FROM_SYSTEM_ONLY)
      .addDelegateFactory(GpuDelegateFactory())

    // InterpreterApi is an interface; instances come from its create() factory.
    val interpreter = InterpreterApi.create(model, options)

    // Run inference
    writeToInput(input)
    interpreter.run(input, output)
    readFromOutput(output)

Java

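    InterpreterApi.Options options = new InterpreterApi.Options()
      .setRuntime(TfLiteRuntime.FROM_SYSTEM_ONLY)
      .addDelegateFactory(new GpuDelegateFactory());

    // InterpreterApi is an interface; instances come from its create() factory.
    InterpreterApi interpreter = InterpreterApi.create(model, options);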

    // Run inference
    writeToInput(input);
    interpreter.run(input, output);
    readFromOutput(output);

The GPU delegate can also be used with ML model binding in Android Studio. For more information, see Generate model interfaces using metadata.

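Use GPU with standalone LiteRT

If your application targets devices that do not run Google Play, you can bundle the GPU delegate with your application and use it with the standalone version of LiteRT.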

Add project dependencies

To enable access to the GPU delegate, add com.google.ai.edge.litert:litert-gpu-delegate-plugin to your app's build.gradle file:

dependencies {
    ...
    implementation 'com.google.ai.edge.litert:litert'
    implementation 'com.google.ai.edge.litert:litert-gpu'
    implementation 'com.google.ai.edge.litert:litert-gpu-api'
}

Enable GPU acceleration

Then run LiteRT on the GPU with TfLiteDelegate. In Java, you can specify the GpuDelegate through Interpreter.Options.

Kotlin

      import org.tensorflow.lite.Interpreter
      import org.tensorflow.lite.gpu.CompatibilityList
      import org.tensorflow.lite.gpu.GpuDelegate

      val compatList = CompatibilityList()

      val options = Interpreter.Options().apply{
          if(compatList.isDelegateSupportedOnThisDevice){
              // if the device has a supported GPU, add the GPU delegate
              val delegateOptions = compatList.bestOptionsForThisDevice
              this.addDelegate(GpuDelegate(delegateOptions))
          } else {
              // if the GPU is not supported, run on 4 threads
              this.setNumThreads(4)
          }
      }

      val interpreter = Interpreter(model, options)

      // Run inference
      writeToInput(input)
      interpreter.run(input, output)
      readFromOutput(output)

Java

      import org.tensorflow.lite.Interpreter;
      import org.tensorflow.lite.gpu.CompatibilityList;
      import org.tensorflow.lite.gpu.GpuDelegate;

      // Initialize interpreter with GPU delegate
      Interpreter.Options options = new Interpreter.Options();
      CompatibilityList compatList = new CompatibilityList();

      if(compatList.isDelegateSupportedOnThisDevice()){
          // if the device has a supported GPU, add the GPU delegate
          GpuDelegate.Options delegateOptions = compatList.getBestOptionsForThisDevice();
          GpuDelegate gpuDelegate = new GpuDelegate(delegateOptions);
          options.addDelegate(gpuDelegate);
      } else {
          // if the GPU is not supported, run on 4 threads
          options.setNumThreads(4);
      }

      Interpreter interpreter = new Interpreter(model, options);

      // Run inference
      writeToInput(input);
      interpreter.run(input, output);
      readFromOutput(output);

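Quantized models

By default, the Android GPU delegate libraries support quantized models. You do not have to make any code changes to use quantized models with the GPU delegate. The following section explains how to disable quantized model support for testing or experimental purposes.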

Disable quantized model support

The following code shows how to disable support for quantized models.

Java

GpuDelegate delegate = new GpuDelegate(new GpuDelegate.Options().setQuantizedModelsAllowed(false));

Interpreter.Options options = (new Interpreter.Options()).addDelegate(delegate);

For more information about running quantized models with GPU acceleration, see the GPU delegate overview.