LiteRT operator versions

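This document describes LiteRT's op versioning schema. Op versioning lets developers add new functionality and parameters to existing ops. In addition, it guarantees the following:

Backward compatibility: A new LiteRT implementation should handle an old model file.
Forward compatibility: An old LiteRT implementation should handle a new model file produced by a new version of the converter, as long as no new features are used.
Forward incompatibility detection: If an old LiteRT implementation reads a new model that contains a new version of an op which it doesn't support, it should report an error.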

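Example: Adding dilation to depthwise convolution

The remainder of this document explains op versioning in TFLite by showing how to add dilation parameters to the depthwise convolution operation.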

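Knowledge of dilation is not required to understand this document. Note that:

2 new integer parameters will be added: dilation_width_factor and dilation_height_factor.
Old depthwise convolution kernels that don't support dilation are equivalent to setting the dilation factors to 1.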
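
To make the second point concrete, here is a minimal sketch of a dilated convolution in one dimension. It is not LiteRT code: DilatedConv1D is a hypothetical helper that assumes a single channel, stride 1, and no padding. With a dilation of 1 the filter taps are adjacent, which matches a kernel that has no notion of dilation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of LiteRT: 1-D convolution over a single
// channel with stride 1 and no padding.
std::vector<int> DilatedConv1D(const std::vector<int>& input,
                               const std::vector<int>& filter,
                               int dilation) {
  assert(dilation >= 1 && !filter.empty());
  const std::size_t step = static_cast<std::size_t>(dilation);
  // Span of input covered by the filter once gaps are inserted between taps.
  const std::size_t effective = (filter.size() - 1) * step + 1;
  std::vector<int> output;
  for (std::size_t start = 0; start + effective <= input.size(); ++start) {
    int acc = 0;
    for (std::size_t k = 0; k < filter.size(); ++k) {
      // Consecutive taps are `dilation` input elements apart.
      acc += input[start + k * step] * filter[k];
    }
    output.push_back(acc);
  }
  return output;
}
```

Calling the helper with dilation 1 sums neighboring elements, while dilation 2 skips one input element between taps and produces a shorter output.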

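Change the FlatBuffer schema

To add new parameters to an op, change the options table in lite/schema/schema.fbs.

For example, the options table of depthwise convolution looks like this: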

table DepthwiseConv2DOptions {
  padding:Padding;
  stride_w:int;
  stride_h:int;
  depth_multiplier:int;
  fused_activation_function:ActivationFunctionType;
}

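When adding new parameters:

Add comments indicating which parameters are supported by which version.
When the new implementation gets the default values for the newly added parameters, it should work exactly the same as the old implementation.

After the new parameters are added, the table looks like this: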

table DepthwiseConv2DOptions {
  // Parameters for DepthwiseConv version 1 or above.
  padding:Padding;
  stride_w:int;
  stride_h:int;
  depth_multiplier:int;
  fused_activation_function:ActivationFunctionType;
  // Parameters for DepthwiseConv version 2 or above.
  dilation_w_factor:int = 1;
  dilation_h_factor:int = 1;
}

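The file lite/schema/schema_generated.h should be regenerated for the new schema.

Change C structures and kernel implementation

In LiteRT, the kernel implementation is decoupled from the FlatBuffer definition. The kernels read their parameters from C structures defined in lite/c/builtin_op_data.h.

The original depthwise convolution parameters are as follows: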

typedef struct {
  TfLitePadding padding;
  int stride_width;
  int stride_height;
  int depth_multiplier;
  TfLiteFusedActivation activation;
} TfLiteDepthwiseConvParams;

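As with the FlatBuffer schema, add comments indicating which parameters are supported starting from which version. The result looks like this: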

typedef struct {
  // Parameters for DepthwiseConv version 1 or above.
  TfLitePadding padding;
  int stride_width;
  int stride_height;
  int depth_multiplier;
  TfLiteFusedActivation activation;
  // Parameters for DepthwiseConv version 2 or above.
  int dilation_width_factor;
  int dilation_height_factor;
} TfLiteDepthwiseConvParams;

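Also change the kernel implementation to read the newly added parameters from the C structures. The details are omitted here.

Change the FlatBuffer reading code

The logic that reads the FlatBuffer and produces the C structure is in lite/core/api/flatbuffer_conversions.cc.

Update the file to handle the new parameters, as shown below: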

TfLiteStatus ParseDepthwiseConv2D(const Operator* op,
                                  ErrorReporter* error_reporter,
                                  BuiltinDataAllocator* allocator,
                                  void** builtin_data) {
  CheckParsePointerParams(op, error_reporter, allocator, builtin_data);

  SafeBuiltinDataAllocator safe_allocator(allocator);

  std::unique_ptr<TfLiteDepthwiseConvParams,
                  SafeBuiltinDataAllocator::BuiltinDataDeleter>
      params = safe_allocator.Allocate<TfLiteDepthwiseConvParams>();
  TF_LITE_ENSURE(error_reporter, params != nullptr);

  const DepthwiseConv2DOptions* schema_params =
      op->builtin_options_as_DepthwiseConv2DOptions();

  if (schema_params != nullptr) {
    params->padding = ConvertPadding(schema_params->padding());
    params->stride_width = schema_params->stride_w();
    params->stride_height = schema_params->stride_h();
    params->depth_multiplier = schema_params->depth_multiplier();
    params->activation =
        ConvertActivation(schema_params->fused_activation_function());

    params->dilation_width_factor = schema_params->dilation_w_factor();
    params->dilation_height_factor = schema_params->dilation_h_factor();
  }

  *builtin_data = params.release();
  return kTfLiteOk;
}

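It's not required to check the op version here. When the new implementation reads an old model file in which the dilation factors are missing, it uses 1 as the default value, and the new kernel behaves consistently with the old kernel.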
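
The default-value behavior described above can be illustrated with a small, self-contained sketch. It does not use the FlatBuffers library: SerializedOptions, ReadField, DepthwiseParamsSketch, and ParseSketch are hypothetical names, and a std::map stands in for a serialized table in which an old writer never stored the dilation fields.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a serialized options table: only the fields the
// writer knew about are present.
using SerializedOptions = std::map<std::string, int>;

// Returns the stored value if the field exists, otherwise the default that
// the schema declares for it.
int ReadField(const SerializedOptions& options, const std::string& name,
              int schema_default) {
  assert(!name.empty());
  const auto it = options.find(name);
  return it != options.end() ? it->second : schema_default;
}

struct DepthwiseParamsSketch {
  int stride_width;
  int stride_height;
  int dilation_width_factor;
  int dilation_height_factor;
};

// No version check is needed: absent dilation fields fall back to 1.
DepthwiseParamsSketch ParseSketch(const SerializedOptions& options) {
  DepthwiseParamsSketch params;
  params.stride_width = ReadField(options, "stride_w", 0);
  params.stride_height = ReadField(options, "stride_h", 0);
  params.dilation_width_factor = ReadField(options, "dilation_w_factor", 1);
  params.dilation_height_factor = ReadField(options, "dilation_h_factor", 1);
  return params;
}
```

A table written without the dilation fields parses to dilation factors of 1, while a table that stores them yields the stored values.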

Change kernel registration

The MutableOpResolver (defined in lite/mutable_op_resolver.h) provides a few functions to register op kernels. The minimum and maximum versions are 1 by default:

void AddBuiltin(tflite::BuiltinOperator op, TfLiteRegistration* registration,
                int min_version = 1, int max_version = 1);
void AddCustom(const char* name, TfLiteRegistration* registration,
               int min_version = 1, int max_version = 1);

The built-in ops are registered in lite/kernels/register.cc. In this example, we implemented a new op kernel that can handle DepthwiseConv2D versions 1 and 2, so we need to change this line:

AddBuiltin(BuiltinOperator_DEPTHWISE_CONV_2D, Register_DEPTHWISE_CONV_2D());

to:

AddBuiltin(BuiltinOperator_DEPTHWISE_CONV_2D, Register_DEPTHWISE_CONV_2D(),
           /* min_version = */ 1,
           /* max_version = */ 2);
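
One way to picture what a version range means at lookup time is the sketch below. It is an assumption-laden simplification, not the MutableOpResolver implementation: KernelSketch and OpResolverSketch are hypothetical types, and ops are identified by plain integers. The kernel is stored once per version in the range, so a lookup for an unregistered version finds nothing.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for a kernel registration.
struct KernelSketch {
  std::string name;
};

class OpResolverSketch {
 public:
  // Registers one kernel for every version in [min_version, max_version].
  void AddBuiltin(int op, const KernelSketch* kernel, int min_version = 1,
                  int max_version = 1) {
    assert(min_version <= max_version);
    for (int version = min_version; version <= max_version; ++version) {
      kernels_[std::make_pair(op, version)] = kernel;
    }
  }

  // Returns nullptr when no kernel covers the requested version.
  const KernelSketch* FindOp(int op, int version) const {
    const auto it = kernels_.find(std::make_pair(op, version));
    return it != kernels_.end() ? it->second : nullptr;
  }

 private:
  std::map<std::pair<int, int>, const KernelSketch*> kernels_;
};
```

With a range of 1 to 2, lookups for versions 1 and 2 return the same kernel, and a lookup for version 3 returns nullptr, which is the point at which a runtime can report an unsupported op version.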

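Change the TFLite op version

The next step is to make TFLite populate the minimum version that is required to execute the op. In this example, it means:

Populate version=1 when the dilation factors are all 1.
Populate version=2 otherwise.

Modify the GetBuiltinOperatorVersion function in lite/tools/versioning/op_version.cc by adding the new version to the DepthwiseConv2D case: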

case BuiltinOperator_DEPTHWISE_CONV_2D: {
  auto depthwise_conv_params =
      reinterpret_cast<TfLiteDepthwiseConvParams*>(op_sig.builtin_data);
  TFLITE_DCHECK(depthwise_conv_params != nullptr);
  // Version 2 is only required when dilation is actually used.
  if (depthwise_conv_params->dilation_width_factor != 1 ||
      depthwise_conv_params->dilation_height_factor != 1) {
    return 2;
  }
  return 1;
}

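Update the operator version map

The last step is to add the new version info to the operator version map. This step is required because the model's minimum required runtime version is generated from this map.

To do this, add a new map entry in lite/tools/versioning/runtime_version.cc.

In this example, you need to add the following entry to op_version_map: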

{ {BuiltinOperator_DEPTHWISE_CONV_2D, 2}, %CURRENT_RUNTIME_VERSION%}

where %CURRENT_RUNTIME_VERSION% corresponds to the current runtime version defined in tensorflow/core/public/version.h.
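
The following sketch shows one way such a map could be used to derive a model's minimum runtime version: look up each (op, version) pair and keep the newest runtime version found. This is an illustration under assumptions, not the code in runtime_version.cc. SplitVersion, IsOlder, and MinRuntimeVersion are hypothetical helpers, and the op ids and version strings in the usage note are placeholders.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Splits a dotted version such as "2.3.0" into {2, 3, 0}.
std::vector<int> SplitVersion(const std::string& version) {
  assert(!version.empty());
  std::vector<int> parts;
  std::stringstream stream(version);
  std::string part;
  while (std::getline(stream, part, '.')) {
    parts.push_back(std::stoi(part));
  }
  return parts;
}

// True when lhs is an older version than rhs, comparing components as
// numbers so that "1.5.0" sorts before "1.12.0".
bool IsOlder(const std::string& lhs, const std::string& rhs) {
  return SplitVersion(lhs) < SplitVersion(rhs);
}

using OpAndVersion = std::pair<int, int>;  // {op id, op version}

// Returns the newest runtime version required by any op in the model.
// Returns an empty string if the model has no ops or an op is not in the map.
std::string MinRuntimeVersion(
    const std::vector<OpAndVersion>& model_ops,
    const std::map<OpAndVersion, std::string>& op_version_map) {
  std::string required;
  for (const OpAndVersion& op : model_ops) {
    const auto it = op_version_map.find(op);
    if (it == op_version_map.end()) return "";
    if (required.empty() || IsOlder(required, it->second)) {
      required = it->second;
    }
  }
  return required;
}
```

For instance, if version 1 of an op maps to "1.5.0" and version 2 maps to "1.12.0", a model that uses version 2 anywhere requires at least runtime "1.12.0".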

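Delegation implementation

LiteRT provides a delegation API that enables delegating ops to hardware backends. In the delegate's Prepare function, check that the version is supported for every node handled by the delegation code.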

const int kMaxVersion = 1;
TfLiteNode* node;
TfLiteRegistration* registration = nullptr;
TF_LITE_ENSURE_STATUS(context->GetNodeAndRegistration(context, node_index, &node, &registration));

if (registration->version > kMaxVersion) {
  // Reject the node if the version isn't supported.
}

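This is required even if the delegation only supports version 1 ops, so that the delegation can detect incompatibility when it encounters a higher version op.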
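
As a rough illustration of the check above applied to a whole graph, the sketch below filters nodes by op version. It does not use the LiteRT delegate API: NodeSketch and SelectSupportedNodes are hypothetical, and the sketch assumes the delegate simply declines any node whose version exceeds what it supports.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for the per-node registration data.
struct NodeSketch {
  int builtin_code;
  int version;
};

// Returns the indices of the nodes whose op version the delegate can handle.
// Nodes with a higher version are rejected rather than run incorrectly.
std::vector<int> SelectSupportedNodes(const std::vector<NodeSketch>& nodes,
                                      int max_supported_version) {
  assert(max_supported_version >= 1);
  std::vector<int> supported;
  for (std::size_t i = 0; i < nodes.size(); ++i) {
    if (nodes[i].version <= max_supported_version) {
      supported.push_back(static_cast<int>(i));
    }
  }
  return supported;
}
```

With a maximum supported version of 1, a version 2 depthwise convolution node is excluded from the result while version 1 nodes are kept.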