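Hand landmarks detection guide for iOS

The MediaPipe Hand Landmarker task lets you detect the landmarks of the hands in an image. These instructions show you how to use the Hand Landmarker in iOS apps. The code sample described in these instructions is available on GitHub.

For more information about the capabilities, models, and configuration options of this task, see the Overview.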

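Code example

The MediaPipe Tasks example code is a basic implementation of a Hand Landmarker app for iOS. The example uses the camera on a physical iOS device to detect hand landmarks in a continuous video stream. The app can also detect hand landmarks in images and videos from the device gallery.

You can use the app as a starting point for your own iOS app, or refer to it when modifying an existing app. The Hand Landmarker example code is hosted on GitHub.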

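Download the code

The following instructions show you how to create a local copy of the example code using the git command line tool.

To download the example code:

Clone the git repository using the following command: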

git clone https://github.com/google-ai-edge/mediapipe-samples

Optionally, configure your git instance to use sparse checkout, so that you fetch only the files for the Hand Landmarker example app:
```
cd mediapipe-samples
git sparse-checkout init --cone
git sparse-checkout set examples/hand_landmarker/ios/
```

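After you have created a local version of the example code, you can install the MediaPipe task library, open the project using Xcode, and run the app. For instructions, see the Setup Guide for iOS.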

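Key components

The following files contain the crucial code for the Hand Landmarker example application:

HandLandmarkerService.swift: Initializes the Hand Landmarker, handles the model selection, and runs inference on the input data.
CameraViewController.swift: Implements the UI for the live camera feed input mode and visualizes the results.
MediaLibraryViewController.swift: Implements the UI for the still image and video file input modes and visualizes the results.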

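Setup

This section describes the key steps for setting up your development environment and code projects to use Hand Landmarker. For general information on setting up your development environment for using MediaPipe tasks, including platform version requirements, see the Setup guide for iOS.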

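Dependencies

Hand Landmarker uses the MediaPipeTasksVision library, which must be installed using CocoaPods. The library is compatible with both Swift and Objective-C apps and does not require any additional language-specific setup.

For instructions on installing CocoaPods on macOS, refer to the CocoaPods installation guide. For instructions on how to create a Podfile with the necessary pods for your app, refer to Using CocoaPods.

Add the MediaPipeTasksVision pod in the Podfile using the following code: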

target 'MyHandLandmarkerApp' do
  use_frameworks!
  pod 'MediaPipeTasksVision'
end

If your app includes unit test targets, refer to the Setup Guide for iOS for additional information on setting up your Podfile.

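Model

The MediaPipe Hand Landmarker task requires a trained model that is compatible with this task. For more information about the available trained models for Hand Landmarker, see the Models section of the task overview.

Select and download a model, and add it to your project directory using Xcode. For instructions on how to add files to your Xcode project, refer to Managing files and folders in your Xcode project.

Use the BaseOptions.modelAssetPath property to specify the path to the model in your app bundle. For a code example, see the next section.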

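Create the task

You can create the Hand Landmarker task by calling one of its initializers. The HandLandmarker(options:) initializer accepts values for the configuration options.

If you don't need a Hand Landmarker initialized with customized configuration options, you can use the HandLandmarker(modelPath:) initializer to create a Hand Landmarker with the default options. For more information about configuration options, see Configuration Overview.

The Hand Landmarker task supports 3 input data types: still images, video files, and live video streams. By default, HandLandmarker(modelPath:) initializes a task for still images. If you want your task to be initialized to process video files or live video streams, use HandLandmarker(options:) to specify the video or live stream running mode. The live stream mode also requires the additional handLandmarkerLiveStreamDelegate configuration option, which enables the Hand Landmarker to deliver hand landmarker results to the delegate asynchronously.

See the example for your running mode below to learn how to create the task and run inference.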

Swift

Image

import MediaPipeTasksVision

let modelPath = Bundle.main.path(forResource: "hand_landmarker",
                                      ofType: "task")

// `minHandDetectionConfidence`, `minHandPresenceConfidence`,
// `minHandTrackingConfidence`, and `numHands` are placeholders for values
// chosen by your app.
let options = HandLandmarkerOptions()
options.baseOptions.modelAssetPath = modelPath
options.runningMode = .image
options.minHandDetectionConfidence = minHandDetectionConfidence
options.minHandPresenceConfidence = minHandPresenceConfidence
options.minTrackingConfidence = minHandTrackingConfidence
options.numHands = numHands

let handLandmarker = try HandLandmarker(options: options)

Video

import MediaPipeTasksVision

let modelPath = Bundle.main.path(forResource: "hand_landmarker",
                                      ofType: "task")

// `minHandDetectionConfidence`, `minHandPresenceConfidence`,
// `minHandTrackingConfidence`, and `numHands` are placeholders for values
// chosen by your app.
let options = HandLandmarkerOptions()
options.baseOptions.modelAssetPath = modelPath
options.runningMode = .video
options.minHandDetectionConfidence = minHandDetectionConfidence
options.minHandPresenceConfidence = minHandPresenceConfidence
options.minTrackingConfidence = minHandTrackingConfidence
options.numHands = numHands

let handLandmarker = try HandLandmarker(options: options)

Live stream

import MediaPipeTasksVision

// Class that conforms to the `HandLandmarkerLiveStreamDelegate` protocol and
// implements the method that the hand landmarker calls once it finishes
// performing landmarks detection in each input frame.
class HandLandmarkerResultProcessor: NSObject, HandLandmarkerLiveStreamDelegate {

  func handLandmarker(
    _ handLandmarker: HandLandmarker,
    didFinishDetection result: HandLandmarkerResult?,
    timestampInMilliseconds: Int,
    error: Error?) {

    // Process the hand landmarker result or errors here.

  }
}

let modelPath = Bundle.main.path(
  forResource: "hand_landmarker",
  ofType: "task")

// `minHandDetectionConfidence`, `minHandPresenceConfidence`,
// `minHandTrackingConfidence`, and `numHands` are placeholders for values
// chosen by your app.
let options = HandLandmarkerOptions()
options.baseOptions.modelAssetPath = modelPath
options.runningMode = .liveStream
options.minHandDetectionConfidence = minHandDetectionConfidence
options.minHandPresenceConfidence = minHandPresenceConfidence
options.minTrackingConfidence = minHandTrackingConfidence
options.numHands = numHands

// Assign an object of the class to the `handLandmarkerLiveStreamDelegate`
// property.
let processor = HandLandmarkerResultProcessor()
options.handLandmarkerLiveStreamDelegate = processor

let handLandmarker = try HandLandmarker(options: options)

Objective-C

Image

@import MediaPipeTasksVision;

NSString *modelPath = [[NSBundle mainBundle] pathForResource:@"hand_landmarker"
                                                      ofType:@"task"];

// `minHandDetectionConfidence`, `minHandPresenceConfidence`,
// `minHandTrackingConfidence`, and `numHands` are placeholders for values
// chosen by your app.
MPPHandLandmarkerOptions *options = [[MPPHandLandmarkerOptions alloc] init];
options.baseOptions.modelAssetPath = modelPath;
options.runningMode = MPPRunningModeImage;
options.minHandDetectionConfidence = minHandDetectionConfidence;
options.minHandPresenceConfidence = minHandPresenceConfidence;
options.minTrackingConfidence = minHandTrackingConfidence;
options.numHands = numHands;

MPPHandLandmarker *handLandmarker =
  [[MPPHandLandmarker alloc] initWithOptions:options error:nil];

Video

@import MediaPipeTasksVision;

NSString *modelPath = [[NSBundle mainBundle] pathForResource:@"hand_landmarker"
                                                      ofType:@"task"];

// `minHandDetectionConfidence`, `minHandPresenceConfidence`,
// `minHandTrackingConfidence`, and `numHands` are placeholders for values
// chosen by your app.
MPPHandLandmarkerOptions *options = [[MPPHandLandmarkerOptions alloc] init];
options.baseOptions.modelAssetPath = modelPath;
options.runningMode = MPPRunningModeVideo;
options.minHandDetectionConfidence = minHandDetectionConfidence;
options.minHandPresenceConfidence = minHandPresenceConfidence;
options.minTrackingConfidence = minHandTrackingConfidence;
options.numHands = numHands;

MPPHandLandmarker *handLandmarker =
  [[MPPHandLandmarker alloc] initWithOptions:options error:nil];

Live stream

@import MediaPipeTasksVision;

// Class that conforms to the `MPPHandLandmarkerLiveStreamDelegate` protocol
// and implements the method that the hand landmarker calls once it finishes
// performing landmarks detection in each input frame.

@interface APPHandLandmarkerResultProcessor : NSObject <MPPHandLandmarkerLiveStreamDelegate>

@end

@implementation APPHandLandmarkerResultProcessor

- (void)handLandmarker:(MPPHandLandmarker *)handLandmarker
    didFinishDetectionWithResult:(MPPHandLandmarkerResult *)handLandmarkerResult
         timestampInMilliseconds:(NSInteger)timestampInMilliseconds
                           error:(NSError *)error {

    // Process the hand landmarker result or errors here.

}

@end

NSString *modelPath = [[NSBundle mainBundle] pathForResource:@"hand_landmarker"
                                                      ofType:@"task"];

// `minHandDetectionConfidence`, `minHandPresenceConfidence`,
// `minHandTrackingConfidence`, and `numHands` are placeholders for values
// chosen by your app.
MPPHandLandmarkerOptions *options = [[MPPHandLandmarkerOptions alloc] init];
options.baseOptions.modelAssetPath = modelPath;
options.runningMode = MPPRunningModeLiveStream;
options.minHandDetectionConfidence = minHandDetectionConfidence;
options.minHandPresenceConfidence = minHandPresenceConfidence;
options.minTrackingConfidence = minHandTrackingConfidence;
options.numHands = numHands;

// Assign an object of the class to the `handLandmarkerLiveStreamDelegate`
// property.
APPHandLandmarkerResultProcessor *processor =
  [APPHandLandmarkerResultProcessor new];
options.handLandmarkerLiveStreamDelegate = processor;

MPPHandLandmarker *handLandmarker =
  [[MPPHandLandmarker alloc] initWithOptions:options error:nil];

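Configuration options

This task has the following configuration options for iOS apps:

Option name	Description	Value range	Default value
`runningMode`	Sets the running mode for the task. There are three modes: IMAGE: the mode for single image inputs. VIDEO: the mode for decoded frames of a video. LIVE_STREAM: the mode for a live stream of input data, such as from a camera. In this mode, `handLandmarkerLiveStreamDelegate` must be set to an instance of a class that implements `HandLandmarkerLiveStreamDelegate` to receive the hand landmark detection results asynchronously.	{`RunningMode.image, RunningMode.video, RunningMode.liveStream`}	`RunningMode.image`
`numHands`	The maximum number of hands detected by the Hand Landmarker.	`Any integer > 0`	`1`
`minHandDetectionConfidence`	The minimum confidence score for the hand detection to be considered successful in the palm detection model.	`0.0 - 1.0`	`0.5`
`minHandPresenceConfidence`	The minimum confidence score for the hand presence score in the hand landmark detection model. In video mode and live stream mode, if the hand presence confidence score from the hand landmark model is below this threshold, Hand Landmarker triggers the palm detection model. Otherwise, a lightweight hand tracking algorithm determines the location of the hands for subsequent landmark detections.	`0.0 - 1.0`	`0.5`
`minTrackingConfidence`	The minimum confidence score for the hand tracking to be considered successful. This is the bounding box IoU threshold between hands in the current frame and the last frame. In video mode and live stream mode of Hand Landmarker, if the tracking fails, Hand Landmarker triggers hand detection. Otherwise, it skips the hand detection.	`0.0 - 1.0`	`0.5`
`result_listener`	Sets the result listener to receive the detection results asynchronously when the Hand Landmarker is in live stream mode. Only applicable when the running mode is set to `LIVE_STREAM`. On iOS, results are delivered through the `handLandmarkerLiveStreamDelegate` option.	Not applicable	Not applicable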
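The interaction between `minHandPresenceConfidence` and `minTrackingConfidence` in video and live stream modes can be sketched roughly as below. This is a simplified model of the behavior described in the table, not MediaPipe's implementation: the type and function names (`HandLocalizationStep`, `ConfidenceThresholds`, `nextStep`) are hypothetical, and combining the two checks into one function is an assumption made for illustration.

```swift
// Simplified model of the per-frame decision described in the table.
// NOT MediaPipe's implementation; every name here is hypothetical.

/// Which stage locates the hand for the next frame.
enum HandLocalizationStep: Equatable {
    case runPalmDetection   // run the palm detection model again
    case reuseTracking      // rely on the lightweight tracking algorithm
}

/// Thresholds mirroring the documented defaults (0.5 each).
struct ConfidenceThresholds {
    var minHandPresenceConfidence: Float = 0.5
    var minTrackingConfidence: Float = 0.5
}

/// Chooses the next step from scores observed on the previous frame.
/// - Parameters:
///   - presenceScore: hand presence score from the landmark model.
///   - trackingIoU: bounding box IoU between the current and last frame.
func nextStep(presenceScore: Float,
              trackingIoU: Float,
              thresholds: ConfidenceThresholds = ConfidenceThresholds()) -> HandLocalizationStep {
    // Presence below the threshold: the hand may be lost, so detect again.
    if presenceScore < thresholds.minHandPresenceConfidence {
        return .runPalmDetection
    }
    // Tracking below the threshold: tracking failed, so detect again.
    if trackingIoU < thresholds.minTrackingConfidence {
        return .runPalmDetection
    }
    // Both scores are high enough: skip detection and keep tracking.
    return .reuseTracking
}

print(nextStep(presenceScore: 0.9, trackingIoU: 0.8))
print(nextStep(presenceScore: 0.3, trackingIoU: 0.8))
```

Under this model, raising either threshold makes the more expensive palm detection run more often, which trades speed for robustness.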

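When the running mode is set to live stream, the Hand Landmarker requires the additional handLandmarkerLiveStreamDelegate configuration option, which enables the Hand Landmarker to deliver hand landmark detection results asynchronously. The delegate must implement the handLandmarker(_:didFinishDetection:timestampInMilliseconds:error:) method, which the Hand Landmarker calls after processing the hand landmark detection results for each frame.

Option name	Description	Value range	Default value
`handLandmarkerLiveStreamDelegate`	Enables the Hand Landmarker to deliver hand landmark detection results asynchronously in live stream mode. The class whose instance is set to this property must implement the `handLandmarker(_:didFinishDetection:timestampInMilliseconds:error:)` method.	Not applicable	Not set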

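Prepare data

You need to convert the input image or frame to an MPImage object before passing it to the Hand Landmarker. MPImage supports different types of iOS image formats, and can use them in any running mode for inference. For more information about MPImage, refer to the MPImage API.

Choose an iOS image format based on your use case and the running mode your application requires. MPImage accepts the UIImage, CVPixelBuffer, and CMSampleBuffer iOS image formats.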

UIImage

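The UIImage format is well-suited for the following running modes:

Images: images from an app bundle, user gallery, or file system formatted as UIImage images can be converted to an MPImage object.
Videos: use AVAssetImageGenerator to extract video frames to the CGImage format, then convert them to UIImage images.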

Swift

// Load an image on the user's device as an iOS `UIImage` object
// (named `uiImage` here).

// Convert the `UIImage` object to a MediaPipe's Image object having the default
// orientation `UIImage.Orientation.up`.
let image = try MPImage(uiImage: uiImage)

Objective-C

// Load an image on the user's device as an iOS `UIImage` object
// (named `uiImage` here).

// Convert the `UIImage` object to a MediaPipe's Image object having the default
// orientation `UIImageOrientationUp`.
MPPImage *image = [[MPPImage alloc] initWithUIImage:uiImage error:nil];

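The example initializes an MPImage with the default UIImage.Orientation.up orientation. You can initialize an MPImage with any of the supported UIImage.Orientation values. Hand Landmarker does not support mirrored orientations such as .upMirrored, .downMirrored, .leftMirrored, and .rightMirrored.

For more information about UIImage, refer to the UIImage Apple Developer Documentation.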

CVPixelBuffer

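The CVPixelBuffer format is well-suited for applications that generate frames and use the iOS CoreImage framework for processing.

The CVPixelBuffer format is well-suited for the following running modes:

Images: apps that generate CVPixelBuffer images after some processing using the iOS CoreImage framework can send them to the Hand Landmarker in the image running mode.
Videos: video frames can be converted to the CVPixelBuffer format for processing, and then sent to the Hand Landmarker in video mode.
Live stream: apps that use an iOS camera to generate frames may convert them to the CVPixelBuffer format for processing before sending them to the Hand Landmarker in live stream mode.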

Swift

// Obtain a CVPixelBuffer.

// Convert the `CVPixelBuffer` object to a MediaPipe's Image object having the default
// orientation `UIImage.Orientation.up`.
let image = try MPImage(pixelBuffer: pixelBuffer)

Objective-C

// Obtain a CVPixelBuffer.

// Convert the `CVPixelBuffer` object to a MediaPipe's Image object having the
// default orientation `UIImageOrientationUp`.
MPPImage *image = [[MPPImage alloc] initWithPixelBuffer:pixelBuffer error:nil];

For more information about CVPixelBuffer, refer to the CVPixelBuffer Apple Developer Documentation.

CMSampleBuffer

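The CMSampleBuffer format stores media samples of a uniform media type, and is well-suited for the live stream running mode. Live frames from iOS cameras are delivered asynchronously in the CMSampleBuffer format by iOS AVCaptureVideoDataOutput.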

Swift

// Obtain a CMSampleBuffer.

// Convert the `CMSampleBuffer` object to a MediaPipe's Image object having the default
// orientation `UIImage.Orientation.up`.
let image = try MPImage(sampleBuffer: sampleBuffer)

Objective-C

// Obtain a `CMSampleBuffer`.

// Convert the `CMSampleBuffer` object to a MediaPipe's Image object having the
// default orientation `UIImageOrientationUp`.
MPPImage *image = [[MPPImage alloc] initWithSampleBuffer:sampleBuffer error:nil];

For more information about CMSampleBuffer, refer to the CMSampleBuffer Apple Developer Documentation.

Run the task

To run the Hand Landmarker, use the detect() method specific to the assigned running mode:

Still image: detect(image:)
Video: detect(videoFrame:timestampInMilliseconds:)
Live stream: detectAsync(image:timestampInMilliseconds:)

Swift

Image

let result = try handLandmarker.detect(image: image)

Video

let result = try handLandmarker.detect(
    videoFrame: image,
    timestampInMilliseconds: timestamp)

Live stream

try handLandmarker.detectAsync(
  image: image,
  timestampInMilliseconds: timestamp)

Objective-C

Image

MPPHandLandmarkerResult *result =
  [handLandmarker detectInImage:image error:nil];

Video

MPPHandLandmarkerResult *result =
  [handLandmarker detectInVideoFrame:image
             timestampInMilliseconds:timestamp
                               error:nil];

Live stream

BOOL success =
  [handLandmarker detectAsyncInImage:image
             timestampInMilliseconds:timestamp
                               error:nil];

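The Hand Landmarker code example shows the implementation of each of these modes in more detail. The example code lets the user switch between processing modes, which may not be required for your use case.

Note the following:

When running in video mode or live stream mode, you must also provide the timestamp of the input frame to the Hand Landmarker task.
When running in image or video mode, the Hand Landmarker task blocks the current thread until it finishes processing the input image or frame. To avoid blocking the current thread, execute the processing in a background thread using the iOS Dispatch or NSOperation frameworks.
When running in live stream mode, the Hand Landmarker task returns immediately and doesn't block the current thread. It invokes the handLandmarker(_:didFinishDetection:timestampInMilliseconds:error:) method with the hand landmarker result after processing each input frame. The Hand Landmarker invokes this method asynchronously on a dedicated serial dispatch queue. To display results on the user interface, dispatch the results to the main queue after processing them. If the detectAsync function is called while the Hand Landmarker task is busy processing another frame, the Hand Landmarker ignores the new input frame.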
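The timestamp requirement in the notes above can be illustrated with a small helper that produces the millisecond timestamps to pass along with each sampled video frame. This is a minimal sketch in plain Swift: the function names `frameTimestamps` and `isStrictlyIncreasing` and the sampling interval are hypothetical, and the check that timestamps only ever increase between calls is an assumption about what the task expects rather than something stated on this page.

```swift
/// Returns the timestamps (in milliseconds) at which to sample a video,
/// starting at 0 and stepping by `intervalMs` up to, but not including,
/// `durationMs`. Hypothetical helper, not part of MediaPipe.
func frameTimestamps(durationMs: Int, intervalMs: Int) -> [Int] {
    guard durationMs > 0, intervalMs > 0 else { return [] }
    return Array(stride(from: 0, to: durationMs, by: intervalMs))
}

/// Returns true when every timestamp is larger than the one before it.
/// Assumption: frames should be submitted with increasing timestamps.
func isStrictlyIncreasing(_ timestamps: [Int]) -> Bool {
    return zip(timestamps, timestamps.dropFirst()).allSatisfy { $0 < $1 }
}

// Sample a 1-second clip every 300 ms.
let timestamps = frameTimestamps(durationMs: 1000, intervalMs: 300)
print(timestamps)
print(isStrictlyIncreasing(timestamps))
```

Each value in the returned array would be passed as `timestampInMilliseconds` together with the frame extracted at that position in the video.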

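Handle and display results

Upon running inference, the Hand Landmarker task returns a HandLandmarkerResult, which contains hand landmarks in image coordinates, hand landmarks in world coordinates, and the handedness (left or right hand) of the detected hands.

The HandLandmarkerResult output contains three components. Each component is an array, where each element contains the following results for a single detected hand:

Handedness

Handedness represents whether the detected hand is a left or a right hand.

Landmarks

There are 21 hand landmarks, each composed of x, y, and z coordinates. The x and y coordinates are normalized to [0.0, 1.0] by the image width and height, respectively. The z coordinate represents the landmark depth, with the depth at the wrist being the origin. The smaller the value, the closer the landmark is to the camera. The magnitude of z uses roughly the same scale as x.

World Landmarks

The 21 hand landmarks are also presented in world coordinates. Each landmark is composed of x, y, and z, representing real-world 3D coordinates in meters with the origin at the hand's geometric center.

The following shows an example of the output data from this task: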

HandLandmarkerResult:
  Handedness:
    Categories #0:
      index        : 0
      score        : 0.98396
      categoryName : Left
  Landmarks:
    Landmark #0:
      x            : 0.638852
      y            : 0.671197
      z            : -3.41E-7
    Landmark #1:
      x            : 0.634599
      y            : 0.536441
      z            : -0.06984
    ... (21 landmarks for a hand)
  WorldLandmarks:
    Landmark #0:
      x            : 0.067485
      y            : 0.031084
      z            : 0.055223
    Landmark #1:
      x            : 0.063209
      y            : -0.00382
      z            : 0.020920
    ... (21 world landmarks for a hand)
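To relate the sample output above to pixels and physical distances, the following sketch scales a normalized landmark by the image size and measures the distance between two world landmarks. It uses plain Swift stand-in types (`NormalizedPoint` and `WorldPoint` are hypothetical, not MediaPipe classes), and the 1280 x 720 image size is an assumed value chosen only for illustration.

```swift
/// Hypothetical stand-in for a landmark in normalized image coordinates.
struct NormalizedPoint {
    var x: Float
    var y: Float
    var z: Float
}

/// Hypothetical stand-in for a landmark in world coordinates (meters).
struct WorldPoint {
    var x: Float
    var y: Float
    var z: Float
}

/// Scales normalized x and y by the image width and height to get pixels.
func pixelPosition(of point: NormalizedPoint,
                   imageWidth: Float,
                   imageHeight: Float) -> (x: Float, y: Float) {
    return (x: point.x * imageWidth, y: point.y * imageHeight)
}

/// Euclidean distance between two world landmarks, in meters.
func distanceInMeters(_ a: WorldPoint, _ b: WorldPoint) -> Float {
    let dx = a.x - b.x
    let dy = a.y - b.y
    let dz = a.z - b.z
    return (dx * dx + dy * dy + dz * dz).squareRoot()
}

// Landmark #0 from the sample output, on an assumed 1280 x 720 image.
let wrist = NormalizedPoint(x: 0.638852, y: 0.671197, z: -3.41e-7)
let wristPixel = pixelPosition(of: wrist, imageWidth: 1280, imageHeight: 720)
print(wristPixel)

// World landmarks #0 and #1 from the sample output.
let world0 = WorldPoint(x: 0.067485, y: 0.031084, z: 0.055223)
let world1 = WorldPoint(x: 0.063209, y: -0.00382, z: 0.020920)
print(distanceInMeters(world0, world1))
```

The pixel position is what you would use to draw a landmark over the input image, while world-coordinate distances stay the same regardless of where the hand appears in the frame.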

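The following image shows a visualization of the task output:

[Image: a hand giving a thumbs up, with the skeletal structure of the hand mapped out]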