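Pose landmark detection guide for Python

The MediaPipe Pose Landmarker task lets you detect landmarks of human bodies in an image or video. You can use this task to identify key body locations, analyze posture, and categorize movements. This task uses machine learning (ML) models that work with single images or video. The task outputs body pose landmarks in image coordinates and in 3-dimensional world coordinates.

The code sample described in these instructions is available on GitHub. For more information about the capabilities, models, and configuration options of this task, see the Overview.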

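Code example

The example code for Pose Landmarker provides a complete implementation of this task in Python for your reference. This code helps you test the task and get started on building your own pose landmarker. You can view, run, and edit the Pose Landmarker example code using just your web browser.

If you are implementing the Pose Landmarker for Raspberry Pi, see the Raspberry Pi example app.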

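Setup

This section describes the key steps for setting up your development environment and code projects specifically to use Pose Landmarker. For general information on setting up your development environment for MediaPipe tasks, including platform version requirements, see the Setup guide for Python.

Packages

The MediaPipe Pose Landmarker task requires the mediapipe PyPI package. You can install this dependency with the following command: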

$ python -m pip install mediapipe

Imports

Import the following classes to access the Pose Landmarker task functions:

import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision

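Model

The MediaPipe Pose Landmarker task requires a trained model that is compatible with this task. For more information on the trained models available for Pose Landmarker, see the Models section of the task overview.

Select and download a model, and then store it in a local directory: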

model_path = '/absolute/path/to/pose_landmarker.task'

Use the model_asset_path parameter of the BaseOptions object to specify the path of the model to use. For a code example, see the next section.
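Because the model path is a plain string, a typo only surfaces when the task is created. The following is a minimal sketch of checking the path first, using only the standard library. The helper name resolve_model_path is hypothetical and not part of MediaPipe.

```python
from pathlib import Path


def resolve_model_path(path_str: str) -> str:
    """Return an absolute path to the model file, or raise if it is missing.

    Hypothetical helper for illustration; not a MediaPipe function.
    """
    path = Path(path_str).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Model file not found: {path}")
    return str(path)
```

The returned string could then be passed as model_asset_path when building BaseOptions.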

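Create the task

The MediaPipe Pose Landmarker task uses the create_from_options function to set up the task. The create_from_options function accepts values for the configuration options. For more information, see Configuration options.

The following code demonstrates how to build and configure this task.

These samples also show the variations of the task construction for images, video files, and live streams.

Image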

import mediapipe as mp

BaseOptions = mp.tasks.BaseOptions
PoseLandmarker = mp.tasks.vision.PoseLandmarker
PoseLandmarkerOptions = mp.tasks.vision.PoseLandmarkerOptions
VisionRunningMode = mp.tasks.vision.RunningMode

options = PoseLandmarkerOptions(
    base_options=BaseOptions(model_asset_path=model_path),
    running_mode=VisionRunningMode.IMAGE)

with PoseLandmarker.create_from_options(options) as landmarker:
  # The landmarker is initialized. Use it here.
  pass  # Placeholder so the block is valid Python; replace with your detection calls.

Video

import mediapipe as mp

BaseOptions = mp.tasks.BaseOptions
PoseLandmarker = mp.tasks.vision.PoseLandmarker
PoseLandmarkerOptions = mp.tasks.vision.PoseLandmarkerOptions
VisionRunningMode = mp.tasks.vision.RunningMode

# Create a pose landmarker instance with the video mode:
options = PoseLandmarkerOptions(
    base_options=BaseOptions(model_asset_path=model_path),
    running_mode=VisionRunningMode.VIDEO)

with PoseLandmarker.create_from_options(options) as landmarker:
  # The landmarker is initialized. Use it here.
  pass  # Placeholder so the block is valid Python; replace with your detection calls.

Live stream

import mediapipe as mp

BaseOptions = mp.tasks.BaseOptions
PoseLandmarker = mp.tasks.vision.PoseLandmarker
PoseLandmarkerOptions = mp.tasks.vision.PoseLandmarkerOptions
PoseLandmarkerResult = mp.tasks.vision.PoseLandmarkerResult
VisionRunningMode = mp.tasks.vision.RunningMode

# Create a pose landmarker instance with the live stream mode:
def print_result(result: PoseLandmarkerResult, output_image: mp.Image, timestamp_ms: int):
    print('pose landmarker result: {}'.format(result))

options = PoseLandmarkerOptions(
    base_options=BaseOptions(model_asset_path=model_path),
    running_mode=VisionRunningMode.LIVE_STREAM,
    result_callback=print_result)

with PoseLandmarker.create_from_options(options) as landmarker:
  # The landmarker is initialized. Use it here.
  pass  # Placeholder so the block is valid Python; replace with your detection calls.

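For a complete example of creating a Pose Landmarker for use with an image, see the code example.

Configuration options

This task has the following configuration options for Python applications: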

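Option name	Description	Value range	Default value
`running_mode`	Sets the running mode for the task. There are three modes: IMAGE: the mode for single image inputs. VIDEO: the mode for decoded frames of a video. LIVE_STREAM: the mode for a live stream of input data, such as from a camera. In this mode, a result listener must be set (through `result_callback`) to receive results asynchronously.	{`IMAGE, VIDEO, LIVE_STREAM`}	`IMAGE`
`num_poses`	The maximum number of poses that the Pose Landmarker can detect.	`Integer > 0`	`1`
`min_pose_detection_confidence`	The minimum confidence score for the pose detection to be considered successful.	`Float [0.0,1.0]`	`0.5`
`min_pose_presence_confidence`	The minimum confidence score of the pose presence score in the pose landmark detection.	`Float [0.0,1.0]`	`0.5`
`min_tracking_confidence`	The minimum confidence score for the pose tracking to be considered successful.	`Float [0.0,1.0]`	`0.5`
`output_segmentation_masks`	Whether the Pose Landmarker outputs a segmentation mask for the detected pose.	`Boolean`	`False`
`result_callback`	Sets the result listener to receive the landmarker results asynchronously. Can only be used when the running mode is set to `LIVE_STREAM`.	`ResultListener`	`N/A`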
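The value ranges in the table can be checked before the options object is built. The sketch below is plain Python and does not call MediaPipe. The function check_pose_options and its has_result_callback flag are hypothetical names introduced for illustration, and the rules it encodes are taken from the table above.

```python
VALID_RUNNING_MODES = {"IMAGE", "VIDEO", "LIVE_STREAM"}
CONFIDENCE_KEYS = (
    "min_pose_detection_confidence",
    "min_pose_presence_confidence",
    "min_tracking_confidence",
)


def check_pose_options(running_mode="IMAGE", num_poses=1,
                       has_result_callback=False, **confidences):
    """Return a list of problems; an empty list means the values fit the table."""
    problems = []
    if running_mode not in VALID_RUNNING_MODES:
        problems.append(f"running_mode must be one of {sorted(VALID_RUNNING_MODES)}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(num_poses, bool) or not isinstance(num_poses, int) or num_poses <= 0:
        problems.append("num_poses must be an integer greater than 0")
    for key, value in confidences.items():
        if key not in CONFIDENCE_KEYS:
            problems.append(f"unknown option: {key}")
        elif not 0.0 <= value <= 1.0:
            problems.append(f"{key} must be in [0.0, 1.0]")
    # A result callback belongs to LIVE_STREAM mode, and only to that mode.
    if has_result_callback != (running_mode == "LIVE_STREAM"):
        problems.append("result_callback must be set if and only if "
                        "running_mode is LIVE_STREAM")
    return problems


print(check_pose_options(running_mode="VIDEO", num_poses=2,
                         min_tracking_confidence=0.7))
```

Returning a list of problems instead of raising on the first one makes it easier to report every invalid setting at once, for example when the values come from a user-editable config file.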

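Prepare data

Prepare your input as an image file or a numpy array, then convert it to a mediapipe.Image object. If your input is a video file or a live stream from a webcam, you can use an external library such as OpenCV to load your input frames as numpy arrays.

Image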

import mediapipe as mp

# Load the input image from an image file.
mp_image = mp.Image.create_from_file('/path/to/image')

# Load the input image from a numpy array.
mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=numpy_image)

Video

import mediapipe as mp

# Use OpenCV’s VideoCapture to load the input video.

# Load the frame rate of the video using OpenCV’s CAP_PROP_FPS property.
# You’ll need it to calculate the timestamp for each frame.

# Loop through each frame in the video using VideoCapture#read()

# Convert the frame received from OpenCV to a MediaPipe’s Image object.
mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=numpy_frame_from_opencv)
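The comments above mention calculating a timestamp for each frame from the frame rate. A minimal sketch of that arithmetic follows, assuming a constant frame rate and 0-based frame indices. The function name frame_timestamp_ms is chosen here to match the variable used later in this guide and is not a MediaPipe function.

```python
def frame_timestamp_ms(frame_index: int, fps: float) -> int:
    """Timestamp in milliseconds of the frame at frame_index (0-based)."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return int(frame_index * 1000 / fps)


# Timestamps of the first four frames of a 30 fps video.
timestamps = [frame_timestamp_ms(i, fps=30.0) for i in range(4)]
print(timestamps)  # [0, 33, 66, 100]
```

With OpenCV, the frame rate would typically come from cap.get(cv2.CAP_PROP_FPS) on a cv2.VideoCapture object. Deriving the timestamp from the frame index keeps the values increasing from one frame to the next, which the video mode expects.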

Live stream

import mediapipe as mp

# Use OpenCV’s VideoCapture to start capturing from the webcam.

# Create a loop to read the latest frame from the camera using VideoCapture#read()

# Convert the frame received from OpenCV to a MediaPipe’s Image object.
mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=numpy_frame_from_opencv)
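One detail the snippets above leave implicit: OpenCV returns frames in BGR channel order, while the image is declared as SRGB. A minimal sketch of the conversion with numpy alone is shown below; bgr_to_rgb is a hypothetical helper name, and cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) is the usual OpenCV equivalent.

```python
import numpy as np


def bgr_to_rgb(frame_bgr: np.ndarray) -> np.ndarray:
    """Reverse the channel axis of an HxWx3 frame and return a contiguous copy."""
    if frame_bgr.ndim != 3 or frame_bgr.shape[2] != 3:
        raise ValueError("expected an array of shape (height, width, 3)")
    # Slicing with ::-1 yields a non-contiguous view, so copy it into a
    # contiguous buffer before handing it to other libraries.
    return np.ascontiguousarray(frame_bgr[:, :, ::-1])


# A 1x2 dummy frame: one pure-blue and one pure-red pixel, in BGR order.
dummy_bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
dummy_rgb = bgr_to_rgb(dummy_bgr)
print(dummy_rgb.tolist())  # [[[0, 0, 255], [255, 0, 0]]]
```

The converted array would then be passed as the data argument of mp.Image in place of the raw OpenCV frame.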

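Run the task

The Pose Landmarker uses the detect, detect_for_video, and detect_async functions to trigger inferences. For pose landmarking, this involves preprocessing the input data and detecting poses in the image.

The following code demonstrates how to execute the processing with the task model.

Image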

# Perform pose landmarking on the provided single image.
# The pose landmarker must be created with the image mode.
pose_landmarker_result = landmarker.detect(mp_image)

Video

# Perform pose landmarking on the provided video frame.
# The pose landmarker must be created with the video mode.
pose_landmarker_result = landmarker.detect_for_video(mp_image, frame_timestamp_ms)

Live stream

# Send live image data to perform pose landmarking.
# The results are accessible via the `result_callback` provided in
# the `PoseLandmarkerOptions` object.
# The pose landmarker must be created with the live stream mode.
landmarker.detect_async(mp_image, frame_timestamp_ms)

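Note the following:

When running in the video mode or the live stream mode, also provide the Pose Landmarker task with the timestamp of the input frame.
When running in the image or the video mode, the Pose Landmarker task blocks the current thread until it finishes processing the input image or frame.
When running in the live stream mode, the Pose Landmarker task returns immediately and does not block the current thread. It invokes the result listener with the detection result each time it finishes processing an input frame. If the detection function is called while the Pose Landmarker task is busy processing another frame, the task ignores the new input frame.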
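Since live stream results arrive through a callback instead of a return value, the application needs somewhere to keep them. The sketch below is one possible approach, assuming the callback may run on a different thread than the one calling detect_async. The class LatestResult is hypothetical, and its callback takes the same three parameters as print_result in the live stream example.

```python
import threading


class LatestResult:
    """Thread-safe holder for the most recent live stream result."""

    def __init__(self):
        self._lock = threading.Lock()
        self._result = None
        self._timestamp_ms = -1

    def callback(self, result, output_image, timestamp_ms):
        # Keep only the newest result; older ones that arrive late are dropped.
        with self._lock:
            if timestamp_ms > self._timestamp_ms:
                self._result = result
                self._timestamp_ms = timestamp_ms

    def get(self):
        """Return (result, timestamp_ms); result is None before the first callback."""
        with self._lock:
            return self._result, self._timestamp_ms


holder = LatestResult()
holder.callback("result at 10 ms", None, 10)  # Strings stand in for real results.
holder.callback("result at 5 ms", None, 5)    # Older timestamp, so it is ignored.
print(holder.get())  # ('result at 10 ms', 10)
```

In an application, holder.callback would be passed as result_callback when building the options, and the main loop would call holder.get() whenever it is ready to draw.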

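For a complete example of running a Pose Landmarker on an image, see the code example.

Handle and display results

The Pose Landmarker returns a PoseLandmarkerResult object for each detection run. The result object contains the coordinates of each pose landmark.

The following shows an example of the output data from this task: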

PoseLandmarkerResult:
  Landmarks:
    Landmark #0:
      x            : 0.638852
      y            : 0.671197
      z            : 0.129959
      visibility   : 0.9999997615814209
      presence     : 0.9999984502792358
    Landmark #1:
      x            : 0.634599
      y            : 0.536441
      z            : -0.06984
      visibility   : 0.999909
      presence     : 0.999958
    ... (33 landmarks per pose)
  WorldLandmarks:
    Landmark #0:
      x            : 0.067485
      y            : 0.031084
      z            : 0.055223
      visibility   : 0.9999997615814209
      presence     : 0.9999984502792358
    Landmark #1:
      x            : 0.063209
      y            : -0.00382
      z            : 0.020920
      visibility   : 0.999976
      presence     : 0.999998
    ... (33 world landmarks per pose)
  SegmentationMasks:
    ... (pictured below)

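The output contains both normalized coordinates (Landmarks) and world coordinates (WorldLandmarks) for each landmark.

The output contains the following normalized coordinates (Landmarks):

x and y: Landmark coordinates normalized to the range 0.0 to 1.0 by the image width (x) and height (y).
z: The landmark depth, with the depth at the midpoint of the hips as the origin. The smaller the value, the closer the landmark is to the camera. The magnitude of z uses roughly the same scale as x.
visibility: The likelihood of the landmark being visible within the image.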
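To draw a normalized landmark on the source image, x and y have to be scaled back by the image width and height. The following is a minimal sketch, assuming only that a landmark exposes x and y attributes. The helper to_pixel is hypothetical, and the stand-in object copies the values of Landmark #0 from the sample output above.

```python
from types import SimpleNamespace


def to_pixel(landmark, image_width: int, image_height: int):
    """Map normalized x/y to integer pixel coordinates, clamped to the image."""
    px = min(max(int(landmark.x * image_width), 0), image_width - 1)
    py = min(max(int(landmark.y * image_height), 0), image_height - 1)
    return px, py


# Stand-in for Landmark #0 in the sample output (not a real MediaPipe object).
first_landmark = SimpleNamespace(x=0.638852, y=0.671197, z=0.129959)
print(to_pixel(first_landmark, image_width=640, image_height=480))  # (408, 322)
```

Clamping matters because a landmark that the model places slightly outside the frame can have x or y below 0.0 or above 1.0.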

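The output contains the following world coordinates (WorldLandmarks):

x, y, and z: Real-world 3-dimensional coordinates in meters, with the midpoint of the hips as the origin.
visibility: The likelihood of the landmark being visible within the image.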
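Because world coordinates are metric, they suit posture analysis such as measuring a joint angle. The sketch below computes the angle at a middle point from three 3D points using the dot product. The function joint_angle_degrees and the shoulder, elbow, and wrist positions are hypothetical values made up for illustration, not output from the model.

```python
import math
from types import SimpleNamespace


def joint_angle_degrees(a, b, c) -> float:
    """Angle at point b, in degrees, between the segments b->a and b->c."""
    v1 = (a.x - b.x, a.y - b.y, a.z - b.z)
    v2 = (c.x - b.x, c.y - b.y, c.z - b.z)
    dot = sum(p * q for p, q in zip(v1, v2))
    norm = math.sqrt(sum(p * p for p in v1)) * math.sqrt(sum(q * q for q in v2))
    if norm == 0.0:
        raise ValueError("a and c must both differ from b")
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


# Hypothetical positions in meters: forearm perpendicular to the upper arm.
shoulder = SimpleNamespace(x=0.0, y=0.0, z=0.0)
elbow = SimpleNamespace(x=0.0, y=0.3, z=0.0)
wrist = SimpleNamespace(x=0.25, y=0.3, z=0.0)
angle = joint_angle_degrees(shoulder, elbow, wrist)
print(round(angle, 1))  # 90.0
```

The same function works on normalized landmarks, but the angle is then distorted by the image aspect ratio, which is why world coordinates are the safer input here.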

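The following image shows a visualization of the task output:

The optional segmentation mask represents the likelihood of each pixel belonging to a detected person. The following image is a segmentation mask of the task output:

The Pose Landmarker example code demonstrates how to display the results returned from the task. See the code example for details.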
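Since the segmentation mask holds a per-pixel likelihood, a common next step is to threshold it into a binary person mask. The following is a minimal sketch on a dummy numpy array. It assumes the mask has already been converted to a float array with values between 0.0 and 1.0, and the helper person_mask and the 0.5 threshold are illustrative choices, not values from this guide.

```python
import numpy as np


def person_mask(confidence: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Return a boolean array that is True where confidence exceeds threshold."""
    return confidence > threshold


# Dummy 2x2 confidence map standing in for a real segmentation mask.
dummy = np.array([[0.1, 0.9], [0.6, 0.4]], dtype=np.float32)
mask = person_mask(dummy)
coverage = float(mask.mean())  # Fraction of pixels classified as person.
print(mask.tolist(), coverage)  # [[False, True], [True, False]] 0.5
```

The boolean mask can be used to index the original image, for example to blur or replace the background pixels.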