# Hand landmarks detection guide

*Last updated 2025-01-13 UTC.*

The MediaPipe Hand Landmarker task lets you detect the landmarks of the hands in
an image. You can use this task to locate key points of hands and render visual
effects on them. This task operates on image data with a machine learning (ML)
model as static data or a continuous stream, and outputs hand landmarks in image
coordinates, hand landmarks in world coordinates, and the handedness (left/right
hand) of multiple detected hands.

[Try it!](https://mediapipe-studio.webapps.google.com/demo/hand_landmarker)

Get Started
-----------

Start using this task by following one of these implementation guides for your
target platform.
These platform-specific guides walk you through a basic implementation of this
task, including a recommended model and a code example with recommended
configuration options:

- **Android** - [Code example](https://github.com/google-ai-edge/mediapipe-samples/tree/main/examples/hand_landmarker/android) - [Guide](./android)
- **Python** - [Code example](https://colab.research.google.com/github/googlesamples/mediapipe/blob/main/examples/hand_landmarker/python/hand_landmarker.ipynb) - [Guide](./python)
- **Web** - [Code example](https://codepen.io/mediapipe-preview/pen/gOKBGPN) - [Guide](./web_js)

Task details
------------

This section describes the capabilities, inputs, outputs, and configuration
options of this task.

### Features

- **Input image processing** - Processing includes image rotation, resizing,
  normalization, and color space conversion.
- **Score threshold** - Filter results based on prediction scores.

| Task inputs | Task outputs |
|-------------|--------------|
| The Hand Landmarker accepts an input of one of the following data types: <ul><li>Still images</li><li>Decoded video frames</li><li>Live video feed</li></ul> | The Hand Landmarker outputs the following results: <ul><li>Handedness of detected hands</li><li>Landmarks of detected hands in image coordinates</li><li>Landmarks of detected hands in world coordinates</li></ul> |

### Configuration options

This task has the following configuration options:

| Option Name | Description | Value Range | Default Value |
|---|---|---|---|
| `running_mode` | Sets the running mode for the task. There are three modes: <br> IMAGE: The mode for single image inputs. <br> VIDEO: The mode for decoded frames of a video. <br> LIVE_STREAM: The mode for a livestream of input data, such as from a camera. In this mode, resultListener must be called to set up a listener to receive results asynchronously. | {`IMAGE, VIDEO, LIVE_STREAM`} | `IMAGE` |
| `num_hands` | The maximum number of hands detected by the Hand landmark detector. | `Any integer > 0` | `1` |
| `min_hand_detection_confidence` | The minimum confidence score for the hand detection to be considered successful in the palm detection model. | `0.0 - 1.0` | `0.5` |
| `min_hand_presence_confidence` | The minimum confidence score for the hand presence score in the hand landmark detection model. In Video mode and Live stream mode, if the hand presence confidence score from the hand landmark model is below this threshold, Hand Landmarker triggers the palm detection model. Otherwise, a lightweight hand tracking algorithm determines the location of the hand(s) for subsequent landmark detections. | `0.0 - 1.0` | `0.5` |
| `min_tracking_confidence` | The minimum confidence score for the hand tracking to be considered successful. This is the bounding box IoU threshold between hands in the current frame and the last frame. In Video mode and Live stream mode, if the tracking fails, Hand Landmarker triggers hand detection. Otherwise, it skips the hand detection. | `0.0 - 1.0` | `0.5` |
| `result_callback` | Sets the result listener to receive the detection results asynchronously when the Hand Landmarker is in live stream mode. Only applicable when the running mode is set to `LIVE_STREAM`. | N/A | N/A |

Models
------

The Hand Landmarker uses a model bundle with two packaged models: a palm
detection model and a hand landmarks detection model. You need a model bundle
that contains both of these models to run this task.

| **Attention:** This MediaPipe Solutions Preview is an early release. [Learn more](/edge/mediapipe/solutions/about#notice).

| Model name | Input shape | Quantization type | Model Card | Versions |
|---|---|---|---|---|
| [HandLandmarker (full)](https://storage.googleapis.com/mediapipe-models/hand_landmarker/hand_landmarker/float16/latest/hand_landmarker.task) | 192 x 192, 224 x 224 | float 16 | [info](https://storage.googleapis.com/mediapipe-assets/Model%20Card%20Hand%20Tracking%20(Lite_Full)%20with%20Fairness%20Oct%202021.pdf) | [Latest](https://storage.googleapis.com/mediapipe-models/hand_landmarker/hand_landmarker/float16/latest/hand_landmarker.task) |

The hand landmark model bundle detects the keypoint localization of 21
hand-knuckle coordinates within the detected hand regions. The model was trained
on approximately 30K real-world images, as well as several rendered synthetic
hand models imposed over various backgrounds.

The hand landmarker model bundle contains a palm detection model and a hand
landmarks detection model.
The palm detection model locates hands within the input image, and the hand
landmarks detection model identifies specific hand landmarks on the cropped hand
image defined by the palm detection model.

Since running the palm detection model is time consuming, in video or live
stream running mode, Hand Landmarker uses the bounding box defined by the hand
landmarks model in one frame to localize the region of hands for subsequent
frames. Hand Landmarker only re-triggers the palm detection model if the hand
landmarks model no longer identifies the presence of hands or fails to track the
hands within the frame. This reduces the number of times Hand Landmarker
triggers the palm detection model.

Task benchmarks
---------------

Here are the task benchmarks for the whole pipeline, based on the above
pre-trained models. The latency result is the average latency on a Pixel 6 using
CPU / GPU.

| Model Name | CPU Latency | GPU Latency |
|---|---|---|
| HandLandmarker (full) | 17.12 ms | 12.27 ms |
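The detection-versus-tracking handoff described in the Models section can be
sketched in a few lines of plain Python. This is only an illustration of the
thresholding behavior implied by `min_tracking_confidence` (a bounding-box IoU
threshold) and `min_hand_presence_confidence`, not MediaPipe's actual
implementation; the helper names `iou` and `needs_palm_detection` are
hypothetical.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def needs_palm_detection(prev_box, curr_box, presence_score,
                         min_tracking_confidence=0.5,
                         min_hand_presence_confidence=0.5):
    """Decide whether to re-run the (expensive) palm detector.

    Detection re-triggers when there is no previous box to track from,
    when the landmark model's hand-presence score drops below its
    threshold, or when the box overlap between frames falls below the
    tracking threshold. Otherwise the cheap tracker keeps running.
    """
    if prev_box is None:  # first frame: nothing to track yet
        return True
    if presence_score < min_hand_presence_confidence:
        return True  # landmark model lost the hand
    return iou(prev_box, curr_box) < min_tracking_confidence
```

Raising `min_tracking_confidence` in this sketch makes the pipeline fall back
to palm detection more eagerly (more robust, slower); lowering it trusts the
tracker longer, which is why the real task skips palm detection on most frames.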