The model library contains a collection of state-of-the-art machine learning models for image, text, and audio features. These models are optimized for use with the Google Tensor SDK, so you can bring advanced AI features directly to Pixel devices and run them smoothly on-device.
Depth estimation
| Model | License |
|---|---|
| depth_anything_v2 | Apache-2.0 |
| midas_v2_1 | BSD-3-Clause |
Face reconstruction
| Model | License |
|---|---|
| facemap_3dmm | BSD-3-Clause |
Image and text understanding
| Model | License |
|---|---|
| clip | MIT |
| mobileclip_image_encoder | MIT |
| mobileclip_text_encoder | MIT |
| tinyclip | MIT |
Image classification
| Model | License |
|---|---|
| alexnet | BSD-3-Clause |
| beit | BSD-3-Clause |
| convnext_base | BSD-3-Clause |
| convnext_tiny | BSD-3-Clause |
| densenet121 | BSD-3-Clause |
| efficientformer_l1 | Apache-2.0 |
| efficientformerv2_s0 | Apache-2.0 |
| efficientnet_b0 | BSD-3-Clause |
| efficientnet_b1 | BSD-3-Clause |
| efficientnet_b2 | BSD-3-Clause |
| efficientnet_b3 | BSD-3-Clause |
| efficientnet_b4 | BSD-3-Clause |
| efficientnet_b5 | BSD-3-Clause |
| efficientnet_b6 | BSD-3-Clause |
| efficientnet_b7 | BSD-3-Clause |
| efficientnet_v2_s | BSD-3-Clause |
| efficientnetv2_m | Apache-2.0 |
| efficientvit_cls_b2 | BSD-3-Clause |
| efficientvit_cls_l2 | BSD-3-Clause |
| efficientvit_seg_l2 | Apache-2.0 |
| googlenet | BSD-3-Clause |
| inception_v3 | BSD-3-Clause |
| levit | Apache-2.0 |
| maxvit_t | BSD-3-Clause |
| mnasnet0 | BSD-3-Clause |
| mobile_vit | BSD-3-Clause |
| mobilenet_v2 | Apache-2.0 |
| mobilenet_v3_large | BSD-3-Clause |
| mobilenet_v3_small | BSD-3-Clause |
| mobilenetv4_conv_l | BSD-3-Clause |
| mobilenetv4_conv_m | BSD-3-Clause |
| mobilenetv4_conv_s | BSD-3-Clause |
| mobilenetv4_hybrid_l | BSD-3-Clause |
| mobilenetv4_hybrid_medium | Apache-2.0 |
| nfnet | BSD-3-Clause |
| pvt_v2_b1 | BSD-3-Clause |
| pvt_v2_b3 | BSD-3-Clause |
| regnety | Apache-2.0 |
| resnest14d | BSD-3-Clause |
| resnet101 | BSD-3-Clause |
| resnet152 | BSD-3-Clause |
| resnet18 | BSD-3-Clause |
| resnet50 | BSD-3-Clause |
| resnext101 | AI-HUB-MODELS |
| resnext50 | BSD-3-Clause |
| shufflenet_v2 | BSD-3-Clause |
| squeezenet1 | BSD-3-Clause |
| swin_small | BSD-3-Clause |
| swin_tiny | BSD-3-Clause |
| tf_efficientnetv2_m | Apache-2.0 |
| vgg16 | BSD-3-Clause |
| vit_base_patch16 | Apache-2.0 |
| vit_small_patch16 | BSD-3-Clause |
| wide_resnet101 | BSD-3-Clause |
| wide_resnet50 | BSD-3-Clause |
Image segmentation
| Model | License |
|---|---|
| hrnet_w48_ocr | MIT |
| mediapipe_selfie | Apache-2.0 |
| unet_segmentation | GPL-3.0 |
Image super-resolution
| Model | License |
|---|---|
| esrgan | Apache-2.0 |
Object detection
| Model | License |
|---|---|
| 3d_deep_box | MIT |
| conditional_detr_resnet50 | Apache-2.0 |
| detr_resnet50 | Apache-2.0 |
| detr_resnet50_dc5 | Apache-2.0 |
| detr_resnet101 | Apache-2.0 |
| detr_resnet101_dc5 | Apache-2.0 |
| faceattribnet | AI-HUB-MODELS |
| lightweight_face_detection | AI-HUB-MODELS |
| mediapipe_hand_detection | Apache-2.0 |
| person_foot_detection | AI-HUB-MODELS |
| ppe_detection | AI-HUB-MODELS |
| yolo_v4 | Apache-2.0 |
| yolo_v6 | GPL-3.0 |
| yolo_v7 | GPL-3.0 |
| yolos_tiny | Apache-2.0 |
| yolox_tiny | Apache-2.0 |
Pose estimation
| Model | License |
|---|---|
| hrnet_pose | MIT |
| lite_hrnet_pose | Apache-2.0 |
| mediapipe_pose | Apache-2.0 |
| movenet | MIT |
Question answering
| Model | License |
|---|---|
| tinyroberta | CC-BY-4.0 |
Semantic segmentation
| Model | License |
|---|---|
| bgnet | Apache-2.0 |
| bisenet | No license file |
| ddrnet23_slim | MIT |
| deeplabv3_mobilenet_v3_large | BSD-3-Clause |
| deeplabv3_plus_mobilenet | MIT |
| deeplabv3_resnet101 | BSD-3-Clause |
| deeplabv3_resnet50 | BSD-3-Clause |
| fcn_resnet50 | BSD-3-Clause |
| ffnet_122ns_lowres | BSD-3-Clause |
| ffnet_40s | BSD-3-Clause |
| ffnet_54s | BSD-3-Clause |
| ffnet_78s_lowres | BSD-3-Clause |
| isnet | Apache-2.0 |
| lraspp_mobilenet_v3_large | BSD-3-Clause |
| sam_vit_b | Apache-2.0 |
| sam_vit_l | Apache-2.0 |
| segformer | NVIDIA-SCSL |
| segment_anything_model | Apache-2.0 |
| u2net_full | Apache-2.0 |
| u2net_lite | Apache-2.0 |
Speech recognition
| Model | License |
|---|---|
| deepspeech | BSD-2-Clause |
| torchaudio_emformer_rnnt_base | BSD-2-Clause |
| wav2vec2_base_960h | Apache-2.0 |
Super-resolution
| Model | License |
|---|---|
| quicksrnet_large | BSD-3-Clause |
| quicksrnet_small | BSD-3-Clause |
| real_esrgan_general_x4v3 | BSD-3-Clause |
| real_esrgan_x4plus | BSD-3-Clause |
| xlsr | BSD-3-Clause |
Text classification
| Model | License |
|---|---|
| distilbert | Apache-2.0 |
| mobilebert | Apache-2.0 |