The MediaPipe Text Embedder task lets you create a numeric representation of text data to
capture its semantic meaning. This functionality is frequently used to compare
the semantic similarity of two pieces of text using mathematical comparison
techniques such as Cosine Similarity. This task operates on text data with a
machine learning (ML) model, and outputs a numeric representation of the text
data as a list of high-dimensional feature vectors, also known as embedding
vectors, in either floating-point or quantized form.
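Cosine similarity, mentioned above, measures how closely two embedding vectors point in the same direction, independent of their magnitude. As a minimal illustrative sketch (not MediaPipe's actual implementation), it can be computed like this:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors.

    Returns a value in [-1.0, 1.0]; 1.0 means the vectors point in the
    same direction (semantically similar texts), 0.0 means orthogonal.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

In practice you would pass the feature vectors produced by the Text Embedder task to a function like this, or use the task's built-in similarity utility described below.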
Start using this task by following one of these implementation guides for your
target platform. These platform-specific guides walk you through a basic
implementation of this task, including a recommended model and a code example
with recommended configuration options:

- **Android** - [Code example](https://github.com/google-ai-edge/mediapipe-samples/tree/main/examples/text_embedder/android) - [Guide](./android)
- **Python** - [Code example](https://colab.sandbox.google.com/github/googlesamples/mediapipe/blob/main/examples/text_embedder/python/text_embedder.ipynb) - [Guide](./python)
- **Web** - [Code example](https://codepen.io/mediapipe-preview/pen/XWBVZmE) - [Guide](./web_js)
Task details
This section describes the capabilities, inputs, outputs, and configuration
options of this task.
Features
- **Input text processing** - Supports out-of-graph tokenization for models
  without in-graph tokenization.
- **Embedding similarity computation** - Built-in utility function to compute
  the [cosine similarity](https://en.wikipedia.org/wiki/Cosine_similarity)
  between two feature vectors.
- **Quantization** - Supports scalar quantization for the feature vectors.
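Scalar quantization compresses a floating-point feature vector into one byte per dimension. The following is a hypothetical sketch of the idea, assuming unit-norm inputs in [-1.0, 1.0]; MediaPipe's actual quantization parameters are not specified here:

```python
def scalar_quantize(vector):
    """Map unit-norm floats in [-1.0, 1.0] to signed bytes (illustrative).

    Each dimension is scaled to the int8 range and clamped, trading a
    little precision for a 4x smaller embedding.
    """
    return [max(-128, min(127, round(x * 127))) for x in vector]

def dequantize(byte_vector):
    """Approximate inverse: recover floats from the quantized bytes."""
    return [b / 127.0 for b in byte_vector]
```

Because cosine similarity only depends on vector direction, quantized embeddings can still be compared meaningfully, at a small cost in accuracy.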
Task inputs
Text Embedder accepts the following input data type:
- String

Task outputs
Text Embedder outputs a list of embeddings consisting of:
- Embedding: the feature vector itself, either in floating-point form or
  scalar-quantized.
- Head index: the index of the head that produced this embedding.
- Head name (optional): the name of the head that produced this embedding.
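Conceptually, the output structure can be modeled as follows. Note that the class and field names here are illustrative, not the exact types exposed by the MediaPipe API:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Embedding:
    # The feature vector, as floating-point or scalar-quantized values.
    embedding: List[float]
    # Index of the model head that produced this embedding.
    head_index: int
    # Optional name of that head.
    head_name: Optional[str] = None

@dataclass
class EmbeddingResult:
    # One entry per model head; single-head models produce one embedding.
    embeddings: List[Embedding] = field(default_factory=list)
```

Most text embedding models have a single head, so the result list typically contains one embedding with head index 0.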
Configuration options
This task has the following configuration options:

| Option Name | Description | Value Range | Default Value |
|---|---|---|---|
| `l2_normalize` | Whether to normalize the returned feature vector with L2 norm. Use this option only if the model does not already contain a native L2_NORMALIZATION TFLite Op. In most cases this is already the case, and L2 normalization is thus achieved through TFLite inference with no need for this option. | `Boolean` | `False` |
| `quantize` | Whether the returned embedding should be quantized to bytes via scalar quantization. Embeddings are implicitly assumed to be unit-norm and therefore any dimension is guaranteed to have a value in [-1.0, 1.0]. Use the `l2_normalize` option if this is not the case. | `Boolean` | `False` |
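To make the `l2_normalize` option concrete: L2 normalization rescales a feature vector to unit length, which guarantees every dimension lies in [-1.0, 1.0] (the precondition `quantize` relies on). A minimal sketch of the operation:

```python
import math

def l2_normalize(vector):
    """Scale a feature vector to unit L2 norm (sketch of what the option does)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        # A zero vector cannot be normalized; return it unchanged.
        return list(vector)
    return [x / norm for x in vector]
```

After normalization, the cosine similarity of two vectors reduces to a plain dot product, which is why many models bake this step into the graph as a TFLite op.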
Models
We offer a default, recommended model when you start developing with this task.

Universal Sentence Encoder model (recommended)
This model uses a [dual encoder architecture](https://aclanthology.org/2022.emnlp-main.640.pdf)
and was trained on various question-answer datasets.
Consider the following pairs of sentences:

- ("it's a charming and often affecting journey", "what a great and fantastic trip")
- ("I like my phone", "I hate my phone")
- ("This restaurant has a great gimmick", "We need to double-check the details of our plan")

The text embeddings in the first two pairs will have a higher cosine similarity
than the embeddings in the third pair, because the first two pairs of sentences
share a common topic of "trip sentiment" and "phone opinion" respectively, while
the third pair of sentences does not share a common topic.
Note that although the two sentences in the second pair have opposing sentiments,
they have a high similarity score because they share a common topic.

| Model name | Input shape | Quantization type | Versions |
|---|---|---|---|
| [Universal Sentence Encoder](https://storage.googleapis.com/mediapipe-models/text_embedder/universal_sentence_encoder/float32/latest/universal_sentence_encoder.tflite) | string, string, string | None (float32) | [Latest](https://storage.googleapis.com/mediapipe-models/text_embedder/universal_sentence_encoder/float32/latest/universal_sentence_encoder.tflite) |
Task benchmarks
Here are the task benchmarks for the whole pipeline based on the above
pre-trained model. The latency result is the average latency on Pixel 6 using
CPU / GPU.

| Model Name | CPU Latency | GPU Latency |
|---|---|---|
| Universal Sentence Encoder | 18.21ms | - |
Last updated 2025-01-13 UTC.