# SynthID: Tools for watermarking and detecting LLM-generated Text

*Last updated: 2025-04-09 (UTC).*

Generative artificial intelligence (GenAI) can generate a wider array of highly diverse content at scales previously unimagined. While the majority of this use is for legitimate purposes, there is concern that it could contribute to misinformation and misattribution problems. Watermarking is one technique for mitigating these potential impacts. Watermarks that are imperceptible to humans can be applied to AI-generated content, and detection models can score arbitrary content to indicate the likelihood that it has been watermarked.

[SynthID](https://deepmind.google/technologies/synthid/) is a technology from Google DeepMind that watermarks and identifies AI-generated content by embedding digital watermarks directly into AI-generated images, audio, text, or video. SynthID Text has been open sourced to make watermarking for text generation available to developers. You can read the [paper in *Nature*](https://www.nature.com/articles/s41586-024-08025-4) for a more complete technical description of the method.

A production-grade implementation of SynthID Text is available in [Hugging Face Transformers v4.46.0+](https://huggingface.co/blog/synthid-text), which you can try out in the official [SynthID Text Space](https://huggingface.co/spaces/google/synthid-text).
A reference implementation is also available [on GitHub](https://github.com/google-deepmind/synthid-text) that may be useful for open source maintainers and contributors looking to bring this technique to other frameworks.

Watermark application
---------------------

Practically speaking, SynthID Text is a logits processor, applied to your model's generation pipeline after [Top-K and Top-P](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/adjust-parameter-values), that augments the model's logits using a pseudorandom *g*-function to encode watermarking information in a way that helps you determine if the text was generated by your model, without significantly affecting text quality. See the [paper](https://www.nature.com/articles/s41586-024-08025-4) for a complete technical description of the algorithm and analyses of how different configuration values affect performance.

Watermarks are [configured](https://huggingface.co/docs/transformers/v4.46.0/en/internal/generation_utils#transformers.SynthIDTextWatermarkingConfig) to parameterize the *g*-function and how it is applied during generation. Each watermarking configuration you use ***should be stored securely and privately***, otherwise your watermark may be trivially replicable by others.

You must define two parameters in every watermarking configuration:

- The `keys` parameter is a list of unique, random integers that are used to compute *g*-function scores across the model's vocabulary. The length of this list determines how many layers of watermarking are applied. See Appendix C.1 in the [paper](https://www.nature.com/articles/s41586-024-08025-4) for more details.
- The `ngram_len` parameter is used to balance robustness and detectability; the larger the value, the more detectable the watermark will be, at the cost of being more brittle to changes. A length of 5 is a good default value.

You can further configure the watermark based on your performance needs:

- A sampling table is configured by two properties, `sampling_table_size` and `sampling_table_seed`. You want to use a `sampling_table_size` of at least \\( 2^{16} \\) to ensure an unbiased and stable *g*-function when sampling, but be aware that the size of the sampling table impacts the amount of memory required at inference time. You can use any integer you like as the `sampling_table_seed`.
- Repeated *n*-grams in the `context_history_size` preceding tokens are not watermarked, to improve detectability.

No additional training is required to generate text with a SynthID Text watermark using your models, only a [watermarking configuration](https://huggingface.co/docs/transformers/v4.46.0/en/internal/generation_utils#transformers.SynthIDTextWatermarkingConfig) that gets passed to the model's `.generate()` method to activate the SynthID Text [logits processor](https://huggingface.co/docs/transformers/v4.46.0/en/internal/generation_utils#transformers.SynthIDTextWatermarkLogitsProcessor). See the [blog post](https://huggingface.co/blog/synthid-text) and [Space](https://huggingface.co/spaces/google/synthid-text) for code examples showing how to apply a watermark in the Transformers library.

Watermark detection and verifiability
-------------------------------------

Watermark detection is probabilistic. A Bayesian detector is provided with [Hugging Face Transformers](https://huggingface.co/docs/transformers/v4.46.0/en/internal/generation_utils#transformers.SynthIDTextWatermarkDetector) and on [GitHub](https://github.com/google-deepmind/synthid-text). This detector can output three possible detection states (watermarked, not watermarked, or uncertain), and its behavior can be customized by setting two threshold values to achieve a specific false positive and false negative rate.
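As a toy illustration of how two thresholds yield three detection states (the function name and threshold values below are illustrative, not part of the library's API):

```python
def classify_score(score: float, low: float = 0.01, high: float = 0.99) -> str:
    """Map a detector's watermark probability onto three detection states.

    `low` and `high` are illustrative thresholds; in practice they are
    tuned to hit target false positive and false negative rates.
    """
    if score >= high:
        return "watermarked"
    if score <= low:
        return "not watermarked"
    return "uncertain"
```

Lowering `high` or raising `low` shrinks the uncertain band, trading a more decisive detector for higher error rates in one direction or the other.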
See Appendix C.8 in the [paper](https://www.nature.com/articles/s41586-024-08025-4) for more details.

Models that use the same tokenizer can also share a watermarking configuration and detector, thus sharing a common watermark, so long as the detector's training set includes examples from all models that share the watermark.

Once you have a trained detector, you have a choice about whether and how you expose it to your users, and to the public more generally.

- The **fully-private** option does not release or expose the detector in any way.
- The **semi-private** option does not release the detector, but does expose it through an API.
- The **public** option releases the detector for others to download and use.

You and your organization need to decide which detection verification approach is best for your needs, based on your ability to support the associated infrastructure and processes.

Limitations
-----------

SynthID Text watermarks are robust to some transformations, such as cropping pieces of text, modifying a few words, or mild paraphrasing, but this method does have limitations.

- Watermark application is less effective on factual responses, as there is less opportunity to augment generation without decreasing accuracy.
- Detector confidence scores can be greatly reduced when an AI-generated text is thoroughly rewritten or translated to another language.

SynthID Text is not designed to directly stop motivated adversaries from causing harm. However, it can make it harder to use AI-generated content for malicious purposes, and it can be combined with other approaches to give better coverage across content types and platforms.
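For reference, the watermark application workflow described above can be sketched with the Transformers API roughly as follows. This is a sketch, not a definitive recipe: the model name, prompt, and `keys` values are placeholders (real keys must be unique random integers generated by you and kept secret), and running it requires downloading the model weights.

```python
# Sketch of watermarked generation with Hugging Face Transformers v4.46.0+.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    SynthIDTextWatermarkingConfig,
)

# Placeholder model; any causal LM supported by .generate() should work.
model_name = "google/gemma-2-2b-it"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Keep this configuration secret: anyone who has it can replicate your watermark.
watermarking_config = SynthIDTextWatermarkingConfig(
    keys=[654, 400, 836, 123, 340, 443, 597, 160, 57, 29],  # placeholder values
    ngram_len=5,  # the recommended default
)

inputs = tokenizer(["Write a short poem about the sea."], return_tensors="pt")
outputs = model.generate(
    **inputs,
    watermarking_config=watermarking_config,  # activates the SynthID logits processor
    do_sample=True,
    max_new_tokens=128,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```

On the detection side, the same library provides the `SynthIDTextWatermarkDetector` and `BayesianDetectorModel` classes for scoring text against a configuration; see the blog post linked above for a walkthrough of training a detector.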