SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation
ICLR 2024
Abstract
Existing watermarking algorithms are vulnerable to paraphrase attacks because of their token-level design. To address this issue, we propose SemStamp, a robust sentence-level semantic watermarking algorithm based on locality-sensitive hashing (LSH), which partitions the semantic space of sentences. The algorithm encodes and LSH-hashes a candidate sentence generated by an LLM, and conducts sentence-level rejection sampling until the sampled sentence falls in watermarked partitions in the semantic embedding space. A margin-based constraint is used to enhance its robustness. To show the advantages of our algorithm, we propose a "bigram" paraphrase attack using the paraphrase that has the fewest bigram overlaps with the original sentence. This attack is shown to be effective against the existing token-level watermarking method. Experimental results show that our novel semantic watermark algorithm is not only more robust than the previous state-of-the-art method on both common and bigram paraphrase attacks, but also is better at preserving the quality of generation.
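The mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_embed` is a hypothetical stand-in for a real sentence encoder, the random-hyperplane form of LSH is assumed, and the choice to seed the watermarked regions from the previous sentence's signature, along with the values of `DIM`, `N_BITS`, `ratio`, and `margin`, are assumptions made only for illustration. The `bigram_attack` helper shows the selection rule of the bigram paraphrase attack on a list of already generated paraphrases.

```python
import random
import zlib

DIM = 16     # toy embedding dimension (assumption)
N_BITS = 3   # number of LSH hyperplanes, giving 2**3 regions (assumption)

def toy_embed(sentence):
    """Hypothetical stand-in for a sentence encoder: hashed bag of words."""
    vec = [0.0] * DIM
    for word in sentence.lower().split():
        h = zlib.crc32(word.encode("utf-8"))
        vec[h % DIM] += 1.0 if (h >> 8) & 1 else -1.0
    norm = sum(x * x for x in vec) ** 0.5 or 1.0  # avoid division by zero
    return [x / norm for x in vec]

def make_hyperplanes(seed, n_bits=N_BITS, dim=DIM):
    """Draw unit-length random normal vectors, one per signature bit."""
    rng = random.Random(seed)
    planes = []
    for _ in range(n_bits):
        v = [rng.gauss(0, 1) for _ in range(dim)]
        n = sum(x * x for x in v) ** 0.5
        planes.append([x / n for x in v])
    return planes

def lsh_signature(vec, planes):
    """Return (region index, smallest |cosine| to any hyperplane)."""
    sig, min_margin = 0, float("inf")
    for plane in planes:
        dot = sum(a * b for a, b in zip(vec, plane))
        sig = (sig << 1) | (1 if dot > 0 else 0)
        min_margin = min(min_margin, abs(dot))
    return sig, min_margin

def valid_regions(prev_sig, n_bits=N_BITS, ratio=0.5):
    """Pick the 'watermarked' regions pseudo-randomly.

    Seeding from the previous sentence's signature is an assumption
    of this sketch.
    """
    rng = random.Random(prev_sig)
    regions = list(range(2 ** n_bits))
    rng.shuffle(regions)
    return set(regions[: int(len(regions) * ratio)])

def rejection_sample(candidates, prev_sig, planes, margin=0.0):
    """Return the first candidate that lands in a valid region and stays
    at least `margin` away from every hyperplane, else (None, None)."""
    valid = valid_regions(prev_sig)
    for sent in candidates:
        sig, m = lsh_signature(toy_embed(sent), planes)
        if sig in valid and m >= margin:
            return sent, sig
    return None, None

def bigrams(sentence):
    toks = sentence.lower().split()
    return set(zip(toks, toks[1:]))

def bigram_attack(original, paraphrases):
    """Choose the paraphrase sharing the fewest bigrams with the original."""
    ref = bigrams(original)
    return min(paraphrases, key=lambda p: len(bigrams(p) & ref))
```

In this sketch the candidate list plays the role of repeated LLM samples. A larger `margin` rejects sentences whose embeddings lie close to a partition boundary, which is the intuition behind the margin-based constraint: a small semantic shift from paraphrasing is then less likely to move the sentence into a different region.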
Key words
AI-generated text detection, large language model, natural language watermark, locality-sensitive hashing, paraphrase attack, sentence encoder, contrastive learning