Block-wise Codeword Embedding for Reliable Multi-bit Text Watermarking

arXiv cs.CL / 5/4/2026

📰 NewsIdeas & Deep AnalysisModels & Research

共有:

Key Points

The paper argues that existing multi-bit text watermarking approaches for LLMs overemphasize capacity and improperly conflate decoding with detection, leading to unreliable verification.
It shows that ECC-based watermark extractors can produce catastrophic false positive rates, and that simple rejection-threshold tweaks often reduce true positive rates to near-random guessing.
The authors propose BREW (Block-wise Reliable Embedding for Watermarking), a framework based on “designated verification” rather than decoding-centric extraction.
BREW uses a two-stage process: blind message estimation via independent block voting and window-shifting verification that checks the embedded payload against local edits.
Experiments report strong reliability (TPR 0.965, FPR 0.02) under up to 10% synonym substitution and claim the approach is model-agnostic and scalable for forensic use.

Abstract

Recent multi-bit watermarking methods for large language models (LLMs) prioritize capacity over reliability, often conflating decoding with detection. Our analysis reveals that existing ECC-based extractors suffer from catastrophic false positive rates (FPR), and applying rejection thresholds merely collapses detection sensitivity (TPR) to random guessing. To resolve this structural limitation, we propose \textbf{BREW} (Block-wise Reliable Embedding for Watermarking), a framework shifting the paradigm to \emph{designated verification}. BREW employs a two-stage mechanism: (i) \textbf{blind message estimation} via independent block voting, followed by (ii) \textbf{window-shifting verification} that rigorously validates the payload against local edits. Experiments demonstrate that BREW achieves a TPR of 0.965 with an FPR of 0.02 under 10\% synonym substitution, demonstrating that the high-FPR issue is not an inherent trade-off of multi-bit watermarking, but a solvable structural flaw of prior decoding-centric designs. Our framework is model-agnostic and theoretically grounded, providing a scalable solution for reliable forensic deployment.

When Claims Freeze Because a Provider Record Drifted: The Case for Enrollment Repair Agents

Dev.to

The Cash Is Already Earned: Why Construction Pay Application Exceptions Fit an Agent Better Than SaaS

Dev.to

Why Ship-and-Debit Claim Recovery Is a Better Agent Wedge Than Another “AI Back Office” Tool

Dev.to

AI is getting better at doing things, but still bad at deciding what to do?

Reddit r/artificial

I Built an AI-Powered Chinese BaZi (八字) Fortune Teller — Here's What DeepSeek Revealed About Destiny

Dev.to

Block-wise Codeword Embedding for Reliable Multi-bit Text Watermarking

Key Points

Abstract

Related Articles

When Claims Freeze Because a Provider Record Drifted: The Case for Enrollment Repair Agents

The Cash Is Already Earned: Why Construction Pay Application Exceptions Fit an Agent Better Than SaaS

Why Ship-and-Debit Claim Recovery Is a Better Agent Wedge Than Another “AI Back Office” Tool

AI is getting better at doing things, but still bad at deciding what to do?

I Built an AI-Powered Chinese BaZi (八字) Fortune Teller — Here's What DeepSeek Revealed About Destiny

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer