QuantileMark: A Message-Symmetric Multi-bit Watermark for LLMs

arXiv cs.CL / 4/16/2026

Key Points

  • QuantileMark is presented as a white-box, message-symmetric multi-bit watermarking method for LLMs that aims to avoid message-dependent changes to text quality or verification outcomes.
  • The approach embeds bits by sampling from fixed-probability (equal-mass) quantile bins within the continuous cumulative probability interval, maintaining a constant 1/M probability budget across different decoding entropies.
  • For detection, the verifier reproduces the bin partition via teacher forcing, estimates posteriors over latent bins, and aggregates evidence to determine whether the watermark is present.
  • The paper proves message-unbiasedness (averaging over messages recovers the base distribution) and argues that the equal-mass bin design yields more uniform detection evidence strength across messages.
  • Experiments on tasks like C4 continuation and LFQA report improved multi-bit recovery and stronger detection robustness with negligible impact on generation quality, and the authors release accompanying code on GitHub.
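The embedding step described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name `embed_step` and the toy distribution are assumptions, and the sketch covers only the inverse-CDF sampling idea (partition [0, 1) into M equal-mass bins, sample a point inside the target symbol's bin, map it back to a token).

```python
import numpy as np

def embed_step(probs, symbol, M, rng):
    """One hypothetical QuantileMark-style embedding step (illustrative).

    Sample a point u uniformly from the equal-mass bin
    [symbol/M, (symbol+1)/M) of the cumulative interval [0, 1),
    then map u to a token via the inverse CDF of `probs`.
    Every symbol receives exactly a 1/M probability budget,
    regardless of the entropy of `probs`.
    """
    u = rng.uniform(symbol / M, (symbol + 1) / M)      # point inside the bin
    cdf = np.cumsum(probs)
    return int(np.searchsorted(cdf, u, side="right"))  # inverse-CDF lookup

# Message-unbiasedness check: averaging over a uniform message symbol makes
# u uniform on [0, 1), so the sampled tokens follow the base distribution.
rng = np.random.default_rng(0)
probs = np.array([0.7, 0.2, 0.1])  # a low-entropy next-token distribution
M = 4
counts = np.zeros(len(probs))
for _ in range(20000):
    sym = int(rng.integers(M))     # uniform random message symbol
    counts[embed_step(probs, sym, M, rng)] += 1
print(counts / counts.sum())       # ≈ [0.7, 0.2, 0.1]
```

Note how the low-entropy case that breaks vocabulary-partition schemes is harmless here: even when one token holds 70% of the mass, each of the M bins still gets exactly a 1/M slice of the cumulative interval.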

Abstract

As large language models become standard backends for content generation, practical provenance increasingly requires multi-bit watermarking. In provider-internal deployments, a key requirement is message symmetry: the message itself should not systematically affect either text quality or verification outcomes. Vocabulary-partition watermarks can break message symmetry in low-entropy decoding: some messages are assigned most of the probability mass, while others are forced to use tail tokens. This makes embedding quality and message decoding accuracy message-dependent. We propose QuantileMark, a white-box multi-bit watermark that embeds messages within the continuous cumulative probability interval [0, 1). At each step, QuantileMark partitions this interval into M equal-mass bins and samples strictly from the bin assigned to the target symbol, ensuring a fixed 1/M probability budget regardless of context entropy. For detection, the verifier reconstructs the same partition under teacher forcing, computes posteriors over latent bins, and aggregates evidence for verification. We prove message-unbiasedness, a property ensuring that the base distribution is recovered when averaging over messages. This provides a theoretical foundation for generation-side symmetry, while the equal-mass design additionally promotes uniform evidence strength across messages on the detection side. Empirical results on C4 continuation and LFQA show improved multi-bit recovery and detection robustness over strong baselines, with negligible impact on generation quality. Our code is available on GitHub (https://github.com/zzzjunlin/QuantileMark).
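The detection side of the abstract can be illustrated with one teacher-forced step. This is a hedged sketch, not the paper's verifier: the function name `bin_posterior` is hypothetical, and evidence aggregation across steps (e.g., summing log-posteriors per candidate symbol) is omitted. The observed token occupies a slice [lo, hi) of the reconstructed CDF; its overlap with each equal-mass bin, normalized by the slice width, gives a posterior over which latent bin produced it.

```python
import numpy as np

def bin_posterior(probs, token, M):
    """Posterior over latent bins for one observed token (illustrative).

    Under teacher forcing the verifier recovers `probs`, so the token's
    CDF slice [lo, hi) is known. Its overlap with each equal-mass bin
    [m/M, (m+1)/M), divided by the slice width, is P(symbol = m | token).
    """
    cdf = np.concatenate(([0.0], np.cumsum(probs)))
    lo, hi = cdf[token], cdf[token + 1]
    edges = np.arange(M + 1) / M                       # bin boundaries
    overlap = np.clip(np.minimum(hi, edges[1:]) - np.maximum(lo, edges[:-1]),
                      0.0, None)
    return overlap / (hi - lo)

probs = np.array([0.7, 0.2, 0.1])
post = bin_posterior(probs, token=1, M=4)  # token 1 spans CDF slice [0.7, 0.9)
print(post)  # bins [0.5, 0.75) and [0.75, 1.0) split the slice 0.25 / 0.75
```

A high-probability token spans several bins and yields a diffuse posterior (weak evidence), while a narrow-slice token pins down its bin almost exactly; because the bins are equal-mass, this evidence strength depends only on the token's slice width, not on which message was embedded.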