
Block-wise Codeword Embedding for Reliable Multi-bit Text Watermarking

2605.00348

Joeun Kim, HoEun Kim, Dongsup Jin, Young-Sik Kim

cs.CR, cs.CL

TLDR

BREW is a multi-bit LLM watermarking framework built around designated verification: it first estimates the embedded message, then verifies it, keeping the true positive rate high and the false positive rate low.

Key contributions

  • Addresses the high false positive rate (FPR) and collapsed true positive rate (TPR) of existing ECC-based multi-bit LLM watermark extractors.
  • Introduces BREW, a framework for "designated verification" to improve reliability over decoding-centric methods.
  • Employs a two-stage mechanism: blind message estimation via block voting and window-shifting verification.
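  • Achieves 0.965 TPR at 0.02 FPR under 10% synonym substitution, indicating that high FPR is a solvable structural flaw of decoding-centric designs rather than an inherent trade-off of multi-bit watermarking.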
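
For readers less familiar with the two metrics quoted above, here is a minimal sketch of how TPR and FPR are computed from a detector's yes/no decisions. The function name `detection_rates` and the toy inputs are illustrative assumptions, not the paper's evaluation code or data.

```python
def detection_rates(decisions, labels):
    """Return (TPR, FPR) for boolean detector decisions.

    decisions[i] is True when the detector flags text i as watermarked.
    labels[i] is True when text i really is watermarked.
    Hypothetical helper for illustration only.
    """
    pairs = list(zip(decisions, labels))
    true_pos = sum(1 for d, l in pairs if d and l)
    false_pos = sum(1 for d, l in pairs if d and not l)
    positives = sum(1 for _, l in pairs if l)
    negatives = len(pairs) - positives
    # TPR: share of watermarked texts that were flagged.
    tpr = true_pos / positives if positives else float("nan")
    # FPR: share of unwatermarked texts that were flagged anyway.
    fpr = false_pos / negatives if negatives else float("nan")
    return tpr, fpr


# Toy data: 4 watermarked texts (3 flagged), 4 clean texts (1 flagged).
labels = [True, True, True, True, False, False, False, False]
decisions = [True, True, True, False, True, False, False, False]
tpr, fpr = detection_rates(decisions, labels)
```

On this toy data the detector has a TPR of 0.75 and an FPR of 0.25. A detector that flags everything scores a perfect TPR with an FPR of 1.0, which is why the two numbers are reported together.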

Why it matters

Prior multi-bit watermark extractors treat decoding as detection, which the authors show leads to very high false positive rates, and adding a rejection threshold drops detection sensitivity to the level of random guessing. BREW separates the two steps under a designated verification paradigm and reports 0.965 TPR at 0.02 FPR under 10% synonym substitution. The authors describe the framework as model-agnostic and suited to forensic deployment of LLM watermarks.

Original Abstract

Recent multi-bit watermarking methods for large language models (LLMs) prioritize capacity over reliability, often conflating decoding with detection. Our analysis reveals that existing ECC-based extractors suffer from catastrophic false positive rates (FPR), and applying rejection thresholds merely collapses detection sensitivity (TPR) to random guessing. To resolve this structural limitation, we propose BREW (Block-wise Reliable Embedding for Watermarking), a framework shifting the paradigm to designated verification. BREW employs a two-stage mechanism: (i) blind message estimation via independent block voting, followed by (ii) window-shifting verification that rigorously validates the payload against local edits. Experiments demonstrate that BREW achieves a TPR of 0.965 with an FPR of 0.02 under 10% synonym substitution, demonstrating that the high-FPR issue is not an inherent trade-off of multi-bit watermarking, but a solvable structural flaw of prior decoding-centric designs. Our framework is model-agnostic and theoretically grounded, providing a scalable solution for reliable forensic deployment.
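
The abstract does not spell out the algorithm, so the following is only a toy sketch of the general "estimate, then verify" idea, not BREW itself. It assumes each token has already been reduced to a single 0/1 vote, that blocks have a fixed length, and that verification is a simple agreement score checked at a few offsets. The names `estimate_message` and `verify`, the `max_shift` and `threshold` parameters, and their default values are all hypothetical.

```python
def estimate_message(votes, block_len):
    """Stage 1 (toy): majority vote inside each fixed-length block.

    votes is a list of per-token bits (0 or 1); ties resolve to 0.
    """
    message = []
    for start in range(0, len(votes) - block_len + 1, block_len):
        block = votes[start:start + block_len]
        message.append(1 if sum(block) * 2 > len(block) else 0)
    return message


def verify(votes, message, block_len, max_shift=2, threshold=0.8):
    """Stage 2 (toy): score the candidate message at several offsets.

    The expected per-token bits are compared with the observed votes at
    shifts in [-max_shift, max_shift], so that a few inserted or deleted
    tokens do not misalign every block. Returns (accepted, best_score).
    """
    expected = [bit for bit in message for _ in range(block_len)]
    best = 0.0
    for shift in range(-max_shift, max_shift + 1):
        matches = total = 0
        for i, bit in enumerate(expected):
            j = i + shift
            if 0 <= j < len(votes):
                matches += int(votes[j] == bit)
                total += 1
        if total:
            best = max(best, matches / total)
    return best >= threshold, best


# Toy input: three 4-token blocks carrying bits 1, 0, 1 with two flipped votes.
votes = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1]
message = estimate_message(votes, block_len=4)
accepted, score = verify(votes, message, block_len=4)
```

Here the estimated message is [1, 0, 1] and the best agreement is 10 of 12 votes, which clears the assumed 0.8 threshold. A strictly alternating vote sequence such as [0, 1, 0, 1, ...] never reaches that threshold and is rejected, which illustrates why a separate verification step can refuse text that a pure decoder would still map to some message.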
