ArXiv TLDR

Combating Data Laundering in LLM Training

2604.01904

Muxing Li, Zesheng Ye, Sharon Li, Feng Liu

cs.CR cs.AI

TLDR

This paper introduces Synthesis Data Reversion (SDR), a method that detects proprietary data laundered before LLM training, even after laundering has erased the signals that standard detection relies on.

Key contributions

  • Addresses data laundering, which transforms proprietary data to evade detection in LLM training.
  • Infers the unknown laundering transformation from black-box access to the target LLM, using an auxiliary LLM to synthesize queries that mimic the laundered data.
  • Introduces Synthesis Data Reversion (SDR), which abstracts the transformation into a high-level goal plus concrete details, then refines them to guide query synthesis.
  • SDR consistently strengthens data misuse detection across various LLMs and laundering practices.
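
The detection signal that laundering erases, and that SDR restores, is the standard loss-comparison test: a sample whose target-LLM loss sits well below that of comparable unseen data was likely in the training corpus. A toy illustration (not the paper's code; `detect_misuse`, the margin, and the numeric losses are all assumptions):

```python
def detect_misuse(sample_loss: float, reference_losses: list[float],
                  margin: float = 0.5) -> bool:
    """Flag the sample as likely training data if its target-LLM loss
    is markedly lower than the average loss on comparable unseen text."""
    ref_mean = sum(reference_losses) / len(reference_losses)
    return sample_loss < ref_mean - margin

# Illustrative numbers only: the proprietary sample scores 1.2 while
# held-out text of the same style averages ~3.0, so it is flagged.
print(detect_misuse(1.2, [2.9, 3.1, 3.0]))
```

Under laundering, the model never sees the original sample verbatim, so its loss matches the unseen-data baseline and this test fails — which is the gap SDR targets.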

Why it matters

Data laundering poses a significant threat to data rights owners seeking to protect their intellectual property from unauthorized LLM training. This work provides a crucial, practical countermeasure, enabling more robust detection of data misuse. It helps ensure fair use and accountability in the rapidly evolving LLM ecosystem.

Original Abstract

Data rights owners can detect unauthorized data use in large language model (LLM) training by querying with proprietary samples. Often, superior performance (e.g., higher confidence or lower loss) on a sample relative to the untrained data implies it was part of the training corpus, as LLMs tend to perform better on data they have seen during training. However, this detection becomes fragile under data laundering, a practice of transforming the stylistic form of proprietary data, while preserving critical information to obfuscate data provenance. When an LLM is trained exclusively on such laundered variants, it no longer performs better on originals, erasing the signals that standard detections rely on. We counter this by inferring the unknown laundering transformation from black-box access to the target LLM and, via an auxiliary LLM, synthesizing queries that mimic the laundered data, even if rights owners have only the originals. As the search space of finding true laundering transformations is infinite, we abstract such a process into a high-level transformation goal (e.g., "lyrical rewriting") and concrete details (e.g., "with vivid imagery"), and introduce synthesis data reversion (SDR) that instantiates this abstraction. SDR first identifies the most probable goal for synthesis to narrow the search; it then iteratively refines details so that synthesized queries gradually elicit stronger detection signals from the target LLM. Evaluated on the MIMIR benchmark against diverse laundering practices and target LLM families (Pythia, Llama2, and Falcon), SDR consistently strengthens data misuse detection, providing a practical countermeasure to data laundering.
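
The goal-then-details search the abstract describes can be sketched in miniature. Everything below is a hypothetical stand-in, not the paper's implementation: `target_loss` plays the black-box target LLM, `synthesize` plays the auxiliary LLM, and the goal/detail lists are invented examples echoing the abstract's "lyrical rewriting ... with vivid imagery".

```python
import difflib

# Hidden laundered variant the target LLM was trained on (unknown to the
# rights owner in practice; exposed here only so the toy loss can score it).
HIDDEN_LAUNDERED = "a lyrical retelling with vivid imagery of the secret sample"

def target_loss(query: str) -> float:
    """Black-box proxy: loss shrinks as the query nears the laundered text."""
    sim = difflib.SequenceMatcher(None, query, HIDDEN_LAUNDERED).ratio()
    return 1.0 - sim

# The abstraction SDR instantiates: high-level goals and concrete details.
GOALS = ["a lyrical retelling", "a bullet-point summary", "a formal paraphrase"]
DETAILS = ["with vivid imagery", "in plain words", "with archaic diction"]

def synthesize(original: str, goal: str, detail: str = "") -> str:
    """Auxiliary-LLM stand-in: rewrite the original under a goal (+ detail)."""
    parts = [goal, detail, "of", original]
    return " ".join(p for p in parts if p)

def sdr_search(original: str) -> tuple[str, float]:
    # Step 1: pick the most probable goal (lowest loss) to narrow the search.
    best_goal = min(GOALS, key=lambda g: target_loss(synthesize(original, g)))
    # Step 2: iteratively refine details so synthesized queries elicit
    # progressively stronger detection signals from the target LLM.
    best_query = synthesize(original, best_goal)
    best_loss = target_loss(best_query)
    for detail in DETAILS:
        query = synthesize(original, best_goal, detail)
        if target_loss(query) < best_loss:
            best_query, best_loss = query, target_loss(query)
    return best_query, best_loss

query, loss = sdr_search("the secret sample")
print(query, round(loss, 3))
```

Here the search recovers the hidden laundered phrasing exactly, so the final loss drops to zero — the stronger membership signal the rights owner then feeds into a standard detection test.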
