LangFlow: Continuous Diffusion Rivals Discrete in Language Modeling
Yuxin Chen, Chumeng Liang, Hangke Sui, Ruihan Guo, Chaoran Cheng, and 2 more
TLDR
LangFlow introduces the first continuous diffusion language model that rivals discrete counterparts, closing a significant performance gap in language modeling.
Key contributions
- Introduces a novel ODE-based NLL bound for principled evaluation of continuous flow-based language models.
- Proposes an information-uniform noise scheduling principle with a learnable Gumbel distribution scheduler.
- Develops an improved training protocol incorporating self-conditioning for enhanced likelihood and sample quality.
- Achieves strong perplexity (30.0 on LM1B, 24.6 on OpenWebText), matching top discrete DLMs at comparable scale and surpassing autoregressive baselines in zero-shot transfer.
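The paper's summary does not spell out the scheduler's exact parameterization; as a rough illustration of what a "learnable Gumbel distribution scheduler" could look like, the sketch below maps diffusion time to a noise level via the Gumbel CDF with learnable location and scale. The function name and parameter defaults are hypothetical, not taken from LangFlow.

```python
import math

def gumbel_noise_schedule(t: float, mu: float = 0.5, beta: float = 0.1) -> float:
    """Hypothetical noise level alpha(t) from a Gumbel CDF.

    t    : diffusion time in [0, 1]
    mu   : learnable location (where the schedule crosses exp(-1))
    beta : learnable scale (smaller beta -> sharper transition)

    Returns a value in (0, 1) that increases smoothly with t.
    """
    return math.exp(-math.exp(-(t - mu) / beta))
```

In a real model, `mu` and `beta` would be trainable parameters optimized alongside the network, letting the schedule concentrate noise where each token carries the most information.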
Why it matters
This paper demonstrates that continuous diffusion models can achieve competitive performance in language modeling, a domain where they previously lagged. LangFlow closes this gap, proving continuous diffusion is a promising and viable paradigm for future language model development.
Original Abstract
Continuous diffusion models have achieved strong performance across domains such as images. However, in language modeling, prior continuous diffusion language models (DLMs) lag behind discrete counterparts. In this work, we close this gap with LangFlow, the first continuous DLM to rival discrete diffusion. Our approach connects embedding-space DLMs to Flow Matching via Bregman divergence and introduces three key innovations: (1) a novel ODE-based NLL bound for principled evaluation of continuous flow-based language models; (2) an information-uniform principle for noise scheduling, motivating a learnable scheduler based on a Gumbel distribution; and (3) an improved training protocol incorporating self-conditioning, which enhances both likelihood and sample quality. LangFlow achieves strong performance across benchmarks, reaching a perplexity (PPL) of 30.0 on LM1B and 24.6 on OpenWebText. It matches top discrete DLMs at comparable scale and surpasses autoregressive baselines in zero-shot transfer across multiple benchmarks. LangFlow provides clear evidence that continuous diffusion is a competitive and promising paradigm for language modeling. https://github.com/nealchen2003/LangFlow