ArXiv TLDR

DCR: Counterfactual Attractor Guidance for Rare Compositional Generation

arXiv: 2605.06512

Taewon Kang, Matthias Zwicker

cs.CV

TLDR

DCR (Default Completion Repulsion) is a training-free method that uses counterfactual attractor guidance to keep diffusion models from collapsing toward common completions on rare compositional prompts.

Key contributions

  • Identifies "default completion bias" in diffusion models for rare compositional generation.
  • Introduces DCR, a training-free framework that explicitly models and suppresses default completion bias.
  • Constructs a counterfactual attractor and uses projection-based repulsion to prevent collapse to common alternatives.
  • Enhances compositional fidelity for rare prompts while maintaining visual quality, without retraining or architecture changes.

Why it matters

This paper addresses a key limitation of diffusion models: their struggle with rare but plausible compositions. DCR offers a novel, training-free approach to overcome "default completion bias," making diffusion models more versatile and controllable. It also provides insights into intrinsic model biases.

Original Abstract

Diffusion models generate realistic visual content, yet often fail to produce rare but plausible compositions. When prompted with combinations that are valid but underrepresented in training data, such as a snowy beach or a rainbow at night, the generation process frequently collapses toward more common alternatives. We identify this failure mode as default completion bias, where denoising trajectories are implicitly attracted toward high-frequency semantic configurations. Existing guidance mechanisms do not explicitly model this competing tendency and therefore struggle to prevent such collapse. We introduce Default Completion Repulsion (DCR), a training-free framework that explicitly models and suppresses default completion behavior. DCR constructs a counterfactual attractor by relaxing the rare compositional factor while preserving surrounding semantics, inducing an alternative denoising trajectory reflecting the model's preferred completion. We define the discrepancy between target and attractor trajectories as a counterfactual drift, and propose a projection-based repulsion mechanism that removes guidance components aligned with this drift direction. This suppresses undesired frequent completions while preserving other semantic components. DCR operates entirely within the standard diffusion sampling process without retraining or architectural modification. Experiments on rare compositional prompts show that DCR improves compositional fidelity while maintaining visual quality. Our analysis further shows that the framework exposes and counteracts intrinsic model biases, offering a new perspective on controllable generation beyond explicit constraint enforcement.
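The core mechanism in the abstract — measuring a counterfactual drift between the target and attractor trajectories, then projecting it out of the guidance signal — can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the drift definition (attractor prediction minus target prediction), and the choice to remove only positively aligned components are all assumptions made here for clarity.

```python
import numpy as np

def counterfactual_drift(eps_target, eps_attractor):
    """Direction the relaxed-prompt (attractor) trajectory pulls,
    relative to the rare-prompt (target) trajectory.
    Both inputs are noise predictions of the same shape (assumed)."""
    return eps_attractor - eps_target

def repel_default_completion(guidance, drift, eps=1e-8):
    """Remove the guidance component aligned with the drift direction.

    A sketch of projection-based repulsion: project the guidance onto
    the unit drift vector and subtract that component, leaving the
    remaining semantics untouched."""
    g = guidance.ravel().astype(float)
    d = drift.ravel().astype(float)
    d_unit = d / (np.linalg.norm(d) + eps)
    coeff = float(np.dot(g, d_unit))
    # Assumption: only suppress components that push *toward* the
    # default completion (positive alignment); orthogonal and opposing
    # components pass through unchanged.
    if coeff > 0:
        g = g - coeff * d_unit
    return g.reshape(guidance.shape)

# Toy usage: guidance partly aligned with the drift direction.
guidance = np.array([1.0, 1.0])
drift = np.array([1.0, 0.0])
repelled = repel_default_completion(guidance, drift)
# The repelled guidance is orthogonal to the drift direction.
```

In a real sampler, `eps_target` and `eps_attractor` would come from two denoising passes at the same timestep — one conditioned on the rare prompt ("a snowy beach") and one on its relaxed counterpart ("a beach") — with the repulsion applied at each step before the update.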
