ArXiv TLDR

Causality-Encoded Diffusion Models for Interventional Sampling and Edge Inference

2604.21843

Li Chen, Xiaotong Shen, Wei Pan

stat.ME, stat.ML

TLDR

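This paper introduces causality-encoded diffusion models, which build a known directed acyclic graph (DAG) into conditional diffusion training so that the sampler supports interventional sampling and a resampling-based test for directed edges.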

Key contributions

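  • Proposes causality-encoded diffusion models that incorporate a known DAG by training conditional diffusion models consistent with the graph factorisation.
  • Enables interventional sampling by fixing intervened variables and propagating their effects through the graph during reverse diffusion.
  • Develops a resampling-based test for directed edges, with asymptotic control of type I error.
  • Establishes convergence guarantees whose rates depend on the maximum local dimension rather than the ambient dimension.
  • Demonstrates improved interventional distribution recovery relative to baselines in simulations, and practical utility on flow cytometry data.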
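
The interventional sampling idea in the list above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the three-node DAG, the names `PARENTS`, `TOPO_ORDER`, `sample_node` and `sample`, and the linear-Gaussian `sample_node`, which merely stands in for a trained conditional diffusion sampler. The sketch also samples node by node in topological order, whereas the paper describes propagating effects during reverse diffusion.

```python
import random

# Hypothetical toy DAG (not from the paper): X0 -> X1, X0 -> X2, X1 -> X2.
PARENTS = {0: [], 1: [0], 2: [0, 1]}
TOPO_ORDER = [0, 1, 2]  # parents always precede their children


def sample_node(parent_values, rng):
    """Stand-in for a trained conditional diffusion sampler of one node
    given its parents. Toy assumption: sum of parents plus N(0, 1) noise."""
    return sum(parent_values) + rng.gauss(0.0, 1.0)


def sample(n, interventions=None, seed=0):
    """Draw n joint samples; `interventions` maps node -> fixed value."""
    interventions = interventions or {}
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = {}
        for node in TOPO_ORDER:
            if node in interventions:
                # Intervened node: ignore its parents and clamp the value.
                x[node] = interventions[node]
            else:
                # Other nodes: sample from the conditional given the parents,
                # so upstream interventions propagate to descendants.
                x[node] = sample_node([x[p] for p in PARENTS[node]], rng)
        rows.append([x[k] for k in TOPO_ORDER])
    return rows


observational = sample(20000)
interventional = sample(20000, interventions={1: 2.0})  # do(X1 = 2)
```

In this toy model, clamping X1 to 2 leaves the distribution of X0 unchanged (it is not a descendant of X1), while the mean of X2 shifts to roughly 2 because X2 depends on X0 and X1.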

Why it matters

Standard diffusion models do not encode causal structure, so on their own they cannot support causal analysis. By building a known DAG into the model, this work provides interventional sampling and directed edge inference, backed by convergence guarantees and asymptotic type I error control. This extends the use of diffusion models to scientific questions such as assessing disputed signalling linkages.

Original Abstract

Standard diffusion models are flexible estimators of complex distributions, but they do not encode causal structures and therefore do not by themselves support causal analysis. We propose a causality-encoded diffusion framework that incorporates a known directed acyclic graph by training conditional diffusion models consistent with the graph factorisation. The resulting sampler approximately recovers the observational distribution and enables interventional sampling by fixing intervened variables while propagating effects through the graph during reverse diffusion. Building on this interventional simulator, we develop a resampling-based test for directed edges that generates null replicates under a candidate graph. We establish convergence guarantees for observational and interventional distribution estimation, with rates governed by the maximum local dimension rather than the ambient dimension, and prove asymptotic control of type I error for the edge test. Simulations show improved interventional distribution recovery relative to baselines, with near-nominal size and favourable power in inference. An application to flow cytometry data demonstrates practical utility of the proposed method in assessing disputed signalling linkages.
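
For reference, the graph factorisation and its interventional counterpart can be written in standard DAG notation. The symbols below are assumptions, since the abstract does not fix notation: $\mathrm{pa}(j)$ denotes the parents of node $j$, $S$ the set of intervened nodes, and $s$ their fixed values. The second line is the usual truncated factorisation, given here as a plausible reading of "fixing intervened variables while propagating effects through the graph", not as the paper's exact formulation.

```latex
\begin{align*}
p(x_1, \dots, x_d)
  &= \prod_{j=1}^{d} p\bigl(x_j \mid x_{\mathrm{pa}(j)}\bigr), \\
p\bigl(x_{-S} \mid \mathrm{do}(X_S = s)\bigr)
  &= \prod_{j \notin S} p\bigl(x_j \mid x_{\mathrm{pa}(j)}\bigr) \Big|_{x_S = s}.
\end{align*}
```

Each factor involves only a node and its parents, which is consistent with the abstract's statement that the rates are governed by the maximum local dimension rather than the ambient dimension.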
