Causal Representation Learning from General Environments under Nonparametric Mixing
Ignavier Ng, Shaoan Xie, Xinshuai Dong, Peter Spirtes, Kun Zhang
TLDR
A new method recovers latent causal DAGs and variables from general environments with nonparametric mixing, relaxing common restrictive assumptions.
Key contributions
- Formalizes desiderata for causal representation learning in "general environments."
- Recovers latent causal DAGs and variables under nonparametric mixing and nonlinear models.
- Leverages sufficient change conditions on causal mechanisms up to third-order derivatives.
- Presents, to the authors' knowledge, the first results to fully recover latent DAGs from general environments under nonparametric mixing.
Why it matters
Existing causal representation learning methods often rely on restrictive assumptions, such as single-node, coupled, or hard interventions, or linearity of the mixing function or latent causal model, that are frequently violated in practice. By showing that the latent DAG can be fully recovered under a nonparametric mixing function and nonlinear latent causal models, this work substantially broadens the applicability of causal representation learning to real-world problems.
Original Abstract
Causal representation learning aims to recover the latent causal variables and their causal relations, typically represented by directed acyclic graphs (DAGs), from low-level observations such as image pixels. A prevailing line of research exploits multiple environments, which assume how data distributions change, including single-node interventions, coupled interventions, or hard interventions, or parametric constraints on the mixing function or the latent causal model, such as linearity. Despite the novelty and elegance of the results, they are often violated in real problems. Accordingly, we formalize a set of desiderata for causal representation learning that applies to a broader class of environments, referred to as general environments. Interestingly, we show that one can fully recover the latent DAG and identify the latent variables up to minor indeterminacies under a nonparametric mixing function and nonlinear latent causal models, such as additive (Gaussian) noise models or heteroscedastic noise models, by properly leveraging sufficient change conditions on the causal mechanisms up to third-order derivatives. These represent, to our knowledge, the first results to fully recover the latent DAG from general environments under nonparametric mixing. Notably, our results match or improve upon many existing works, but require less restrictive assumptions about changing environments.
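To make the setting concrete, here is a minimal simulation of the data-generating process the abstract describes: latent variables follow a causal DAG with additive (Gaussian) noise mechanisms that change across environments, while a fixed nonlinear mixing function maps latents to observations. This is an illustrative sketch of the *setup*, not the paper's recovery method; all coefficients, the two-variable DAG z1 → z2, and the leaky-ReLU mixing network are assumptions chosen for the example.

```python
# Illustrative sketch (not the paper's algorithm): simulate latent causal
# variables whose mechanisms change across environments, observed through
# a fixed nonlinear mixing function.
import numpy as np

rng = np.random.default_rng(0)

def sample_latents(n, coef, noise_scale):
    """Sample from the latent DAG z1 -> z2 under one environment's mechanisms.

    z2 follows a nonlinear additive-noise model; coef and noise_scale are the
    environment-specific parts of its causal mechanism.
    """
    z1 = rng.normal(0.0, 1.0, size=n)
    z2 = np.tanh(coef * z1) + noise_scale * rng.normal(size=n)
    return np.stack([z1, z2], axis=1)

def mixing(z):
    """Fixed nonlinear mixing shared by all environments.

    An invertible linear map followed by a leaky ReLU stands in for a
    generic smooth, injective nonparametric mixing function.
    """
    w = np.array([[1.0, 0.5],
                  [-0.3, 1.2]])          # invertible (det = 1.35), env-independent
    h = z @ w
    return np.where(h > 0, h, 0.1 * h)   # elementwise leaky ReLU

# Each environment changes z2's mechanism -- the kind of "sufficient change"
# across environments that the identifiability conditions exploit.
envs = [(1.0, 0.5), (2.5, 1.0), (-1.5, 0.3)]   # (coef, noise_scale) per env
data = {i: mixing(sample_latents(500, c, s)) for i, (c, s) in enumerate(envs)}

for i, x in data.items():
    print(f"environment {i}: observations of shape {x.shape}")
```

Only the observations `data[i]` would be available to a learner; the goal of the paper's framework is to recover the latent DAG (z1 → z2) and the latent variables themselves, up to minor indeterminacies, from such multi-environment observations.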