ArXiv TLDR

Covariate Balancing and Riesz Regression Should Be Guided by the Neyman Orthogonal Score in Debiased Machine Learning

2605.06386

Masahiro Kato

econ.EM, cs.LG, math.ST, stat.ME, stat.ML

TLDR

This paper argues that debiased machine learning should use regressor balancing guided by the Neyman orthogonal score, not just covariate balancing.

Key contributions

  • DML balancing functions should be derived from the Neyman orthogonal score, not only covariates.
  • Covariate balancing is a special case, appropriate when regression error depends solely on covariates.
  • Proposes "regressor balancing" via Riesz regression with (D,Z) basis functions for general DML.
  • This method addresses treatment-specific components of the score error that covariate balancing misses.
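
To make the balancing idea concrete, here is a minimal sketch of linear Riesz regression for the ATE functional with a basis of the full regressor X = (D, Z). It is an illustration under stated assumptions, not the paper's implementation: the basis (treatment-specific copies of (1, z)), the simulated data, and the function names `basis` and `riesz_weights_ate` are all hypothetical. The sketch minimizes the empirical Riesz loss, the average of alpha(X)^2 - 2[alpha(1, Z) - alpha(0, Z)], over alpha(X) = phi(X)'beta, which has a closed-form solution.

```python
import numpy as np

def basis(d, z):
    """Hypothetical basis phi(D, Z): treatment-specific copies of (1, z)."""
    b = np.column_stack([np.ones_like(z), z])
    # First two columns are active for treated units, last two for controls.
    return np.column_stack([d[:, None] * b, (1.0 - d)[:, None] * b])

def riesz_weights_ate(d, z):
    """Linear Riesz regression for the ATE functional m(W, g) = g(1, Z) - g(0, Z)."""
    phi = basis(d, z)
    # The functional applied to each basis function: phi(1, Z) - phi(0, Z).
    m_phi = basis(np.ones_like(d), z) - basis(np.zeros_like(d), z)
    gram = phi.T @ phi / len(d)
    target = m_phi.mean(axis=0)
    # First-order condition of the quadratic Riesz loss: gram @ beta = target.
    beta = np.linalg.solve(gram, target)
    return phi @ beta, phi, target

# Simulated data (an assumption for illustration only).
rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=n)
d = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-z))).astype(float)

alpha, phi, target = riesz_weights_ate(d, z)

# Balancing check: the weighted basis means should match the target moments.
balance_gap = (alpha[:, None] * phi).mean(axis=0) - target
print(np.abs(balance_gap).max())
```

Because the first-order condition is exactly the balancing equation, the fitted weights balance every basis function of (D, Z) in sample, including the treatment-specific ones. With this particular basis, the weights average to 1 over the treated-indicator column and to -1 over the control-indicator column.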

Why it matters

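This position paper clarifies when covariate balancing is sufficient in debiased machine learning and when it is not. For ATE estimation under treatment effect heterogeneity, balancing functions of the covariates alone can leave a treatment-specific part of the score error unbalanced. Balancing basis functions of the full regressor (D, Z) via Riesz regression is proposed as the general principle, with covariate balancing as a special case.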

Original Abstract

This position paper argues that, in debiased machine learning, balancing functions should be derived from the Neyman orthogonal score, not chosen only as functions of covariates. Covariate balancing is effective when the regression error entering the score can be represented by functions of covariates alone, and it is the natural finite-dimensional approximation for targets such as ATT counterfactual means. For ATE estimation under treatment effect heterogeneity, however, the score error generally contains treatment-specific components because the outcome regression is a function of the full regressor $X=(D,Z)$. In that case, balancing common functions of $Z$ can leave the treatment-specific component unbalanced. We therefore advocate regressor balancing, implemented by Riesz regression with basis functions of $X$, as the general balancing principle for DML. The position is not that covariate balancing is invalid, but that covariate balancing should be understood as the special case that is appropriate when the score-relevant regression error is a function of covariates alone.
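
As a worked illustration of why the score determines what should be balanced, consider the standard Neyman orthogonal score for the ATE. The notation below, other than $X=(D,Z)$, is assumed here for illustration and is not taken from the paper: $g$ is the outcome regression, $\alpha$ the Riesz representer, $\phi$ a vector of basis functions, and $\gamma$ a coefficient vector describing the regression error.

```latex
% Neyman orthogonal score for the ATE, with W = (Y, D, Z) and X = (D, Z):
\psi(W; \theta, g, \alpha) = g(1, Z) - g(0, Z) + \alpha(X)\,\bigl(Y - g(X)\bigr) - \theta

% Suppose the regression error is approximately linear in a basis of X:
\hat{g}(X) - g(X) \approx \phi(X)^{\top} \gamma

% Then the error's first-order contribution to the estimator is
\frac{1}{n} \sum_{i=1}^{n}
  \Bigl[ \phi(1, Z_i) - \phi(0, Z_i) - \hat{\alpha}(X_i)\, \phi(X_i) \Bigr]^{\top} \gamma ,

% which vanishes when the weights satisfy the balancing condition
\frac{1}{n} \sum_{i=1}^{n} \hat{\alpha}(X_i)\, \phi(X_i)
  = \frac{1}{n} \sum_{i=1}^{n} \bigl[ \phi(1, Z_i) - \phi(0, Z_i) \bigr].
```

Under this sketch, the functions that need balancing are whichever $\phi$ span the regression error entering the score. If that error depends on $D$ as well as $Z$, a basis in $Z$ alone does not cover it.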
