Integrating Diagnostic Checks into Estimation
TLDR
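A new method integrates diagnostic checks into estimation by residualizing the baseline estimator against the vector of check statistics. This removes inference distortions from selective reporting, lowers variance when the baseline model is correctly specified, and minimizes worst-case bias under bounded local misspecification.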
Key contributions
- Eliminates inference distortions caused by check-based selective reporting.
- Reduces estimator variance without altering the estimand when the baseline model is correctly specified.
- Minimizes worst-case bias under bounded local misspecification within the class of linear adjustments.
Why it matters
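Empirical researchers routinely run diagnostic checks, such as covariate balance tests in RCTs, pre-trend tests in event studies, and instrument validity tests in IV designs, but these checks are typically treated as hurdles external to estimation. The paper argues that folding them into the estimator is a "free lunch": the three properties listed above hold simultaneously. In its application to the RCT in Kaur et al. (2024), where all balance checks pass comfortably, residualization reduces the standard error by an amount equivalent to approximately a 10% increase in sample size.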
Original Abstract
Empirical researchers often use diagnostic checks to assess the plausibility of their modeling assumptions, such as testing for covariate balance in RCTs, pre-trends in event studies, or instrument validity in IV designs. While these checks are traditionally treated as external hurdles to estimation, we argue they should be integrated into the estimation process itself. In particular, we propose residualizing one's baseline estimator against the vector of diagnostic check statistics to remove the component of baseline sampling variation explained by the diagnostic checks. This residualized estimator offers researchers a "free lunch," delivering three properties simultaneously: (i) eliminating inference distortions from check-based selective reporting; (ii) reducing variance without changing the estimand when the baseline model is correctly specified; and (iii) minimizing worst-case bias under bounded local misspecification within the class of linear adjustments. We apply our method to the RCT in Kaur et al. (2024) and find that, even in a setting where all balance checks pass comfortably, residualization increases the magnitude of the baseline point estimate and reduces its standard error, equivalent to approximately a 10% increase in sample size.
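The residualization step described in the abstract can be illustrated with a short sketch. This is not the authors' code. It assumes that the baseline estimate and the check statistics are jointly approximately normal, that the checks have mean zero under correct specification, and that an estimate of their joint covariance matrix is available (for example from a bootstrap). The function name `residualize` and its argument layout are hypothetical.

```python
import numpy as np

def residualize(beta_hat, checks, cov):
    """Residualize a scalar baseline estimate against diagnostic check statistics.

    Assumed (hypothetical) layout: `cov` is the (1+k) x (1+k) joint covariance
    of (beta_hat, checks), with the baseline estimator in position 0.
    """
    checks = np.atleast_1d(np.asarray(checks, dtype=float))
    cov = np.asarray(cov, dtype=float)

    var_b = cov[0, 0]     # variance of the baseline estimator
    cov_bd = cov[0, 1:]   # covariance between baseline estimator and checks
    cov_dd = cov[1:, 1:]  # covariance of the check statistics

    # Linear projection coefficients of the baseline estimator on the checks.
    gamma = np.linalg.solve(cov_dd, cov_bd)

    # Remove the component of the baseline estimate explained by the checks.
    beta_res = float(beta_hat - gamma @ checks)

    # Variance of the projection residual; never larger than var_b.
    var_res = float(var_b - cov_bd @ gamma)
    return beta_res, var_res, gamma

# Illustrative numbers only, not taken from the paper.
beta_res, var_res, gamma = residualize(
    beta_hat=2.0,
    checks=[0.4],
    cov=[[1.0, 0.5],
         [0.5, 1.0]],
)
print(beta_res, var_res, gamma)
```

Under these assumptions the adjustment is the usual linear projection residual, so the variance reduction is larger the more strongly the baseline estimator co-varies with the checks.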