Bootstrapping with AI/ML-generated labels
Timothy Christensen, Silvia Goncalves, Benoit Perron
TLDR
This paper introduces a coupled-label bootstrap that corrects the bias of OLS estimators and delivers valid inference when regressions include AI/ML-generated binary labels subject to misclassification error.
Key contributions
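- Shows that a fixed-label bootstrap is generally invalid for regressions with AI/ML-generated labels unless the latent true labels are independent of the other covariates.
- Proposes a coupled-label bootstrap that jointly resamples the true and imputed labels and is valid without requiring independence between the latent true labels and the other covariates.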
- Introduces variance correction for misclassification rate uncertainty.
- Suggests Hessian rotation to improve coverage in near-singular designs.
Why it matters
When AI/ML-generated labels are used as covariates, even small misclassification errors can induce large biases in OLS estimators and invalidate standard inference. The coupled-label bootstrap addresses this bias, and two finite-sample adjustments (a variance correction and a Hessian rotation) further improve coverage.
Original Abstract
AI/ML methods are increasingly used in economics to generate binary variables (or labels) via classification algorithms. When these generated variables are included as covariates in regressions, even small misclassification errors can induce large biases in OLS estimators and invalidate standard inference. We study whether the bootstrap can correct this bias and deliver valid inference. We first show that a seemingly natural fixed-label bootstrap, which generates data using estimated labels but relies on a corrupted version in estimation, is generally invalid unless a strong independence condition between the latent true labels and other covariates holds. We then propose a coupled-label bootstrap that jointly resamples the true and imputed labels, and show it is valid without this condition. Two finite-sample adjustments further improve coverage: a variance correction for uncertainty in estimated misclassification rates and a Hessian rotation for near-singular designs. We illustrate the methods in simulations and apply them to investigate the relationship between wages and remote work status.
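To make the abstract's claim about bias concrete, here is a minimal simulation sketch of how a misclassified binary covariate attenuates the OLS slope. It is not the paper's method or design: the function name, the parameter values, and the assumption that misclassification depends only on the true label are all illustrative choices. With a 10% share of true positives and 5% false-positive and false-negative rates, the population slope on the imputed label works out to about 0.67 of the true coefficient, because Cov(imputed, true) / Var(imputed) = 0.081 / 0.1204.

```python
import numpy as np

def simulate_ols_slope(n=200_000, beta=1.0, p_true=0.1, fpr=0.05, fnr=0.05, seed=0):
    """Compare OLS slopes using the true label and a misclassified label.

    Illustrative assumptions (not taken from the paper): a single binary
    regressor, standard normal errors, and misclassification that depends
    only on the true label.
    """
    rng = np.random.default_rng(seed)

    # Latent true binary label.
    d_true = rng.random(n) < p_true

    # Flip true positives with probability fnr, true negatives with probability fpr.
    flip = np.where(d_true, rng.random(n) < fnr, rng.random(n) < fpr)
    d_hat = d_true ^ flip  # imputed (AI/ML-generated) label

    # The outcome depends on the TRUE label only.
    y = beta * d_true + rng.normal(size=n)

    def slope(x, y):
        # OLS slope of y on x with an intercept: Cov(x, y) / Var(x).
        x = x.astype(float)
        return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

    return slope(d_true, y), slope(d_hat, y)

if __name__ == "__main__":
    b_true, b_hat = simulate_ols_slope()
    print(f"slope with true labels:    {b_true:.3f}")
    print(f"slope with imputed labels: {b_hat:.3f}")
```

In this sketch a 5% error rate in each direction removes roughly a third of the estimated effect, which is the kind of bias the bootstrap procedures in the paper are designed to address.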