ArXiv TLDR

Estimate-Level Adjustment for Inference with Proxies under Random Distribution Shifts

arXiv:2605.06484

Steven Wilkins-Reeves, Alexandra N. M. Darmon, Deeksha Sinha

stat.ME, cs.LG, stat.ML

TLDR

A new estimate-level framework calibrates proxy-based inference under random distribution shifts, avoiding strict identifying assumptions and the need to retain individual-level response data.

Key contributions

  • Introduces an estimate-level framework for calibrating proxy-based inference under random distribution shifts.
  • Models the proxy-primary metric discrepancy as a random effect at the parameter level, with its distribution estimated from aggregated historical data (a sketch follows this list).
  • Avoids strict identifying assumptions and the need for retaining individual-level response data.
  • Can be layered on top of existing proxy-correction methods (e.g., prediction-powered inference or importance weighting) and provides a method-of-moments estimator plus a domain bootstrap to manage uncertainty when historical domains are scarce.
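
As a rough illustration of the estimate-level idea (not the paper's actual code), the Python sketch below fits the discrepancy distribution by method of moments from historical (proxy, primary) estimate pairs, then shifts a new proxy-based estimate and widens its confidence interval accordingly. It assumes Gaussian random effects and independent sampling errors; all names (mom_calibrate, adjust) are illustrative.

```python
import numpy as np

def mom_calibrate(proxy_est, primary_est, proxy_se, primary_se):
    """Method-of-moments fit of the discrepancy distribution from K
    historical domains: d_k = primary_k - proxy_k ~ N(mu, tau^2),
    observed with sampling variance proxy_se_k^2 + primary_se_k^2
    (illustrative model, assuming independent sampling errors)."""
    d = primary_est - proxy_est          # per-domain discrepancies
    s2 = proxy_se**2 + primary_se**2     # sampling variance of each d_k
    K = len(d)
    mu = d.mean()                        # mean discrepancy (proxy bias)
    # Variance of d in excess of sampling noise, floored at zero.
    tau2 = max(d.var(ddof=1) - s2.mean(), 0.0)
    var_mu = (tau2 + s2.mean()) / K      # uncertainty in mu itself
    return mu, tau2, var_mu

def adjust(theta_proxy, se_proxy, mu, tau2, var_mu, z=1.96):
    """Shift a new proxy-based estimate by mu and widen its interval
    by the random-effect variance and the error in the fitted mu."""
    theta = theta_proxy + mu
    half = z * np.sqrt(se_proxy**2 + tau2 + var_mu)
    return theta, (theta - half, theta + half)
```

Flooring tau^2 at zero mirrors standard random-effects meta-analysis practice; the paper's actual estimator and variance accounting may differ in detail.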

Why it matters

Current proxy-based inference methods often rest on strict, hard-to-validate assumptions, and violations can produce biased estimates and miscalibrated uncertainty. By empirically calibrating proxy inference against historical proxy-primary discrepancies, this framework offers a more robust and flexible alternative, improving the accuracy and reliability of statistical inferences under distribution shifts without requiring retained individual-level data.

Original Abstract

In many scientific domains, including experimentation, researchers rely on measurements of proxy outcomes to achieve faster and more frequent reads, especially when the primary outcome of interest is challenging to measure directly. While proxies offer a more readily accessible observation for inference, the ultimate goal is to draw statistical inferences about the primary outcome parameter and proxy data are typically imperfect in some ways. To correct for these imperfections, current statistical inference methods often depend on strict identifying assumptions (such as surrogacy, covariate/label shift, or missingness assumptions). These assumptions can be difficult to validate and may be violated by various additional sources of distribution shift, potentially leading to biased parameter estimates and miscalibrated uncertainty quantification. We introduce an estimate-level framework, inspired by domain adaptation techniques, to empirically calibrate proxy-based inference. This framework models the proxy-primary metric discrepancy as a random effect at the parameter level, estimating its distribution from aggregated historical observations across past domains (e.g., experiments, time periods, or distinct segments). This method avoids the requirement for retaining individual-level response data. Additionally, this adjustment can be layered on top of existing proxy-correction methods (such as prediction-powered inference or importance weighting) to account for additional biases not addressed by those corrections. To manage uncertainty when the number of historical domains is limited, we provide both a method-of-moments estimator and a domain bootstrap procedure. We further validate this approach using publicly available datasets and real-world experiments.
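
The abstract's domain bootstrap can be pictured as resampling whole historical domains with replacement and re-fitting the discrepancy distribution on each resample. A minimal sketch, reusing the illustrative mom_calibrate above and again assuming Gaussian discrepancies, follows; the paper's actual procedure may differ.

```python
def domain_bootstrap_interval(proxy_est, primary_est, proxy_se, primary_se,
                              theta_proxy, se_proxy,
                              n_boot=2000, alpha=0.05, seed=0):
    """Percentile interval for a new proxy-based estimate that
    propagates the uncertainty of having only a few historical
    domains by resampling whole domains with replacement."""
    rng = np.random.default_rng(seed)
    K = len(proxy_est)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, K, size=K)   # resample domains, not rows
        mu_b, tau2_b, _ = mom_calibrate(proxy_est[idx], primary_est[idx],
                                        proxy_se[idx], primary_se[idx])
        # Simulate the new domain's primary estimate under the refit:
        # one discrepancy draw plus the estimate's own sampling noise.
        draws[b] = theta_proxy + rng.normal(mu_b,
                                            np.sqrt(tau2_b + se_proxy**2))
    lo, hi = np.quantile(draws, [alpha / 2, 1 - alpha / 2])
    return lo, hi
```

Because whole domains are resampled, the interval reflects how unstable the fitted (mu, tau^2) is when the number of past domains is small, which is exactly the regime the abstract flags.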
