DARTS: Targeting Prognostic Covariates in Budget-Constrained Sequential Experiments

arXiv:2605.06608

Kateryna Husar, Alexander Volfovsky

stat.ML, cs.LG, stat.ME

TLDR

DARTS is a new method for budget-constrained sequential experiments that adaptively selects prognostic covariates to improve treatment effect estimation efficiency.

Key contributions

  • Introduces DARTS, an adaptive method for selecting prognostic covariates under budget constraints.
  • Employs a Thompson sampler to learn and acquire the most informative covariates sequentially.
  • Integrates selected covariates with rerandomization and regression adjustment to improve the precision of average treatment effect (ATE) estimates.
  • Provides theoretical guarantees for inferential validity and asymptotic coverage despite adaptive selection.
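
To make the acquisition step concrete, here is a minimal sketch of a budgeted Thompson sampler in plain Python. It is not the authors' algorithm: the Gaussian reward model, the greedy value-per-cost selection rule, and the names `select_covariates` and `simulate` are all assumptions made for illustration, and the "reward" stands in for whatever measure of prognostic value the paper actually uses.

```python
import random


def select_covariates(reward_sum, pulls, costs, budget, rng):
    """One Thompson step: sample a value per covariate, then fill the budget greedily."""
    sampled = []
    for s, n in zip(reward_sum, pulls):
        # Assumed model: N(0, 1) prior and unit-variance reward noise, so the
        # posterior over a covariate's value is N(s / (n + 1), 1 / (n + 1)).
        sampled.append(rng.gauss(s / (n + 1), (n + 1) ** -0.5))
    # Greedy knapsack heuristic (an assumption): best sampled value per unit cost first.
    order = sorted(range(len(costs)), key=lambda j: sampled[j] / costs[j], reverse=True)
    chosen, spent = [], 0.0
    for j in order:
        if spent + costs[j] <= budget:
            chosen.append(j)
            spent += costs[j]
    return chosen


def simulate(true_value, costs, budget, n_batches, seed=0):
    """Run the sampler over batches and return how often each covariate was acquired."""
    rng = random.Random(seed)
    k = len(costs)
    reward_sum, pulls = [0.0] * k, [0] * k
    for _ in range(n_batches):
        for j in select_covariates(reward_sum, pulls, costs, budget, rng):
            # Hypothetical feedback: a noisy reading of how prognostic covariate j is.
            reward_sum[j] += rng.gauss(true_value[j], 1.0)
            pulls[j] += 1
    return pulls


if __name__ == "__main__":
    # Made-up values and costs for four covariates; budget of 3 units per batch.
    print(simulate([0.8, 0.1, 0.5, 0.05], [2.0, 1.0, 2.0, 1.0], budget=3.0, n_batches=200))
```

Sampling from the posterior rather than using its mean is what drives exploration: covariates that have rarely been measured have wide posteriors and are occasionally drawn high enough to be acquired.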

Why it matters

Collecting pretreatment covariates in randomized controlled trials (RCTs) is costly. DARTS spends a limited measurement budget on the covariates that are most prognostic, which improves the efficiency of treatment effect estimation while preserving inferential validity.

Original Abstract

Randomized controlled trials typically assume that prognostic covariates are known and available at no cost. In practice, obtaining high-dimensional pretreatment data is costly, forcing a trade-off between covariate-adaptive precision and a measurement budget. We introduce Dynamic Adaptive Rerandomization via Thompson Sampling (DARTS), which treats covariate acquisition as a sequential optimization problem embedded within a design-based causal inference task. A budgeted combinatorial Thompson sampler learns which covariates are most prognostic across successive batches; selected covariates then drive rerandomization and regression adjustment to reduce batch-level average treatment effect variance. Our primary theoretical contribution is a decoupling result: adaptive covariate selection based on past batches preserves batch-level randomization validity, and the cumulative inverse-variance weighted estimator achieves at least nominal asymptotic coverage. We further derive a Bayes risk bound for the acquisition layer that matches the minimax lower bound up to logarithmic factors. Empirically, DARTS systematically concentrates the budget on informative features, significantly closing the efficiency gap to oracle designs while maintaining strict inferential validity.
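
The cumulative inverse-variance weighted estimator mentioned in the abstract can be illustrated with the textbook fixed-effect combination of batch-level estimates. This is a sketch under assumptions: the function name `ivw_estimate` is hypothetical, the batch variances are treated as known, and the interval uses the normal approximation with z = 1.96 for 95% coverage. The paper's estimator may differ in how variances are estimated.

```python
import math


def ivw_estimate(estimates, variances, z=1.96):
    """Combine batch-level ATE estimates, weighting each by 1 / variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    point = sum(w * e for w, e in zip(weights, estimates)) / total
    # For independent batches with known variances, the combined variance is 1 / sum(weights).
    se = math.sqrt(1.0 / total)
    return point, (point - z * se, point + z * se)


if __name__ == "__main__":
    # Made-up batch estimates: the more precise batch (variance 1.0) gets more weight.
    print(ivw_estimate([2.0, 4.0], [1.0, 3.0]))
```

Batches with smaller variance receive larger weights, so variance reductions from better covariate selection in later batches translate directly into a tighter cumulative interval.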
