ArXiv TLDR

Unified Precision-Guaranteed Stopping Rules for Contextual Learning

arXiv: 2604.07913

Mingrui Ding, Qiuhong Zhao, Siyang Gao, Jing Dong

math.OC · stat.ML

TLDR

This paper introduces unified, precision-guaranteed stopping rules for contextual learning, enabling efficient data collection with assured policy accuracy.

Key contributions

  • Introduces unified stopping rules for contextual learning with unknown sampling variances.
  • Applies to both unstructured and structured linear settings, using GLR statistics.
  • Derives new time-uniform deviation inequalities for self-normalized GLR evidence.
  • Achieves target precision with significantly fewer samples than existing methods.

Why it matters

This framework offers a practical solution for determining sufficient data collection in personalized decision-making. It significantly reduces unnecessary sampling across diverse environments like historical datasets, simulations, and real systems, allowing practitioners to maintain high decision quality more efficiently.

Original Abstract

Contextual learning seeks to learn a decision policy that maps an individual's characteristics to an action through data collection. In operations management, such data may come from various sources, and a central question is when data collection can stop while still guaranteeing that the learned policy is sufficiently accurate. We study this question under two precision criteria: a context-wise criterion and an aggregate policy-value criterion. We develop unified stopping rules for contextual learning with unknown sampling variances in both unstructured and structured linear settings. Our approach is based on generalized likelihood ratio (GLR) statistics for pairwise action comparisons. To calibrate the corresponding sequential boundaries, we derive new time-uniform deviation inequalities that directly control the self-normalized GLR evidence and thus avoid the conservativeness caused by decoupling mean and variance uncertainty. Under the Gaussian sampling model, we establish finite-sample precision guarantees for both criteria. Numerical experiments on synthetic instances and two case studies demonstrate that the proposed stopping rules achieve the target precision with substantially fewer samples than benchmark methods. The proposed framework provides a practical way to determine when enough information has been collected in personalized decision problems. It applies across multiple data-collection environments, including historical datasets, simulation models, and real systems, enabling practitioners to reduce unnecessary sampling while maintaining a desired level of decision quality.
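To make the core idea concrete, here is a minimal illustrative sketch of a GLR-based stopping check for comparing two actions with Gaussian rewards and unknown variance. This is not the paper's method: it assumes a shared variance across the two actions and uses a generic heuristic boundary of log((1 + log n)/δ) in place of the paper's calibrated time-uniform deviation bounds; the function names are invented for this example.

```python
import math

def log_glr_two_actions(x, y):
    """Log generalized likelihood ratio for H0: equal means vs. H1: distinct
    means, assuming Gaussian rewards with a shared unknown variance (an
    illustrative simplification of the paper's setting)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    m1 = sum(x) / n1
    m2 = sum(y) / n2
    m0 = (sum(x) + sum(y)) / n
    # Residual sums of squares under H0 (pooled mean) and H1 (separate means).
    ss0 = sum((v - m0) ** 2 for v in x) + sum((v - m0) ** 2 for v in y)
    ss1 = sum((v - m1) ** 2 for v in x) + sum((v - m2) ** 2 for v in y)
    if ss1 == 0:
        return float("inf")  # perfect separation: evidence is unbounded
    return 0.5 * n * math.log(ss0 / ss1)

def should_stop(x, y, delta=0.05):
    """Stop sampling once the GLR evidence clears a time-dependent boundary.
    The boundary below is a placeholder heuristic, not the paper's
    calibrated sequential boundary."""
    n = len(x) + len(y)
    threshold = math.log((1 + math.log(n)) / delta)
    return log_glr_two_actions(x, y) > threshold
```

With well-separated samples (e.g., rewards near 0 for one action and near 1 for the other) the evidence clears the boundary quickly, while overlapping samples keep the rule collecting data.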
