
Tail allocation for conformal prediction intervals

2604.25202

Tianying Wang

stat.ME, math.ST, stat.ML

TLDR

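Introduces TA-CQR, a split-conformal method for regression that targets the shortest single prediction interval by optimizing how the nominal miscoverage is split between the two tails, while retaining exact finite-sample marginal coverage under exchangeability.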

Key contributions

  • Proposes Tail-Allocation Conformalized Quantile Regression (TA-CQR) for regression settings where the prediction set must be a single interval.
  • Estimates the lower-tail allocation by searching over quantile-defined cores, then applies nonnegative additive split-conformal calibration.
  • Retains exact finite-sample marginal coverage under exchangeability.
  • Characterizes the oracle geometry, proves local recovery of the selected allocation and core, and shows that calibration radii are asymptotically negligible under endpoint-density conditions.
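
The search-then-calibrate recipe in the bullets above can be sketched in a few lines. This is a deliberately simplified, covariate-free toy, not the paper's algorithm: empirical quantiles of a training sample stand in for a fitted quantile regression, a single global allocation is chosen, and the function name `ta_cqr_interval` and its parameters are hypothetical.

```python
import numpy as np

def ta_cqr_interval(y_train, y_cal, alpha=0.1, n_grid=21):
    """Toy sketch: choose a lower-tail allocation, then calibrate additively.

    Assumption: no covariates, so empirical quantiles replace quantile regression.
    Returns (beta, lower, upper), where beta is the selected lower-tail mass.
    """
    y_train = np.asarray(y_train, dtype=float)
    y_cal = np.asarray(y_cal, dtype=float)

    # Candidate allocations: beta goes to the lower tail, alpha - beta to the upper.
    betas = np.linspace(0.0, alpha, n_grid)
    upper_levels = np.clip(betas + 1.0 - alpha, 0.0, 1.0)
    lowers = np.quantile(y_train, betas)
    uppers = np.quantile(y_train, upper_levels)

    # Select the quantile-defined core with the smallest estimated length.
    best = int(np.argmin(uppers - lowers))
    beta, lo, hi = float(betas[best]), float(lowers[best]), float(uppers[best])

    # CQR-style conformity score: signed distance by which a point falls outside the core.
    scores = np.maximum(lo - y_cal, y_cal - hi)

    # Split-conformal radius: the ceil((n + 1)(1 - alpha))-th smallest score.
    n = len(y_cal)
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    radius = np.inf if k > n else float(np.sort(scores)[k - 1])

    # Nonnegative additive calibration: the core may be widened but never shrunk.
    radius = max(radius, 0.0)
    return beta, lo - radius, hi + radius
```

On right-skewed data the selected allocation should sit well below the equal-tailed value `alpha / 2`, because the shortest interval pushes most of the miscoverage into the long upper tail. Clipping the radius at zero can only widen the interval, so the usual split-conformal coverage guarantee is unaffected.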

Why it matters

This paper addresses the practical constraint that a regression prediction set must be reported as a single interval. Under that constraint, TA-CQR targets the shortest interval at the nominal level rather than an equal-tailed one, keeps exact finite-sample marginal coverage, and comes with theory that bounds how far the calibrated interval's length can be from the oracle's.

Original Abstract

We study split-conformal prediction for regression when the reported prediction set must be a single interval, at target marginal coverage $1-\alpha$, where $\alpha$ is the nominal miscoverage level. Under this reporting constraint, the natural conditional target is the shortest interval with conditional mass at least $1-\alpha$, rather than an equal-tailed interval or a possibly disconnected high-probability set. We parameterize this single-interval oracle by a lower-tail allocation, which determines how the nominal miscoverage $\alpha$ is split between the two endpoints, and propose tail-allocation conformalized quantile regression (TA-CQR). TA-CQR estimates this allocation by searching over quantile-defined cores and then applies nonnegative additive split-conformal calibration, retaining exact finite-sample marginal coverage under exchangeability. The main contribution is theoretical. We characterize the oracle geometry, including its highest-density interpretation under unimodality and the positive connectedness cost induced by disconnected highest-density sets. We prove local recovery of the selected allocation and core, establish that calibration radii are asymptotically negligible under endpoint-density conditions, and give a finite-sample calibrated length oracle inequality with explicit grid, endpoint-quantile estimation, and calibration-sampling terms. Simulations and real-data examples report coverage and length jointly.
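
One way to read the lower-tail allocation described above, in notation assumed here rather than taken from the paper: let $q_\tau(x)$ denote the conditional $\tau$-quantile of $Y$ given $X = x$, and let $\beta \in [0, \alpha]$ be the miscoverage mass placed in the lower tail, leaving $\alpha - \beta$ for the upper tail. For a continuous conditional distribution, each candidate interval

$$I_\beta(x) = \bigl[\, q_\beta(x),\; q_{\beta + 1 - \alpha}(x) \,\bigr]$$

has conditional mass $1 - \alpha$, and the shortest one corresponds to

$$\beta^\star(x) \in \arg\min_{\beta \in [0, \alpha]} \bigl\{ q_{\beta + 1 - \alpha}(x) - q_\beta(x) \bigr\}.$$

The equal-tailed interval is the special case $\beta = \alpha/2$. Whether the paper's allocation varies with $x$ or is chosen globally is not stated in the abstract, so the dependence on $x$ shown here is an assumption.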
