ArXiv TLDR

Training-Free Probabilistic Time-Series Forecasting with Conformal Seasonal Pools

arXiv: 2605.03789

Valery Manokhin

stat.ML · cs.LG

TLDR

Conformal Seasonal Pools (CSP), a new training-free probabilistic time-series forecaster, significantly outperforms DeepNPTS in accuracy, calibration, and speed, qualities that matter most in safety- and decision-critical applications.

Key contributions

  • Proposes Conformal Seasonal Pools (CSP), a training-free probabilistic time-series forecaster.
  • Outperforms DeepNPTS on the six datasets where DeepNPTS was originally evaluated, across multiple metrics (CRPS, quantile loss, coverage), while running over 500x faster on CPU.
  • Achieves far better empirical coverage of nominal 95% intervals (0.89 vs 0.66 for DeepNPTS), addressing critical calibration failures.
  • Requires no learned parameters or training, simplifying deployment and increasing robustness.
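The metrics in these bullets (quantile loss and empirical interval coverage) are straightforward to compute from forecast sample paths. A minimal sketch, with synthetic data and variable names of my own choosing (not from the paper):

```python
import numpy as np

def quantile_loss(y, q_pred, q):
    """Pinball (quantile) loss at level q, averaged over observations."""
    diff = y - q_pred
    return np.mean(np.maximum(q * diff, (q - 1) * diff))

def interval_coverage(y, lo, hi):
    """Fraction of observations falling inside [lo, hi]."""
    return np.mean((y >= lo) & (y <= hi))

# Synthetic example: 1000 forecast sample paths over 200 time points,
# drawn from the same distribution as the truth, so a nominal 95%
# interval should empirically cover roughly 95% of observations.
rng = np.random.default_rng(0)
y_true = rng.normal(size=200)
samples = rng.normal(size=(1000, 200))

lo = np.quantile(samples, 0.025, axis=0)   # per-point 2.5% quantile
hi = np.quantile(samples, 0.975, axis=0)   # per-point 97.5% quantile
cov = interval_coverage(y_true, lo, hi)
ql = quantile_loss(y_true, np.quantile(samples, 0.5, axis=0), 0.5)
```

Here a well-calibrated forecaster yields `cov` near 0.95; the 0.66 figure reported for DeepNPTS corresponds to `cov` falling far below nominal.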

Why it matters

This paper introduces a robust, training-free time-series forecasting method that addresses critical calibration failures seen in existing models like DeepNPTS. Its superior accuracy and speed are vital for safety- and decision-critical applications such as healthcare, finance, and energy operations, where reliable prediction intervals are paramount.

Original Abstract

We propose Conformal Seasonal Pools (CSP), a training-free probabilistic time-series forecaster that mixes same-season empirical draws with signed residual draws around a seasonal naive forecast. In an audited rolling-origin benchmark on the six time-series datasets where DeepNPTS was originally evaluated (electricity, exchange_rate, solar_energy, taxi, traffic, wikipedia), CSP-Adaptive significantly outperforms DeepNPTS on every metric we report -- CRPS (per-window paired Wilcoxon $p \approx 4 \times 10^{-10}$), normalized mean quantile loss ($p \approx 7 \times 10^{-10}$), and empirical 95% coverage ($p \approx 8 \times 10^{-45}$, mean 0.89 vs 0.66) -- while running over 500x faster on CPU. Coverage is the most decision-critical of these: a 0.95 nominal interval that contains the truth in only ~66% of cases fails the basic calibration desideratum and would not survive deployment in safety- or decision-critical settings. The failure mode is also more severe than aggregate coverage suggests: in the worst 10% of windows, DeepNPTS's prediction interval covers none of the H forecast horizons -- the entire multi-step trajectory misses the truth at every step simultaneously. This poses serious risk in safety- and decision-critical applications such as healthcare, finance, energy operations, and autonomous systems, where prediction intervals that systematically miss the truth across the entire planning horizon translate directly into misclassified patients, regulatory capital failures, grid imbalances, and safety-case violations. CSP achieves all of this with no learned parameters and no training. We argue training-free conformal samplers should be mandatory baselines when evaluating learned non-parametric forecasters.
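The abstract describes CSP as mixing same-season empirical draws with signed residual draws around a seasonal naive forecast. The paper's exact algorithm is not reproduced here; the following is a toy sketch of that general idea under my own assumptions (pool construction, mixing weight, and function signature are all illustrative):

```python
import numpy as np

def csp_sample(history, season=24, horizon=12, n_samples=500, mix=0.5, seed=0):
    """Toy CSP-style sampler (illustrative only, not the paper's method).

    For each horizon step h, the seasonal naive point forecast is the
    observation one season back. Each sample is drawn from one of two pools:
      - same-season empirical draws: past values at the same seasonal phase;
      - signed residual draws: a seasonal-naive residual, with random sign,
        added to the naive forecast.
    """
    rng = np.random.default_rng(seed)
    history = np.asarray(history, float)
    n = len(history)
    resid = history[season:] - history[:-season]  # seasonal-naive residuals
    out = np.empty((n_samples, horizon))
    for h in range(horizon):
        phase = (n + h) % season
        naive = history[n - season + (h % season)]  # seasonal naive forecast
        same_season = history[phase::season]        # values at the same phase
        for s in range(n_samples):
            if rng.random() < mix:
                out[s, h] = rng.choice(same_season)
            else:
                r = rng.choice(resid) * rng.choice([-1.0, 1.0])  # signed draw
                out[s, h] = naive + r
    return out

# Usage: sample 500 paths for a daily-seasonal series, then take
# per-step quantiles of `paths` to form prediction intervals.
series = np.sin(np.arange(240) * 2 * np.pi / 24)
paths = csp_sample(series, season=24, horizon=12)
```

Because the sampler only indexes and resamples the history, it has no learned parameters, which is what makes this family of baselines training-free.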

📬 Weekly AI Paper Digest

Get the top 10 AI/ML arXiv papers from the week — summarized, scored, and delivered to your inbox every Monday.