Statistical Machine Learning
Statistical approaches to machine learning, Bayesian methods, and theoretical foundations.
stat.ML · 377 papers

What is Learnable in Valiant's Theory of the Learnable?
This paper characterizes learnability in Valiant's original model, showing membership queries expand learnable classes and providing a new algorithm for halfspaces.
Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Transformation
Pion is a novel spectrum-preserving optimizer for LLMs that uses orthogonal transformations to maintain singular values throughout training.
A proximal gradient algorithm for composite log-concave sampling
A new proximal gradient algorithm efficiently samples from composite log-concave distributions, matching state-of-the-art for specific cases and extending to broader settings.
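For context on the algorithmic family this summary names, here is a minimal proximal Langevin sketch for a composite target π ∝ exp(−f − g), with a smooth part f handled by a gradient step and a nonsmooth part g handled by its proximal map. The toy target, step size, and function names are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * |v| (soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_langevin(grad_f, prox_g, x0, step, n_steps, rng):
    """Generic proximal Langevin iteration targeting pi ∝ exp(-f - g):
    a gradient step on the smooth part f, Gaussian noise, then the
    proximal map of the nonsmooth part g at the current step size."""
    x = x0.copy()
    out = np.empty((n_steps,) + x.shape)
    for k in range(n_steps):
        noise = rng.normal(size=x.shape)
        x = prox_g(x - step * grad_f(x) + np.sqrt(2.0 * step) * noise, step)
        out[k] = x
    return out

# Toy composite target: smooth part f(x) = x^2 / 2, nonsmooth part g(x) = 0.1 |x|.
rng = np.random.default_rng(1)
samples = prox_langevin(
    grad_f=lambda x: x,
    prox_g=lambda v, s: soft_threshold(v, 0.1 * s),
    x0=np.zeros(1), step=0.1, n_steps=5000, rng=rng,
)
```

The sample mean hovers near zero, as the symmetric toy target suggests; the paper's contribution concerns efficiency guarantees for this kind of scheme, which the sketch does not capture.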
Model-based Bootstrap of Controlled Markov Chains
This paper proposes a model-based bootstrap for controlled Markov chains in offline RL, yielding consistent estimators and valid confidence intervals.
Multi-Variable Conformal Prediction: Optimizing Prediction Sets without Data Splitting
Multi-Variable Conformal Prediction (MCP) optimizes prediction sets by extending conformal prediction to vector-valued scores, eliminating data splitting.
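For readers unfamiliar with the baseline that MCP removes the splitting from, here is a minimal sketch of standard split conformal prediction, where a held-out calibration set determines the interval width. The model, data, and level are toy assumptions, not the paper's method.

```python
import numpy as np

def split_conformal_interval(x_cal, y_cal, x_test, predict, alpha=0.1):
    """Standard split conformal prediction: calibrate absolute residuals
    on a held-out set, then widen predictions by their corrected quantile."""
    residuals = np.abs(y_cal - predict(x_cal))  # nonconformity scores
    n = len(residuals)
    # Finite-sample-corrected (1 - alpha) quantile of the calibration scores.
    q = np.quantile(residuals, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")
    pred = predict(x_test)
    return pred - q, pred + q

# Toy usage with a hypothetical identity "model" on noisy data.
rng = np.random.default_rng(0)
x_cal = rng.normal(size=200)
y_cal = x_cal + rng.normal(scale=0.1, size=200)
lo, hi = split_conformal_interval(x_cal, y_cal, np.array([0.5]), lambda x: x)
```

The resulting interval covers the true value with probability at least 1 − α; the cost of the held-out calibration set is exactly what the paper's split-free construction targets.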
Optimal Policy Learning under Budget and Coverage Constraints
This paper characterizes optimal policy learning under budget and coverage constraints, showing a knapsack structure and near-optimal algorithms.
Self-Supervised Laplace Approximation for Bayesian Uncertainty Quantification
Introduces Self-Supervised Laplace Approximation (SSLA) to quantify predictive uncertainty in Bayesian models by refitting on self-predicted data, outperforming classical methods.
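For context, the classical Laplace approximation that SSLA builds on fits a Gaussian at the posterior mode, with variance given by the inverse Hessian of the negative log posterior. Below is a minimal 1-D sketch using finite differences; the toy posterior and optimizer settings are illustrative assumptions, not the paper's procedure.

```python
def laplace_approximation(neg_log_post, theta0, eps=1e-4, steps=500, lr=0.1):
    """Classical 1-D Laplace approximation: locate the MAP by gradient
    descent (finite-difference gradients), then set the Gaussian variance
    to the inverse Hessian of the negative log posterior at the MAP."""
    theta = theta0
    for _ in range(steps):
        grad = (neg_log_post(theta + eps) - neg_log_post(theta - eps)) / (2 * eps)
        theta -= lr * grad
    hess = (neg_log_post(theta + eps) - 2 * neg_log_post(theta)
            + neg_log_post(theta - eps)) / eps**2
    return theta, 1.0 / hess  # MAP mean, Gaussian variance

# Sanity check on a Gaussian posterior N(2, 0.5^2), where Laplace is exact.
nlp = lambda t: 0.5 * (t - 2.0) ** 2 / 0.25
mean, var = laplace_approximation(nlp, theta0=0.0)
```

On this Gaussian target the approximation recovers the mean 2 and variance 0.25 exactly, which is the degenerate case; the interesting behavior, and SSLA's improvement, concerns non-Gaussian posteriors.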
Approximation Theory of Laplacian-Based Neural Operators for Reaction-Diffusion System
This paper shows Laplacian-based neural operators efficiently approximate reaction-diffusion systems with polynomial complexity.
Random-Set Graph Neural Networks
This paper introduces Random-Set Graph Neural Networks (RS-GNNs) to model node-level epistemic uncertainty using belief functions for improved predictions.
QDSB: Quantized Diffusion Schrödinger Bridges
QDSB introduces quantized diffusion Schrödinger bridges to efficiently learn generative models from unpaired data, significantly reducing training time.
LOFT: Low-Rank Orthogonal Fine-Tuning via Task-Aware Support Selection
LOFT is a low-rank orthogonal fine-tuning framework that separates adaptation subspace and transformation, improving PEFT efficiency via task-aware support selection.
One-Step Generative Modeling via Wasserstein Gradient Flows
W-Flow introduces a novel one-step generative model using Wasserstein gradient flows, achieving state-of-the-art image generation 100x faster than diffusion models.
Learning U-Statistics with Active Inference
An active inference framework for U-statistics improves estimation efficiency by selectively querying informative labels under budget constraints.
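As background on the object being estimated, a U-statistic averages a symmetric kernel over all subsets of observations; the sketch below shows the order-2 case, with the unbiased sample variance as the standard example. The data and kernel are toy choices, and the sketch computes a plain (fully labeled) U-statistic, not the paper's active, budget-constrained estimator.

```python
from itertools import combinations

def u_statistic(data, kernel):
    """Order-2 U-statistic: average the symmetric kernel over all
    unordered pairs of distinct observations."""
    pairs = list(combinations(data, 2))
    return sum(kernel(x, y) for x, y in pairs) / len(pairs)

# The kernel h(x, y) = (x - y)^2 / 2 yields the unbiased sample variance.
data = [1.0, 2.0, 3.0, 4.0]
var_u = u_statistic(data, lambda x, y: 0.5 * (x - y) ** 2)
```

The pairwise average equals the usual n − 1 denominator sample variance (5/3 here); the quadratic number of kernel evaluations, each potentially requiring labels, is what motivates querying only the informative ones.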
Exact Stiefel Optimization for Probabilistic PLS: Closed-Form Updates, Error Bounds, and Calibrated Uncertainty
Introduces an end-to-end framework for Probabilistic PLS using exact Stiefel optimization, offering calibrated uncertainty and improved accuracy.
A Composite Activation Function for Learning Stable Binary Representations
Introduces HTAF, a smooth composite activation function enabling stable gradient-based training of neural networks with binary representations.
Variational Inference for Lévy Process-Driven SDEs via Neural Tilting
This paper introduces a neural exponential tilting framework for variational inference in Lévy-driven SDEs, addressing challenges in modeling extreme events.
Reasoning Is Not Free: Robust Adaptive Cost-Efficient Routing for LLM-as-a-Judge
RACER dynamically routes between reasoning and non-reasoning LLM judges to optimize accuracy and cost, especially under distribution shift.
Factual recall in linear associative memories: sharp asymptotics and mechanistic insights
This paper precisely characterizes the factual storage capacity of linear associative memories using statistical physics, offering insights into optimal learning.
When Are Trade-Off Functions Testable from Finite Samples?
This paper identifies conditions under which trade-off functions for binary testing are testable from finite samples, crucial for statistical inference.
What should post-training optimize? A test-time scaling law perspective
This paper proposes Tail-Extrapolated estimators (TEA, Prefix-TEA) to optimize LLM post-training for best-of-N deployment, even with limited training rollouts.