Statistical Machine Learning
Statistical approaches to machine learning, Bayesian methods, and theoretical foundations.
stat.ML · 377 papers

What is Learnable in Valiant's Theory of the Learnable?
This paper characterizes learnability in Valiant's original model, showing membership queries expand learnable classes and providing a new algorithm for halfspaces.
Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Transformation
Pion is a novel spectrum-preserving optimizer for LLMs that uses orthogonal transformations to maintain singular values throughout training.
A proximal gradient algorithm for composite log-concave sampling
A new proximal gradient algorithm efficiently samples from composite log-concave distributions, matching state-of-the-art for specific cases and extending to broader settings.
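For context on the algorithmic family this summary names, here is a minimal proximal Langevin sketch for a composite target π ∝ exp(−f − g), with a smooth part f handled by a gradient step and a nonsmooth part g handled by its proximal map. The toy target, step size, and function names are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * |v| (soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_langevin(grad_f, prox_g, x0, step, n_steps, rng):
    """Generic proximal Langevin iteration targeting pi ∝ exp(-f - g):
    a gradient step on the smooth part f, Gaussian noise, then the
    proximal map of the nonsmooth part g at the current step size."""
    x = x0.copy()
    out = np.empty((n_steps,) + x.shape)
    for k in range(n_steps):
        noise = rng.normal(size=x.shape)
        x = prox_g(x - step * grad_f(x) + np.sqrt(2.0 * step) * noise, step)
        out[k] = x
    return out

# Toy composite target: smooth part f(x) = x^2 / 2, nonsmooth part g(x) = 0.1 |x|.
rng = np.random.default_rng(1)
samples = prox_langevin(
    grad_f=lambda x: x,
    prox_g=lambda v, s: soft_threshold(v, 0.1 * s),
    x0=np.zeros(1), step=0.1, n_steps=5000, rng=rng,
)
```

The sample mean hovers near zero, as the symmetric toy target suggests; the paper's contribution concerns efficiency guarantees for this kind of scheme, which the sketch does not capture.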
Model-based Bootstrap of Controlled Markov Chains
This paper proposes a model-based bootstrap for controlled Markov chains in offline RL, yielding consistent estimators and valid confidence intervals.
Multi-Variable Conformal Prediction: Optimizing Prediction Sets without Data Splitting
Multi-Variable Conformal Prediction (MCP) optimizes prediction sets by extending conformal prediction to vector-valued scores, eliminating data splitting.
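For readers unfamiliar with the baseline that MCP removes the splitting from, here is a minimal sketch of standard split conformal prediction, where a held-out calibration set determines the interval width. The model, data, and level are toy assumptions, not the paper's method.

```python
import numpy as np

def split_conformal_interval(x_cal, y_cal, x_test, predict, alpha=0.1):
    """Standard split conformal prediction: calibrate absolute residuals
    on a held-out set, then widen predictions by their corrected quantile."""
    residuals = np.abs(y_cal - predict(x_cal))  # nonconformity scores
    n = len(residuals)
    # Finite-sample-corrected (1 - alpha) quantile of the calibration scores.
    q = np.quantile(residuals, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")
    pred = predict(x_test)
    return pred - q, pred + q

# Toy usage with a hypothetical identity "model" on noisy data.
rng = np.random.default_rng(0)
x_cal = rng.normal(size=200)
y_cal = x_cal + rng.normal(scale=0.1, size=200)
lo, hi = split_conformal_interval(x_cal, y_cal, np.array([0.5]), lambda x: x)
```

The resulting interval covers the true value with probability at least 1 − α; the cost of the held-out calibration set is exactly what the paper's split-free construction targets.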
Optimal Policy Learning under Budget and Coverage Constraints
This paper characterizes optimal policy learning under budget and coverage constraints, showing a knapsack structure and near-optimal algorithms.
Self-Supervised Laplace Approximation for Bayesian Uncertainty Quantification
Introduces Self-Supervised Laplace Approximation (SSLA) to quantify predictive uncertainty in Bayesian models by refitting on self-predicted data, outperforming classical methods.
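For context, the classical Laplace approximation that SSLA builds on fits a Gaussian at the posterior mode, with variance given by the inverse Hessian of the negative log posterior. Below is a minimal 1-D sketch using finite differences; the toy posterior and optimizer settings are illustrative assumptions, not the paper's procedure.

```python
def laplace_approximation(neg_log_post, theta0, eps=1e-4, steps=500, lr=0.1):
    """Classical 1-D Laplace approximation: locate the MAP by gradient
    descent (finite-difference gradients), then set the Gaussian variance
    to the inverse Hessian of the negative log posterior at the MAP."""
    theta = theta0
    for _ in range(steps):
        grad = (neg_log_post(theta + eps) - neg_log_post(theta - eps)) / (2 * eps)
        theta -= lr * grad
    hess = (neg_log_post(theta + eps) - 2 * neg_log_post(theta)
            + neg_log_post(theta - eps)) / eps**2
    return theta, 1.0 / hess  # MAP mean, Gaussian variance

# Sanity check on a Gaussian posterior N(2, 0.5^2), where Laplace is exact.
nlp = lambda t: 0.5 * (t - 2.0) ** 2 / 0.25
mean, var = laplace_approximation(nlp, theta0=0.0)
```

On this Gaussian target the approximation recovers the mean 2 and variance 0.25 exactly, which is the degenerate case; the interesting behavior, and SSLA's improvement, concerns non-Gaussian posteriors.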
Approximation Theory of Laplacian-Based Neural Operators for Reaction-Diffusion System
This paper shows Laplacian-based neural operators efficiently approximate reaction-diffusion systems with polynomial complexity.
Random-Set Graph Neural Networks
This paper introduces Random-Set Graph Neural Networks (RS-GNNs) to model node-level epistemic uncertainty using belief functions for improved predictions.
QDSB: Quantized Diffusion Schrödinger Bridges
QDSB introduces quantized diffusion Schrödinger bridges to efficiently learn generative models from unpaired data, significantly reducing training time.
LOFT: Low-Rank Orthogonal Fine-Tuning via Task-Aware Support Selection
LOFT is a low-rank orthogonal fine-tuning framework that separates adaptation subspace and transformation, improving PEFT efficiency via task-aware support selection.
One-Step Generative Modeling via Wasserstein Gradient Flows
W-Flow introduces a novel one-step generative model using Wasserstein gradient flows, achieving state-of-the-art image generation 100x faster than diffusion models.
Learning U-Statistics with Active Inference
An active inference framework for U-statistics improves estimation efficiency by selectively querying informative labels under budget constraints.
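As background on the object being estimated, a U-statistic averages a symmetric kernel over all subsets of observations; the sketch below shows the order-2 case, with the unbiased sample variance as the standard example. The data and kernel are toy choices, and the sketch computes a plain (fully labeled) U-statistic, not the paper's active, budget-constrained estimator.

```python
from itertools import combinations

def u_statistic(data, kernel):
    """Order-2 U-statistic: average the symmetric kernel over all
    unordered pairs of distinct observations."""
    pairs = list(combinations(data, 2))
    return sum(kernel(x, y) for x, y in pairs) / len(pairs)

# The kernel h(x, y) = (x - y)^2 / 2 yields the unbiased sample variance.
data = [1.0, 2.0, 3.0, 4.0]
var_u = u_statistic(data, lambda x, y: 0.5 * (x - y) ** 2)
```

The pairwise average equals the usual n − 1 denominator sample variance (5/3 here); the quadratic number of kernel evaluations, each potentially requiring labels, is what motivates querying only the informative ones.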
Exact Stiefel Optimization for Probabilistic PLS: Closed-Form Updates, Error Bounds, and Calibrated Uncertainty
Introduces an end-to-end framework for Probabilistic PLS using exact Stiefel optimization, offering calibrated uncertainty and improved accuracy.
A Composite Activation Function for Learning Stable Binary Representations
Introduces HTAF, a smooth composite activation function enabling stable gradient-based training of neural networks with binary representations.
Variational Inference for Lévy Process-Driven SDEs via Neural Tilting
This paper introduces a neural exponential tilting framework for variational inference in Lévy-driven SDEs, addressing challenges in modeling extreme events.
Reasoning Is Not Free: Robust Adaptive Cost-Efficient Routing for LLM-as-a-Judge
RACER dynamically routes between reasoning and non-reasoning LLM judges to optimize accuracy and cost, especially under distribution shift.
Factual recall in linear associative memories: sharp asymptotics and mechanistic insights
This paper precisely characterizes the factual storage capacity of linear associative memories using statistical physics, offering insights into optimal learning.
When Are Trade-Off Functions Testable from Finite Samples?
This paper identifies conditions under which trade-off functions for binary testing are testable from finite samples, crucial for statistical inference.
What should post-training optimize? A test-time scaling law perspective
This paper proposes Tail-Extrapolated estimators (TEA, Prefix-TEA) to optimize LLM post-training for best-of-N deployment, even with limited training rollouts.