ArXiv TLDR

Machine Learning

Papers on learning algorithms, neural networks, deep learning, and optimization.

cs.LG · 1348 papers

EVA-Bench: A New End-to-end Framework for Evaluating Voice Agents

EVA-Bench evaluates voice agents end to end using realistic bot-to-bot audio simulations and novel composite metrics.

2605.13841 · May 13, 2026 · Tara Bogavelli, Gabrielle Gauthier Melançon, Katrina Stankiewicz +10

What is Learnable in Valiant's Theory of the Learnable?

This paper characterizes learnability in Valiant's original model, showing membership queries expand learnable classes and providing a new algorithm for halfspaces.

2605.13840 · May 13, 2026 · Steve Hanneke, Anay Mehrotra, Grigoris Velegkas +1

R-DMesh: Video-Guided 3D Animation via Rectified Dynamic Mesh Flow

R-DMesh solves pose misalignment in video-guided 3D animation using a novel VAE and rectification offset for high-fidelity 4D mesh generation.

2605.13838 · May 13, 2026 · Zijie Wu, Lixin Xu, Puhua Jiang +3

Topology-Preserving Neural Operator Learning via Hodge Decomposition

This paper introduces a topology-preserving neural operator learning method using Hodge decomposition to model physical field equations on geometric meshes.

2605.13834 · May 13, 2026 · Dongzhe Zheng, Tao Zhong, Christine Allen-Blanchette

QLAM: A Quantum Long-Attention Memory Approach to Long-Sequence Token Modeling

QLAM introduces a quantum long-attention memory, extending state-space models to efficiently capture long-range dependencies using quantum superposition.

2605.13833 · May 13, 2026 · Hoang-Quan Nguyen, Sankalp Pandey, Khoa Luu

Quantifying Sensitivity for Tree Ensembles: A symbolic and compositional approach

This paper introduces a novel symbolic and compositional method to quantify sensitivity in decision tree ensembles, efficiently identifying misclassification risks.

2605.13830 · May 13, 2026 · S. Akshay, Chaitanya Garg, Ashutosh Gupta +2

Negation Neglect: When models fail to learn negations in training

LLMs finetuned on documents that flag claims as false often learn to believe those claims are true, a phenomenon called Negation Neglect.

2605.13829 · May 13, 2026 · Harry Mayne, Lev McKinney, Jan Dubiński +3

Reducing cross-sample prediction churn in scientific machine learning

This paper introduces "cross-sample prediction churn" in scientific ML and proposes data-side methods, including "twin-bootstrap," to significantly reduce it.

2605.13826 · May 13, 2026 · Gordan Prastalo, Kevin Maik Jablonka

Harnessing Agentic Evolution

AEvo introduces a meta-editing framework that steers agentic evolution by dynamically revising the evolution process, outperforming existing methods.

2605.13821 · May 13, 2026 · Jiayi Zhang, Yongfeng Gu, Jianhao Ruan +10

Uncertainty-Driven Anomaly Detection for Psychotic Relapse Using Smartwatches: Forecasting and Multi-Task Learning Fusion

This paper introduces a smartwatch-based system for early psychotic relapse detection, combining cardiac forecasting and multi-task learning with uncertainty estimation.

2605.13816 · May 13, 2026 · Nikolaos Tsalkitzis, Panagiotis P. Filntisis, Petros Maragos +1

Provable Quantization with Randomized Hadamard Transform

This paper introduces dithered quantization with randomized Hadamard transforms, offering provable, near-optimal MSE with high efficiency.

2605.13810 · May 13, 2026 · Ying Feng, Piotr Indyk, Michael Kapralov +2

Parallel Scan Recurrent Neural Quantum States for Scalable Variational Monte Carlo

This work introduces Parallel Scan Recurrent Neural Quantum States (PSR-NQS), showing that RNNs can efficiently simulate large quantum many-body systems.

2605.13807 · May 13, 2026 · Ejaaz Merali, Mohamed Hibat-Allah, Mohammad Kohandel +2

Min-Max Optimization Requires Exponentially Many Queries

Min-max optimization for nonconvex-nonconcave functions demands exponentially many queries to find an approximate stationary point.

2605.13806 · May 13, 2026 · Martino Bernasconi, Matteo Castiglioni, Andrea Celli +1

Improving Reproducibility in Evaluation through Multi-Level Annotator Modeling

This paper introduces a multi-level bootstrapping method to improve AI evaluation reproducibility by modeling annotator behavior and analyzing data tradeoffs.

2605.13801 · May 13, 2026 · Deepak Pandita, Flip Korn, Chris Welty +1

Di-BiLPS: Denoising induced Bidirectional Latent-PDE-Solver under Sparse Observations

Di-BiLPS is a neural framework that solves PDEs efficiently under extremely sparse observations, outperforming state-of-the-art methods and enabling zero-shot super-resolution.

2605.13790 · May 13, 2026 · Zhonghao Li, Chaoyu Liu, Qian Zhang

ENSEMBITS: an alphabet of protein conformational ensembles

Ensembits is the first tokenizer for protein conformational ensembles, capturing dynamic motions and alternative states for protein language modeling.

2605.13789 · May 13, 2026 · Kaiwen Shi, Carlos Oliver

Force-Aware Neural Tangent Kernels for Scalable and Robust Active Learning of MLIPs

This paper introduces force-aware Neural Tangent Kernels and a scalable acquisition framework for robust active learning of MLIPs.

2605.13788 · May 13, 2026 · Eszter Varga-Umbrich, Zachary Weller-Davies, Paul Duckworth +3

Interpretable Machine Learning for Antepartum Prediction of Pregnancy-Associated Thrombotic Microangiopathy Using Routine Longitudinal Laboratory Data

Machine learning predicts pregnancy-associated thrombotic microangiopathy (P-TMA) antepartum using routine longitudinal lab data with high accuracy.

2605.13786 · May 13, 2026 · Chuanchuan Sun, Zhen Yu, Qin Fan +2

Attention Once Is All You Need: Efficient Streaming Inference with Stateful Transformers

This paper introduces stateful transformers for efficient streaming inference, reducing query latency to O(|q|) by moving prefill off the critical path.

2605.13784 · May 13, 2026 · Victor Norgren

MinT: Managed Infrastructure for Training and Serving Millions of LLMs

MinT is a managed infrastructure system for efficiently training and serving millions of LoRA-adapted LLMs over shared base models.

2605.13779 · May 13, 2026 · Mind Lab, Song Cao +60
Page 1 of 68
