Machine Learning
Papers on learning algorithms, neural networks, deep learning, and optimization.
cs.LG · 1353 papers
Quantifying Concentration Phenomena of Mean-Field Transformers in the Low-Temperature Regime
This paper quantifies how token distributions in mean-field transformers rapidly concentrate in the low-temperature regime, remaining metastable.
Dynamic Skill Lifecycle Management for Agentic Reinforcement Learning
SLIM dynamically manages external skills for LLM agents in RL, optimizing their active skill set for improved task performance.
Equivariant Reinforcement Learning for Clifford Quantum Circuit Synthesis
Introduces an equivariant RL agent for efficient, scalable Clifford quantum circuit synthesis across varying qubit counts.
DataMaster: Towards Autonomous Data Engineering for Machine Learning
DataMaster automates data engineering for ML, using a novel agent framework with tree search, shared data, and memory to boost model performance.
Beyond Red-Teaming: Formal Guarantees of LLM Guardrail Classifiers
This paper introduces a novel method to formally verify LLM guardrail classifiers by analyzing their pre-activation space, revealing hidden safety vulnerabilities.
RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards
RubricEM is a meta-RL framework that uses rubrics to guide policy decomposition and reflection for training research agents without verifiable rewards.
V4FinBench: Benchmarking Tabular Foundation Models, LLMs, and Standard Methods on Corporate Bankruptcy Prediction
V4FinBench introduces a new large-scale dataset for corporate bankruptcy prediction, benchmarking tabular foundation models and LLMs against standard methods.
Unmasking On-Policy Distillation: Where It Helps, Where It Hurts, and Why
This paper introduces a diagnostic framework to analyze on-policy distillation, revealing that it helps more on incorrect rollouts and that the optimal context varies.
LoKA: Low-precision Kernel Applications for Recommendation Models At Scale
LoKA introduces a system-model co-design framework to make FP8 low-precision arithmetic practical and efficient for large recommendation models.
AssayBench: An Assay-Level Virtual Cell Benchmark for LLMs and Agents
AssayBench is a new benchmark for phenotypic screen prediction in virtual cell models, evaluating LLMs and agents on diverse cellular phenotypes.
Compute Where it Counts: Self Optimizing Language Models
Self-Optimizing Language Models (SOL) dynamically allocate computation per token, improving LLM inference efficiency and quality over static methods.
BEACON: A Multimodal Dataset for Learning Behavioral Fingerprints from Gameplay Data
BEACON is a large, multimodal dataset from competitive Valorant gameplay for continuous authentication and behavioral fingerprinting research.
Masked Generative Transformer Is What You Need for Image Editing
EditMGT, a novel Masked Generative Transformer, offers faster, more precise image editing by localizing changes, outperforming diffusion models.
The Generalized Turing Test: A Foundation for Comparing Intelligence
The Generalized Turing Test (GTT) offers a formal, dataset-agnostic framework to compare AI agent intelligence via indistinguishability.
Clin-JEPA: A Multi-Phase Co-Training Framework for Joint-Embedding Predictive Pretraining on EHR Patient Trajectories
Clin-JEPA is a multi-phase co-training framework for JEPA pretraining on EHR patient trajectories, enabling accurate forecasting and risk prediction.
Transcoda: End-to-End Zero-Shot Optical Music Recognition via Data-Centric Synthetic Training
Transcoda is a zero-shot OMR system using advanced synthetic data, normalized encodings, and grammar-based decoding to achieve state-of-the-art performance.
Predicting 3D structure by latent posterior sampling
This paper introduces a method for 3D structure prediction by combining NeRFs with diffusion models for probabilistic latent posterior sampling.
SLIM: Sparse Latent Steering for Interpretable and Property-Directed LLM-Based Molecular Editing
SLIM enhances LLM molecular editing by using sparse latent steering to precisely control properties and improve success rates.
LLMs for Secure Hardware Design and Related Problems: Opportunities and Challenges
A review of LLMs in hardware design, covering their capabilities, introduced vulnerabilities, and essential security countermeasures.
The Last Word Often Wins: A Format Confound in Chain-of-Thought Corruption Studies
Chain-of-thought corruption studies are confounded by explicit answer formats; models often follow the final answer text, not the reasoning.