Machine Learning
Papers on learning algorithms, neural networks, deep learning, and optimization.
cs.LG · 1353 papers
Predicting Decisions of AI Agents from Limited Interaction through Text-Tabular Modeling
A new text-tabular model, using an "LLM-as-Observer," accurately predicts unfamiliar AI agent decisions in negotiation games from limited interactions.
Model-based Bootstrap of Controlled Markov Chains
This paper proposes a model-based bootstrap for controlled Markov chains in offline RL, yielding consistent estimators and valid confidence intervals.
OGLS-SD: On-Policy Self-Distillation with Outcome-Guided Logit Steering for LLM Reasoning
OGLS-SD enhances LLM reasoning by using outcome-guided logit steering to correct teacher-student mismatches in on-policy self-distillation.
Detecting overfitting in Neural Networks during long-horizon grokking using Random Matrix Theory
A new Random Matrix Theory method detects overfitting in neural networks, even in large LLMs, by identifying "Correlation Traps" in weight matrices.
Trajectory-Agnostic Asteroid Detection in TESS with Deep Learning
This paper introduces a deep learning W-Net method for trajectory-agnostic asteroid detection in TESS data, robust to varying speeds and directions.
SEMIR: Semantic Minor-Induced Representation Learning on Graphs for Visual Segmentation
SEMIR is a graph-based representation learning framework for visual segmentation that efficiently handles small, sparse structures by decoupling inference from the image grid.
Scalable Token-Level Hallucination Detection in Large Language Models
TokenHD is a scalable pipeline for training token-level hallucination detectors in LLMs, outperforming larger models in detecting reasoning errors.
Fill the GAP: A Granular Alignment Paradigm for Visual Reasoning in Multimodal Large Language Models
GAP is a granular alignment paradigm that stabilizes visual latent reasoning in MLLMs by addressing feature-space mismatches, improving performance.
Attacks and Mitigations for Distributed Governance of Agentic AI under Byzantine Adversaries
This paper analyzes attacks on agentic AI governance from compromised centralized providers and proposes mitigations based on Byzantine resilience, monitoring, and auditing.
Multi-Variable Conformal Prediction: Optimizing Prediction Sets without Data Splitting
Multi-Variable Conformal Prediction (MCP) optimizes prediction sets by extending conformal prediction to vector-valued scores, eliminating data splitting.
EHR-RAGp: Retrieval-Augmented Prototype-Guided Foundation Model for Electronic Health Records
EHR-RAGp is a retrieval-augmented foundation model for EHRs, dynamically integrating relevant patient history via a prototype-guided module for better clinical predictions.
From Model Uncertainty to Human Attention: Localization-Aware Visual Cues for Scalable Annotation Review
This paper introduces visual cues for spatial uncertainty in AI-assisted annotation, improving label quality and speed by guiding human attention.
Reconstruction of Personally Identifiable Information from Supervised Finetuned Models
This paper reveals that PII can be reconstructed from supervised finetuned LLMs, proposing COVA to enhance reconstruction under prefix attacks.
TMRL: Diffusion Timestep-Modulated Pretraining Enables Exploration for Efficient Policy Finetuning
TMRL introduces diffusion timestep-modulated pretraining to enable efficient exploration and finetuning of robot policies, improving sample efficiency.
Optimal Policy Learning under Budget and Coverage Constraints
This paper characterizes optimal policy learning under budget and coverage constraints, showing that the problem has a knapsack structure and presenting near-optimal algorithms.
No More, No Less: Task Alignment in Terminal Agents
A new benchmark, TAB, reveals terminal agents struggle with selectively following relevant instructions while ignoring distractors, highlighting a gap in task alignment.
TriBand-BEV: Real-Time LiDAR-Only 3D Pedestrian Detection via Height-Aware BEV and High-Resolution Feature Fusion
TriBand-BEV introduces a real-time LiDAR-only 3D pedestrian detection method using a height-aware BEV encoding, outperforming prior methods on KITTI.
Self-Supervised Laplace Approximation for Bayesian Uncertainty Quantification
This paper introduces Self-Supervised Laplace Approximation (SSLA), which quantifies Bayesian predictive uncertainty by refitting on self-predicted data, outperforming classical methods.
PrivacySIM: Evaluating LLM Simulation of User Privacy Behavior
PrivacySIM evaluates LLMs' ability to simulate individual privacy decisions, finding persona conditioning improves accuracy but models still struggle.
Learning What Matters: Adaptive Information-Theoretic Objectives for Robot Exploration
QOED improves robot exploration by adaptively identifying and prioritizing observable parameter directions, suppressing nuisance effects for better learning.