ArXiv TLDR

Hao Wang

11 papers · Latest:

Software Engineering

StepCodeReasoner: Aligning Code Reasoning with Stepwise Execution Traces via Reinforcement Learning

StepCodeReasoner uses RL to align code reasoning with stepwise execution traces, achieving SOTA performance by supervising intermediate states.

2605.11922
Computer Vision

BabelDOC: Better Layout-Preserving PDF Translation via Intermediate Representation

BabelDOC is an IR-based framework that accurately translates PDFs while preserving their original visual layout and improving terminology consistency.

2605.10845
Robotics

VEGA: Visual Encoder Grounding Alignment for Spatially-Aware Vision-Language-Action Models

VEGA enhances VLA models' spatial reasoning by directly aligning their visual encoder outputs with 3D-aware features, improving robotic manipulation.

2605.10485
Machine Learning

Understanding DNNs in Feature Interaction Models: A Dimensional Collapse Perspective

This paper shows DNNs in feature interaction models mitigate dimensional collapse, improving representation robustness and clarifying their role.

2604.26489
Machine Learning

SOLAR-RL: Semi-Online Long-horizon Assignment Reinforcement Learning

SOLAR-RL bridges offline and online RL for MLLM GUI agents, using simulated online feedback to boost long-horizon task completion and robustness.

2604.22558
Computer Vision

AIFIND: Artifact-Aware Interpreting Fine-Grained Alignment for Incremental Face Forgery Detection

AIFIND introduces artifact-aware semantic anchors and attention to stabilize incremental face forgery detection, preventing feature drift and catastrophic forgetting.

2604.16207
Information Retrieval

Rethinking the Necessity of Adaptive Retrieval-Augmented Generation through the Lens of Adaptive Listwise Ranking

AdaRankLLM rethinks adaptive RAG, proposing a framework that optimizes retrieval for both weak and strong LLMs, significantly reducing context overhead.

2604.15621
Natural Language Processing

Compressing Sequences in the Latent Embedding Space: $K$-Token Merging for Large Language Models

K-Token Merging compresses LLM inputs in the latent embedding space, reducing sequence length by up to 75% with minimal performance loss.

2604.15153

Stochastic Trust-Region Methods for Over-parameterized Models

This paper introduces a stochastic trust-region framework for over-parameterized models, eliminating manual step-size tuning and handling constrained problems.

2604.14017
Computer Vision

Visual Preference Optimization with Rubric Rewards

rDPO introduces rubric-based preference optimization for visual tasks, using instance-specific checklists to generate high-quality feedback.

2604.13029
Robotics

XRZero-G0: Pushing the Frontier of Dexterous Robotic Manipulation with Interfaces, Quality and Ratios

XRZero-G0 is a hardware-software system that enables scalable, high-quality robot-free data collection for dexterous manipulation, reducing costs significantly.

2604.13001
