Hao Wang

11 papers · Latest: May 12, 2026

StepCodeReasoner: Aligning Code Reasoning with Stepwise Execution Traces via Reinforcement Learning

StepCodeReasoner uses RL to align code reasoning with stepwise execution traces, achieving SOTA performance by supervising intermediate states.

2605.11922 · May 12, 2026

Computer Vision

BabelDOC: Better Layout-Preserving PDF Translation via Intermediate Representation

BabelDOC is an IR-based framework that accurately translates PDFs while preserving their original visual layout and improving terminology consistency.

2605.10845 · May 11, 2026

Robotics

VEGA: Visual Encoder Grounding Alignment for Spatially-Aware Vision-Language-Action Models

VEGA enhances VLA models' spatial reasoning by directly aligning their visual encoder outputs with 3D-aware features, improving robotic manipulation.

2605.10485 · May 11, 2026

Machine Learning

Understanding DNNs in Feature Interaction Models: A Dimensional Collapse Perspective

This paper shows DNNs in feature interaction models mitigate dimensional collapse, improving representation robustness and clarifying their role.

2604.26489 · Apr 29, 2026

Machine Learning

SOLAR-RL: Semi-Online Long-horizon Assignment Reinforcement Learning

SOLAR-RL bridges offline and online RL for MLLM GUI agents, using simulated online feedback to boost long-horizon task completion and robustness.

2604.22558 · Apr 24, 2026

Computer Vision

AIFIND: Artifact-Aware Interpreting Fine-Grained Alignment for Incremental Face Forgery Detection

AIFIND introduces artifact-aware semantic anchors and attention to stabilize incremental face forgery detection, preventing feature drift and catastrophic forgetting.

2604.16207 · Apr 17, 2026

Information Retrieval

Rethinking the Necessity of Adaptive Retrieval-Augmented Generation through the Lens of Adaptive Listwise Ranking

AdaRankLLM rethinks adaptive RAG, proposing a framework that optimizes retrieval for both weak and strong LLMs, significantly reducing context overhead.

2604.15621 · Apr 17, 2026

Natural Language Processing

Compressing Sequences in the Latent Embedding Space: $K$-Token Merging for Large Language Models

K-Token Merging compresses LLM inputs in the latent embedding space, reducing sequence length by up to 75% with minimal performance loss.

2604.15153 · Apr 16, 2026

Stochastic Trust-Region Methods for Over-parameterized Models

This paper introduces a stochastic trust-region framework for over-parameterized models, eliminating manual step-size tuning and handling constrained problems.

2604.14017 · Apr 15, 2026

Computer Vision

Visual Preference Optimization with Rubric Rewards

rDPO introduces rubric-based preference optimization for visual tasks, using instance-specific checklists to generate high-quality feedback.

2604.13029 · Apr 14, 2026

Robotics

XRZero-G0: Pushing the Frontier of Dexterous Robotic Manipulation with Interfaces, Quality and Ratios

XRZero-G0 is a hardware-software system that enables scalable, high-quality robot-free data collection for dexterous manipulation, reducing costs significantly.

2604.13001 · Apr 14, 2026
