Shanghang Zhang

5 papers · Latest: May 11, 2026

HarmoWAM: Harmonizing Generalizable and Precise Manipulation via Adaptive World Action Models

HarmoWAM unifies predictive and reactive control in robot manipulation, achieving both generalizable transit and precise interaction through adaptive expert coordination.

2605.10942 · May 11, 2026

Robotics

VEGA: Visual Encoder Grounding Alignment for Spatially-Aware Vision-Language-Action Models

VEGA enhances VLA models' spatial reasoning by directly aligning their visual encoder outputs with 3D-aware features, improving robotic manipulation.

2605.10485 · May 11, 2026

Robotics

LaST-R1: Reinforcing Action via Adaptive Physical Latent Reasoning for VLA Models

LaST-R1 enhances VLA models with adaptive latent physical reasoning and a new RL algorithm, LAPO, achieving near-perfect robotic manipulation.

2604.28192 · Apr 30, 2026

Robotics

Hi-WM: Human-in-the-World-Model for Scalable Robot Post-Training

Hi-WM enables scalable robot post-training by allowing human intervention directly within a learned world model, reducing real-world execution needs.

2604.21741 · Apr 23, 2026

Robotics

Mask World Model: Predicting What Matters for Robust Robot Policy Learning

Mask World Model predicts semantic masks instead of pixels for robust robot policy learning, outperforming RGB-based world models.

2604.19683 · Apr 21, 2026
