Xinghang Li
2 papers · Latest:
Robotics
Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising
X-WAM is a unified 4D world model that combines real-time robotic action with high-fidelity 4D synthesis using video priors and asynchronous denoising.
2604.26694
Robotics
Multi-View Video Diffusion Policy: A 3D Spatio-Temporal-Aware Video Action Model
MV-VDP is a multi-view video diffusion policy that jointly models 3D spatio-temporal states for data-efficient, robust robotic manipulation.
2604.03181