Wenbo Ding
5 papers · Latest:
OA-WAM: Object-Addressable World Action Model for Robust Robot Manipulation
OA-WAM introduces an object-addressable world action model that decomposes scenes into persistent object slots for robust robot manipulation under scene shifts.
Thinking in Text and Images: Interleaved Vision-Language Reasoning Traces for Long-Horizon Robot Manipulation
IVLR introduces an interleaved vision-language reasoning trace for long-horizon robot manipulation, achieving high success on complex tasks.
Walk With Me: Long-Horizon Social Navigation for Human-Centric Outdoor Assistance
Walk with Me is a map-free framework enabling robots to perform safe, long-horizon social navigation outdoors using high-level human instructions.
Learning Human-Intention Priors from Large-Scale Human Demonstrations for Robotic Manipulation
MoT-HRA learns human-intention priors from 2.2M human video demonstrations to enable robust robotic manipulation through a hierarchical vision-language-action framework.
Agent-Centric Visual Reinforcement Learning under Dynamic Perturbations
ACO-MoE robustifies visual RL against dynamic perturbations by using agent-centric restoration experts, achieving near-clean performance on a new benchmark.