RoboEvolve: Co-Evolving Planner-Simulator for Robotic Manipulation with Limited Data

May 13, 2026 · arXiv:2605.13775

Harold Haodong Chen, Sirui Chen, Yingjie Xu, Wenhang Ge, Ying-Cong Chen

cs.RO, cs.CV

TLDR

RoboEvolve co-evolves a VLM planner and a VGM simulator to overcome data scarcity in robotic manipulation, surpassing fully supervised baselines with only 500 unlabeled seed images.

Key contributions

Co-evolves VLM planner & VGM simulator to generate physically grounded data for robotic manipulation.
Uses a dual-phase mechanism (daytime exploration, nighttime consolidation) for robust policy optimization.
Surpasses fully supervised baselines with only 500 unlabeled seed images, a 50x reduction in data.
Raises base planners by 30 absolute points and simulator success by 48% on average, with robust continual learning and no catastrophic forgetting.

Why it matters

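Robotic manipulation is bottlenecked by scarce task-aligned physical interaction data. VLMs and VGMs could synthesize such data autonomously, but VLMs suffer from semantic-spatial misalignment and VGMs from physical hallucinations. RoboEvolve addresses both by coupling a planner and a simulator in a co-evolutionary loop that generates physically grounded data from unlabeled seed images, improving data efficiency and performance and making robotic manipulation more scalable.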

Original Abstract

The scalability of robotic manipulation is fundamentally bottlenecked by the scarcity of task-aligned physical interaction data. While vision-language models (VLMs) and video generation models (VGMs) hold promise for autonomous data synthesis, they suffer from semantic-spatial misalignment and physical hallucinations, respectively. To bridge this gap, we introduce RoboEvolve, a novel framework that couples a VLM planner and a VGM simulator into a mutually reinforcing co-evolutionary loop. Operating purely on unlabeled seed images, RoboEvolve leverages a cognitive-inspired dual-phase mechanism: (i) daytime exploration fosters physically grounded behavioral discovery through a semantic-controlled multi-granular reward, and (ii) nighttime consolidation mines "near-miss" failures to stabilize policy optimization. Guided by an autonomous progressive curriculum, the system naturally scales from simple atomic actions to complex tasks. Extensive experiments demonstrate that RoboEvolve (I) achieves superior effectiveness, elevating base planners by 30 absolute points and amplifying simulator success by 48% on average; (II) exhibits extreme data efficiency, surpassing fully supervised baselines with merely 500 unlabeled seeds--a 50x reduction; and (III) demonstrates robust continual learning without catastrophic forgetting.
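The dual-phase loop described in the abstract can be pictured with a toy sketch. This is not the authors' implementation: the scalar "skill" values stand in for the VLM planner and VGM simulator, the single noisy reward stands in for the semantic-controlled multi-granular reward, and every name, threshold, and update rule below (ToyState, rollout_reward, the 0.6 and 0.4 cutoffs, the 0.7 promotion rate) is a hypothetical choice made only for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyState:
    # All fields are illustrative stand-ins, not quantities from the paper.
    planner_skill: float = 0.4    # proxy for VLM planner quality
    simulator_skill: float = 0.4  # proxy for VGM simulator fidelity
    difficulty: int = 1           # curriculum level (1 = atomic actions)
    near_misses: list = field(default_factory=list)


def rollout_reward(state, rng):
    """Hypothetical scalar reward in [0, 1] for one planned and simulated rollout."""
    base = 0.5 * (state.planner_skill + state.simulator_skill)
    noise = rng.uniform(-0.2, 0.2)
    return max(0.0, min(1.0, base - 0.05 * (state.difficulty - 1) + noise))


def daytime_exploration(state, seeds, rng, success=0.6, near_miss=0.4):
    """Explore from each seed; reinforce successes and store near-miss failures."""
    successes = 0
    for seed in seeds:
        reward = rollout_reward(state, rng)
        if reward >= success:
            successes += 1
            # Both models improve together on successful rollouts.
            state.planner_skill = min(1.0, state.planner_skill + 0.01)
            state.simulator_skill = min(1.0, state.simulator_skill + 0.01)
        elif reward >= near_miss:
            state.near_misses.append((seed, reward))
    return successes / len(seeds)


def nighttime_consolidation(state):
    """Replay stored near misses as a smaller, reward-weighted planner update."""
    replayed = len(state.near_misses)
    for _seed, reward in state.near_misses:
        state.planner_skill = min(1.0, state.planner_skill + 0.005 * reward)
    state.near_misses.clear()
    return replayed


def co_evolve(num_cycles=20, num_seeds=50, rng_seed=0):
    """Alternate day and night phases; raise difficulty once successes are common."""
    rng = random.Random(rng_seed)
    state = ToyState()
    seeds = list(range(num_seeds))  # placeholders for unlabeled seed images
    for _ in range(num_cycles):
        success_rate = daytime_exploration(state, seeds, rng)
        nighttime_consolidation(state)
        if success_rate >= 0.7:  # assumed promotion rule for the curriculum
            state.difficulty += 1
    return state


if __name__ == "__main__":
    final = co_evolve()
    print(final.planner_skill, final.simulator_skill, final.difficulty)
```

In this toy, early rollouts rarely clear the success threshold, so the near misses replayed at night are what first move the planner. That only mirrors the role the abstract assigns to nighttime consolidation and says nothing about how the paper's actual reward or optimization is built.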
