CUBic: Coordinated Unified Bimanual Perception and Control Framework

May 13, 2026 · arXiv:2605.13452

Xingyu Wang, Pengxiang Ding, Jingkai Xu, Donglin Wang, Zhaoxin Fan

cs.RO, cs.AI

TLDR

CUBic is a framework for bimanual robot control that treats arm coordination as a unified perceptual modeling problem and outperforms state-of-the-art visuomotor baselines on the RoboTwin benchmark.

Key contributions

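Reformulates bimanual coordination as a unified perceptual modeling problem that spans perception and control.
Learns a shared tokenized representation in which arm independence and coordination emerge from structure rather than from hand-crafted coupling.
Integrates unidirectional perception aggregation, bidirectional perception coordination through two codebooks with a shared mapping, and a unified perception-to-control diffusion policy.
Improves coordination accuracy and task success rates on the RoboTwin benchmark over state-of-the-art visuomotor baselines.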

Why it matters

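Bimanual manipulation requires each arm to perceive independently while both arms interact in a coordinated way. Existing methods typically favor one side: they either decouple the arms to avoid interference or enforce strong cross-arm coupling. CUBic addresses both requirements within a single learned representation, which the authors report leads to higher coordination accuracy and task success on RoboTwin.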

Original Abstract

Recent advances in visuomotor policy learning have enabled robots to perform control directly from visual inputs. Yet, extending such end-to-end learning from single-arm to bimanual manipulation remains challenging due to the need for both independent perception and coordinated interaction between arms. Existing methods typically favor one side -- either decoupling the two arms to avoid interference or enforcing strong cross-arm coupling for coordination -- thus lacking a unified treatment. We propose CUBic, a Coordinated and Unified framework for Bimanual perception and control that reformulates bimanual coordination as a unified perceptual modeling problem. CUBic learns a shared tokenized representation bridging perception and control, where independence and coordination emerge intrinsically from structure rather than from hand-crafted coupling. Our approach integrates three components: unidirectional perception aggregation, bidirectional perception coordination through two codebooks with shared mapping, and a unified perception-to-control diffusion policy. Extensive experiments on the RoboTwin benchmark show that CUBic consistently surpasses standard baselines, achieving marked improvements in coordination accuracy and task success rates over state-of-the-art visuomotor baselines.
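The abstract does not give implementation details, so the following is only a minimal sketch of one possible reading of "two codebooks with shared mapping": each arm's feature vector is quantized against its own codebook (keeping the arms independent), and both selected codes are then projected by the same matrix into a common token space (providing the shared structure). All function names, the nearest-neighbor quantization, and the linear form of the shared mapping are assumptions made for illustration, not the authors' method; the aggregation step and the diffusion policy are omitted.

```python
def nearest_code(vec, codebook):
    """Return the index of the codebook entry closest to vec (squared L2)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sqdist(vec, codebook[i]))


def apply_shared_map(code, weights):
    """Project a code vector with a matrix shared by both arms (assumed linear)."""
    return [sum(w * c for w, c in zip(row, code)) for row in weights]


def tokenize_bimanual(left_feat, right_feat, left_cb, right_cb, shared_w):
    """Quantize each arm with its own codebook, then map both codes
    through the same projection. Hypothetical helper, not from the paper."""
    li = nearest_code(left_feat, left_cb)    # per-arm lookup: independence
    ri = nearest_code(right_feat, right_cb)
    left_tok = apply_shared_map(left_cb[li], shared_w)    # same weights for
    right_tok = apply_shared_map(right_cb[ri], shared_w)  # both arms: sharing
    return (li, ri), left_tok + right_tok


# Toy data: 2-D features, two codes per arm, shared 2 -> 3 projection.
left_cb = [[0, 0], [1, 1]]
right_cb = [[0, 1], [1, 0]]
shared_w = [[1, 0], [0, 1], [1, 1]]

indices, tokens = tokenize_bimanual([0.9, 0.8], [0.1, 0.9],
                                    left_cb, right_cb, shared_w)
print(indices, tokens)
```

In a full pipeline under these assumptions, the concatenated token vector would serve as the conditioning input to the diffusion policy that produces actions for both arms.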
