ArXiv TLDR

RPG: Robust Policy Gating for Smooth Multi-Skill Transitions in Humanoid Fighting

2604.21355

Yucheng Xin, Jiacheng Bao, Yubo Dong, Xueqian Wang, Bin Zhao + 3 more

cs.RO

TLDR

RPG enables humanoids to perform smooth, stable multi-skill fighting transitions by training a unified policy with motion transition randomization and temporal randomization.

Key contributions

  • Introduces RPG, a hybrid expert policy framework for smooth, stable multi-skill transitions in humanoid fighting.
  • Employs motion transition randomization and temporal randomization to train a unified, stable fighting policy.
  • Integrates locomotion with fighting skills for continuous, interruptible combat.
  • Demonstrated effectiveness in simulation and on a real Unitree G1 humanoid robot.
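
The summary does not say how the two randomizations are implemented, so the following is only a minimal sketch of one plausible reading: during training, the reference motion switches to a randomly chosen skill (motion transition randomization) at a randomly chosen time and entry frame (temporal randomization). The skill names, clip lengths, and the function `sample_transition_schedule` are hypothetical and do not come from the paper.

```python
import random

# Hypothetical skill library: name -> clip length in frames (made-up values).
SKILL_FRAMES = {"jab": 40, "hook": 55, "kick": 70, "guard": 30}


def sample_transition_schedule(num_frames, min_hold=10, max_hold=60, rng=None):
    """Return a per-frame list of (skill, frame_in_clip) reference targets.

    Motion transition randomization (assumed form): the next skill is drawn
    uniformly, whether or not its entry pose matches the current pose.
    Temporal randomization (assumed form): the time until the next switch and
    the entry frame in the new clip are random, so a clip can be interrupted
    midway and entered midway.
    """
    if rng is None:
        rng = random.Random()
    skills = sorted(SKILL_FRAMES)
    schedule = []
    while len(schedule) < num_frames:
        skill = rng.choice(skills)
        clip_len = SKILL_FRAMES[skill]
        start = rng.randrange(clip_len)         # random entry frame
        hold = rng.randint(min_hold, max_hold)  # random time until next switch
        for i in range(hold):
            if len(schedule) == num_frames:
                break
            # Loop the clip if the hold outlasts its remaining frames.
            schedule.append((skill, (start + i) % clip_len))
    return schedule


# Example: a 200-frame training episode with a fixed seed for repeatability.
episode = sample_transition_schedule(200, rng=random.Random(0))
```

A policy trained against schedules like this would see mismatched start and end poses as ordinary training data rather than as out-of-domain disturbances, which is the intuition the summary describes.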

Why it matters

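Existing humanoid fighting controllers become unstable at skill transitions, because mismatched initial and terminal states across skills act as out-of-domain disturbances. RPG trains a single policy to handle these transitions smoothly and combines it with walking and running locomotion for continuous, interruptible combat. The approach targets complex, dynamic, long-duration humanoid control tasks and is validated both in simulation and on a real Unitree G1.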

Original Abstract

Humanoid robots have demonstrated impressive motor skills in a wide range of tasks, yet whole-body control for humanlike long-time, dynamic fighting remains particularly challenging due to the stringent requirements on agility and stability. While imitation learning enables robots to execute human-like fighting skills, existing approaches often rely on switching among multiple single-skill policies or employing a general policy to imitate input reference motions. These strategies suffer from instability when transitioning between skills, as the mismatch of initial and terminal states across skills or reference motions introduces out-of-domain disturbances, resulting in unsmooth or unstable behaviors. In this work, we propose RPG, a hybrid expert policy framework, for smooth and stable humanoid multi-skills transition. Our approach incorporates motion transition randomization and temporal randomization to train a unified policy that generates agile fighting actions with stability and smoothness during skill transitions. Furthermore, we design a control pipeline that integrates walking/running locomotion with fighting skills, allowing humanlike long-time combat of arbitrary duration that can be seamlessly interrupted or transit action policies at any time. Extensive experiments in simulation demonstrate the effectiveness of the proposed framework, and real-world deployment on the Unitree G1 humanoid robot further validates its robustness and applicability.
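
The abstract describes a control pipeline in which locomotion and fighting skills can be interrupted or swapped at any time, but gives no interface details. As a rough illustration only, the sketch below shows a gate that selects the active behaviour at every control step and switches immediately on a new request. The `Command` fields, the `SkillGate` class, and the skill names are assumptions made for this example, not the paper's design.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Command:
    """Hypothetical per-step user command."""
    velocity: Tuple[float, float] = (0.0, 0.0)  # planar velocity for locomotion
    skill: Optional[str] = None                 # requested fighting skill, if any


class SkillGate:
    """Tracks which behaviour is active and how long it has been running."""

    def __init__(self) -> None:
        self.active = "locomotion"
        self.phase = 0  # control steps since the active behaviour started

    def step(self, cmd: Command) -> Tuple[str, int]:
        # No skill request means fall back to walking/running.
        target = cmd.skill if cmd.skill is not None else "locomotion"
        if target != self.active:
            # Switch immediately instead of waiting for the clip to finish.
            self.active = target
            self.phase = 0
        else:
            self.phase += 1
        return self.active, self.phase


# Example: walk, start a jab, interrupt it with a kick, then walk again.
gate = SkillGate()
commands = [Command(), Command(skill="jab"), Command(skill="jab"),
            Command(skill="kick"), Command()]
trace = [gate.step(c) for c in commands]
```

In a setup like this, the gate output would condition a single unified policy rather than select among separate single-skill policies, so every switch stays within the states the policy was trained on.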
