KinDER: A Physical Reasoning Benchmark for Robot Learning and Planning
Yixuan Huang, Bowen Li, Vaibhav Saxena, Yichao Liang, Utkarsh Aashu Mishra + 7 more
TLDR
KinDER is a benchmark for robot learning and planning with 25 procedurally generated environments that test kinematic and dynamic physical reasoning.
Key contributions
- Introduces KinDER, a benchmark with 25 procedurally generated environments for robot physical reasoning.
- Isolates five core physical reasoning challenges, disentangled from perception and language.
- Ships a Gymnasium-compatible Python library with parameterized skills and demonstrations, plus a standardized evaluation suite with 13 baselines.
- Empirical evaluation shows that existing methods fail to solve many of the environments, indicating substantial gaps in current approaches to physical reasoning.
Why it matters
KinDER offers a standardized benchmark for systematically evaluating physical reasoning in robotics. Its environments isolate individual challenges from perception and language understanding, and its 13 baselines allow comparison across task and motion planning, imitation learning, reinforcement learning, and foundation-model-based approaches. The empirical results show that current robot learning and planning methods fail to solve many of the environments.
Original Abstract
Robotic systems that interact with the physical world must reason about kinematic and dynamic constraints imposed by their own embodiment, their environment, and the task at hand. We introduce KinDER, a benchmark for Kinematic and Dynamic Embodied Reasoning that targets physical reasoning challenges arising in robot learning and planning. KinDER comprises 25 procedurally generated environments, a Gymnasium-compatible Python library with parameterized skills and demonstrations, and a standardized evaluation suite with 13 implemented baselines spanning task and motion planning, imitation learning, reinforcement learning, and foundation-model-based approaches. The environments are designed to isolate five core physical reasoning challenges: basic spatial relations, nonprehensile multi-object manipulation, tool use, combinatorial geometric constraints, and dynamic constraints, disentangled from perception, language understanding, and application-specific complexity. Empirical evaluation shows that existing methods struggle to solve many of the environments, indicating substantial gaps in current approaches to physical reasoning. We additionally include real-to-sim-to-real experiments on a mobile manipulator to assess the correspondence between simulation and real-world physical interaction. KinDER is fully open-sourced and intended to enable systematic comparison across diverse paradigms for advancing physical reasoning in robotics. Website and code: https://prpl-group.com/kinder-site/
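The abstract describes a Gymnasium-compatible library and a standardized evaluation suite. As a rough illustration of what evaluating a method against such an interface can look like, here is a minimal sketch of a success-rate loop over seeded, procedurally generated episodes. It is not KinDER's actual API: `ToyReachEnv`, `success_rate`, and `greedy_policy` are hypothetical names invented for this example, and the toy environment only mimics the Gymnasium `reset`/`step` conventions so that the snippet runs without any dependencies.

```python
import random
from typing import Callable, Iterable, Optional, Tuple

Obs = Tuple[int, int]  # (current position, goal position)


class ToyReachEnv:
    """Hypothetical stand-in, NOT a KinDER environment.

    Mimics the Gymnasium conventions: reset(seed=...) returns (obs, info) and
    step(action) returns (obs, reward, terminated, truncated, info).
    """

    def __init__(self, max_steps: int = 20) -> None:
        self.max_steps = max_steps
        self.pos = 0
        self.goal = 0
        self.t = 0

    def reset(self, seed: Optional[int] = None) -> Tuple[Obs, dict]:
        # The seed determines the generated goal, so episodes are reproducible.
        rng = random.Random(seed)
        self.pos, self.goal, self.t = 0, rng.randint(-5, 5), 0
        return (self.pos, self.goal), {}

    def step(self, action: int):
        self.pos += max(-1, min(1, action))  # clip the action to {-1, 0, 1}
        self.t += 1
        terminated = self.pos == self.goal  # task solved
        truncated = (not terminated) and self.t >= self.max_steps  # out of time
        reward = 1.0 if terminated else 0.0
        return (self.pos, self.goal), reward, terminated, truncated, {}


def success_rate(env, policy: Callable[[Obs], int], seeds: Iterable[int]) -> float:
    """Fraction of seeded episodes that end by solving the task."""
    seeds = list(seeds)
    successes = 0
    for seed in seeds:
        obs, _ = env.reset(seed=seed)
        terminated = truncated = False
        while not (terminated or truncated):
            obs, _, terminated, truncated, _ = env.step(policy(obs))
        # In this toy environment, termination only happens at the goal.
        successes += int(terminated)
    return successes / len(seeds)


def greedy_policy(obs: Obs) -> int:
    """Move one step toward the goal (sign of goal - pos)."""
    pos, goal = obs
    return (goal > pos) - (goal < pos)


if __name__ == "__main__":
    print(success_rate(ToyReachEnv(), greedy_policy, range(10)))
```

Fixing the seed list keeps the comparison between methods fair, since every policy faces the same generated episodes. With a real Gymnasium-compatible environment, the same loop would apply as long as the environment reports success through `terminated` or an `info` field, which is an assumption that would need to be checked against the benchmark's documentation.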