RobotEQ: Transitioning from Passive Intelligence to Active Intelligence in Embodied AI

May 7, 2026 · arXiv:2605.06234

Kuofei Fang, Xinyi Che, Haomin Ouyang, Shufan Zhang, Xuehao Wang + 10 more

cs.RO, cs.HC

TLDR

RobotEQ introduces the first benchmark for active intelligence, assessing whether embodied AI can understand and adhere to social norms without explicit commands.

Key contributions

Introduces RobotEQ, the first benchmark to evaluate active intelligence in embodied AI for social norm compliance.
Constructs RobotEQ-Data, a dataset with 1,900 egocentric images and over 6,600 action judgment and spatial grounding questions.
Establishes RobotEQ-Bench to evaluate state-of-the-art models, revealing current limitations in active intelligence, especially spatial grounding.
Shows that RAG techniques incorporating external social norm knowledge can improve model performance.

Why it matters

This paper addresses a critical gap in embodied AI by shifting focus from user-guided tasks to autonomous social compliance. RobotEQ provides a foundational benchmark and dataset to drive research in active intelligence, enabling robots to integrate more seamlessly and safely into human environments. It highlights current model shortcomings and promising avenues for future development.

Original Abstract

Embodied AI is a prominent research topic in both academia and industry. Current research centers on completing tasks based on explicit user instructions. However, for robots to integrate into human society, they must understand which actions are permissible and which are prohibited, even without explicit commands. We refer to the user-guided AI as passive intelligence and the unguided AI as active intelligence. This paper introduces RobotEQ, the first benchmark for active intelligence, aiming to assess whether existing models can comprehend and adhere to social norms in embodied scenarios. First, we construct RobotEQ-Data, a dataset consisting of 1,900 egocentric images, spanning 10 representative embodied categories and 56 subcategories. Through extensive manual annotation, we provide 5,353 action judgment questions and 1,286 spatial grounding questions, specifying appropriate robot actions across diverse scenarios. Furthermore, we establish RobotEQ-Bench to evaluate the performance of state-of-the-art models on this task. Experimental results show that current models still fall short in achieving reliable active intelligence, particularly in spatial grounding. Meanwhile, we observe that leveraging RAG techniques to incorporate external social norm knowledge bases can generally enhance performance. This work can facilitate the transition of robotics from user-guided passive manipulation to active social compliance.
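
The abstract does not describe how the RAG setup works, so the following is only a minimal sketch of the general idea: retrieve the social norms most relevant to a scene and prepend them to an action judgment prompt. Everything here is an assumption made for illustration. The `NORMS` entries, the word-overlap scoring, and the `retrieve` and `build_prompt` helpers are hypothetical and are not taken from RobotEQ.

```python
import re

# Hypothetical norm knowledge base; entries are illustrative, not from RobotEQ.
NORMS = [
    "Do not enter a bathroom while it is occupied.",
    "Do not move personal belongings on a desk without permission.",
    "Keep quiet near a person who is sleeping.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query, norms, k=1):
    """Return the k norms sharing the most words with the query.

    Word overlap stands in for a real retriever (an assumption);
    an embedding-based search would normally replace it.
    """
    query_words = tokenize(query)
    ranked = sorted(
        norms,
        key=lambda norm: len(query_words & tokenize(norm)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(scene, action, norms, k=1):
    """Assemble an action judgment prompt with retrieved norms as context."""
    retrieved = retrieve(f"{scene} {action}", norms, k)
    context = "\n".join(f"- {norm}" for norm in retrieved)
    return (
        "Relevant social norms:\n"
        f"{context}\n"
        f"Scene: {scene}\n"
        f"Proposed action: {action}\n"
        "Is this action appropriate? Answer yes or no."
    )


prompt = build_prompt(
    "A person is sleeping on the sofa.",
    "Start vacuuming the floor.",
    NORMS,
)
print(prompt)
```

The resulting prompt would then be passed, together with the egocentric image, to the model under evaluation. That step is omitted here because the abstract does not specify the models or their interfaces.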

