ArXiv TLDR

Robotics

Research on robot control, manipulation, navigation, and human-robot interaction.

cs.RO · 524 papers

SafeManip: A Property-Driven Benchmark for Temporal Safety Evaluation in Robotic Manipulation

SafeManip is a new benchmark that uses LTLf (Linear Temporal Logic over finite traces) to evaluate temporal safety in robotic manipulation, revealing that strong models often behave unsafely.

2605.12386 · May 12, 2026 · Chengyue Huang, Khang Vo Huynh, Sebastian Elbaum +2

GuidedVLA: Specifying Task-Relevant Factors via Plug-and-Play Action Attention Specialization

GuidedVLA enhances VLA models by specializing action attention heads with auxiliary signals to focus on task-relevant factors, improving generalization and robustness.

2605.12369 · May 12, 2026 · Xiaosong Jia, Bowen Yang, Zuhao Ge +17

Real-Time Whole-Body Teleoperation of a Humanoid Robot Using IMU-Based Motion Capture with Sim2Sim and Sim2Real Validation

This paper presents a real-time whole-body teleoperation system for humanoid robots using IMU-based motion capture, validated in both simulation (Sim2Sim) and real-world (Sim2Real) experiments.

2605.12347 · May 12, 2026 · Hamza Ahmed Durrani, Suleman Khan

EgoEV-HandPose: Egocentric 3D Hand Pose Estimation and Gesture Recognition with Stereo Event Cameras

EgoEV-HandPose uses stereo event cameras and a new dataset for robust egocentric 3D hand pose estimation and gesture recognition, outperforming RGB-based approaches.

2605.12297 · May 12, 2026 · Luming Wang, Hao Shi, Jiajun Zhai +2

SI-Diff: A Framework for Learning Search and High-Precision Insertion with a Force-Domain Diffusion Policy

SI-Diff uses a force-domain diffusion policy with mode-conditioning to learn both robotic search and high-precision insertion tasks in a single framework.

2605.12247 · May 12, 2026 · Yibo Liu, Stanko Oparnica, Simon Shewchun-Jakaitis +5

TMRL: Diffusion Timestep-Modulated Pretraining Enables Exploration for Efficient Policy Finetuning

TMRL introduces diffusion timestep-modulated pretraining to enable efficient exploration and finetuning of robot policies, improving sample efficiency.

2605.12236 · May 12, 2026 · Matthew M. Hong, Jesse Zhang, Anusha Nagabandi +1

Morphologically Equivariant Flow Matching for Bimanual Mobile Manipulation

This paper introduces a morphologically equivariant flow matching policy that leverages bilateral symmetry for improved bimanual mobile manipulation.

2605.12228 · May 12, 2026 · Max Siebenborn, Daniel Ordoñez Apraez, Sophie Lueth +4

TriBand-BEV: Real-Time LiDAR-Only 3D Pedestrian Detection via Height-Aware BEV and High-Resolution Feature Fusion

TriBand-BEV introduces a real-time LiDAR-only 3D pedestrian detection method using a height-aware BEV encoding, outperforming prior methods on KITTI.

2605.12220 · May 12, 2026 · Mohammad Khoshkdahan, Alexey Vinel

DexTwist: Dexterous Hand Retargeting for Twist Motion via Mixed Reality-based Teleoperation

DexTwist is a mixed reality-based teleoperation framework that improves dexterous robot hand performance for contact-rich twist motions.

2605.12182 · May 12, 2026 · Dongmyoung Lee, Chengxi Li, Dongheui Lee

From Imagined Futures to Executable Actions: Mixture of Latent Actions for Robot Manipulation

MoLA transforms imagined robot manipulation videos into executable actions by inferring a mixture of latent actions via inverse dynamics models.

2605.12167 · May 12, 2026 · Yajie Li, Bozhou Zhang, Chun Gu +5

X-Imitator: Spatial-Aware Imitation Learning via Bidirectional Action-Pose Interaction

X-Imitator introduces a bidirectional framework for robotic manipulation, tightly coupling spatial perception and action generation for improved performance.

2605.12162 · May 12, 2026 · Kai Xiong, Hongjie Fang, Lixin Yang +1

Premover: Fast Vision-Language-Action Control by Acting Before Instructions Are Complete

Premover speeds up Vision-Language-Action policies by letting robots begin acting before a user's instruction is fully spoken, reducing idle time.

2605.12160 · May 12, 2026 · Joonha Park, Jiseung Jeong, Taesik Gong

World Action Models: The Next Frontier in Embodied AI

This survey introduces World Action Models (WAMs), a new embodied AI paradigm unifying predictive state modeling with action generation, providing a systematic overview.

2605.12090 · May 12, 2026 · Siyin Wang, Junhao Shi, Zhaoyang Fu +11

Learning What Matters: Adaptive Information-Theoretic Objectives for Robot Exploration

QOED improves robot exploration by adaptively identifying and prioritizing observable parameter directions, suppressing nuisance effects for better learning.

2605.12084 · May 12, 2026 · Youwei Yu, Jionghao Wang, Zhengming Yu +2

Control of Fully Actuated Aerial Vehicles: A Comparison of Model-based and Sensor-based Dynamic Inversion

This paper compares model-based (geometric NDI) and sensor-based (INDI) dynamic inversion for fully actuated aerial vehicles, finding INDI more robust.

2605.12071 · May 12, 2026 · Ali Sidar Yilmaz, Buday Turan, Lukas Pries +1

RoboBlockly Studio: Conversational Block Programming with Embodied Robot Feedback for Computational Thinking

RoboBlockly Studio combines block programming, conversational AI, and embodied robots to improve computational thinking education.

2605.12059 · May 12, 2026 · Leyi Li, Chenyu Du, Jiafei Sun +2

Closing the Motion Execution Gap: From Semantic Motion Task Constraints to Kinematic Control

This paper closes the Motion Execution Gap by translating high-level semantic task constraints into executable robot motions via Motion Statecharts.

2605.12053 · May 12, 2026 · Simon Stelter, Vanessa Hassouna, Malte Huerkamp +1

Cooperative Robotics Reinforced by Collective Perception for Traffic Moderation

This paper introduces a cooperative humanoid robot that uses collective perception and V2X to moderate traffic and prevent collisions at non-line-of-sight intersections.

2605.11972 · May 12, 2026 · Mohammad Khoshkdahan, John Pravin Arockiasamy, Andy Flores Comeca +1

From Reaction to Anticipation: Proactive Failure Recovery through Agentic Task Graph for Robotic Manipulation

AgentChord enables proactive robotic failure recovery using an agentic task graph with anticipatory branches, improving manipulation success and efficiency.

2605.11951 · May 12, 2026 · Sheng Xu, Ruixing Jin, Huayi Zhou +6

EvoNav: Evolutionary Reward Function Design for Robot Navigation with Large Language Models

EvoNav uses large language models within an efficient three-stage evolutionary framework to automatically design superior reward functions for robot navigation.

2605.11859 · May 12, 2026 · Zhikai Zhao, Chuanbo Hua, Federico Berto +4
