Robotics
Research on robot control, manipulation, navigation, and human-robot interaction.
cs.RO · 524 papers
SafeManip: A Property-Driven Benchmark for Temporal Safety Evaluation in Robotic Manipulation
SafeManip is a new benchmark that uses LTLf to evaluate temporal safety in robotic manipulation, revealing that strong models often behave unsafely.
GuidedVLA: Specifying Task-Relevant Factors via Plug-and-Play Action Attention Specialization
GuidedVLA enhances VLA models by specializing action attention heads with auxiliary signals to focus on task-relevant factors, improving generalization and robustness.
Real-Time Whole-Body Teleoperation of a Humanoid Robot Using IMU-Based Motion Capture with Sim2Sim and Sim2Real Validation
This paper presents a real-time whole-body teleoperation system for humanoid robots using IMU-based motion capture, validated through sim2sim and sim2real experiments.
EgoEV-HandPose: Egocentric 3D Hand Pose Estimation and Gesture Recognition with Stereo Event Cameras
EgoEV-HandPose uses stereo event cameras and a new dataset for robust egocentric 3D hand pose estimation and gesture recognition, outperforming RGB-based approaches.
SI-Diff: A Framework for Learning Search and High-Precision Insertion with a Force-Domain Diffusion Policy
SI-Diff uses a force-domain diffusion policy with mode-conditioning to learn both robotic search and high-precision insertion tasks in a single framework.
TMRL: Diffusion Timestep-Modulated Pretraining Enables Exploration for Efficient Policy Finetuning
TMRL introduces diffusion timestep-modulated pretraining to enable efficient exploration and finetuning of robot policies, improving sample efficiency.
Morphologically Equivariant Flow Matching for Bimanual Mobile Manipulation
This paper introduces a morphologically equivariant flow matching policy that leverages bilateral symmetry for improved bimanual mobile manipulation.
TriBand-BEV: Real-Time LiDAR-Only 3D Pedestrian Detection via Height-Aware BEV and High-Resolution Feature Fusion
TriBand-BEV introduces a real-time LiDAR-only 3D pedestrian detection method using a height-aware BEV encoding, outperforming prior methods on KITTI.
DexTwist: Dexterous Hand Retargeting for Twist Motion via Mixed Reality-based Teleoperation
DexTwist is a mixed reality-based teleoperation framework that improves dexterous robot hand performance for contact-rich twist motions.
From Imagined Futures to Executable Actions: Mixture of Latent Actions for Robot Manipulation
MoLA transforms imagined robot manipulation videos into executable actions by inferring a mixture of latent actions via inverse dynamics models.
X-Imitator: Spatial-Aware Imitation Learning via Bidirectional Action-Pose Interaction
X-Imitator introduces a bidirectional framework for robotic manipulation, tightly coupling spatial perception and action generation for improved performance.
Premover: Fast Vision-Language-Action Control by Acting Before Instructions Are Complete
Premover speeds up Vision-Language-Action policies by enabling robots to start acting before user instructions are fully complete, reducing idle time.
World Action Models: The Next Frontier in Embodied AI
This survey introduces World Action Models (WAMs), a new embodied AI paradigm that unifies predictive state modeling with action generation, and provides a systematic overview of the area.
Learning What Matters: Adaptive Information-Theoretic Objectives for Robot Exploration
QOED improves robot exploration by adaptively identifying and prioritizing observable parameter directions, suppressing nuisance effects for better learning.
Control of Fully Actuated Aerial Vehicles: A Comparison of Model-based and Sensor-based Dynamic Inversion
This paper compares model-based (geometric NDI) and sensor-based (INDI) dynamic inversion for fully actuated aerial vehicles, finding INDI more robust.
RoboBlockly Studio: Conversational Block Programming with Embodied Robot Feedback for Computational Thinking
RoboBlockly Studio combines block programming, conversational AI, and embodied robots to improve computational thinking education.
Closing the Motion Execution Gap: From Semantic Motion Task Constraints to Kinematic Control
This paper addresses the Motion Execution Gap by translating high-level semantic task constraints into executable robot motions via Motion Statecharts.
Cooperative Robotics Reinforced by Collective Perception for Traffic Moderation
This paper introduces a cooperative humanoid robot that uses collective perception and V2X to moderate traffic and prevent collisions at non-line-of-sight intersections.
From Reaction to Anticipation: Proactive Failure Recovery through Agentic Task Graph for Robotic Manipulation
AgentChord enables proactive robotic failure recovery using an agentic task graph with anticipatory branches, improving manipulation success and efficiency.
EvoNav: Evolutionary Reward Function Design for Robot Navigation with Large Language Models
EvoNav uses LLMs and an efficient three-stage evolutionary framework to automatically design superior reward functions for robot navigation.