Robotics
Research on robot control, manipulation, navigation, and human-robot interaction.
cs.RO · 524 papers
123D: Unifying Multi-Modal Autonomous Driving Data at Scale
123D is an open-source framework that unifies diverse multi-modal autonomous driving datasets through a single API, enabling scalable data access.
6D Pose Estimation via Keypoint Heatmap Regression with RGB-D Residual Neural Networks
This paper proposes a 6D pose estimation framework using keypoint heatmap regression, achieving high accuracy with RGB-D fusion.
Active Embodiment Identification with Reinforcement Learning for Legged Robots
This paper presents a reinforcement learning method that actively identifies the embodiment parameters of legged robots through interaction.
Evaluation of an Actuated Spine in Agile Quadruped Locomotion
This paper empirically shows that an actuated spine significantly enhances the agility and obstacle negotiation capabilities of quadruped robots.
TAVIS: A Benchmark for Egocentric Active Vision and Anticipatory Gaze in Imitation Learning
TAVIS is a new benchmark for active vision in imitation learning, offering task suites and metrics to evaluate gaze control in robotic manipulation.
AERO-VIS: Asynchronous Event-based Real-time Onboard Visual-Inertial SLAM
AERO-VIS is an asynchronous, real-time event-inertial SLAM system enabling accurate onboard UAV control and state estimation.
Melding LLM and temporal logic for reliable human-swarm collaboration in complex scenarios
This paper introduces a neuro-symbolic framework combining LLMs and temporal logic for reliable, low-overhead human-swarm collaboration in dynamic environments.
Many-to-Many Multi-Agent Pickup and Delivery
This paper introduces M2M, a novel algorithm for many-to-many multi-agent pickup and delivery in warehouses, outperforming prior methods.
Text-to-CAD Evaluation with CADTests
This paper introduces CADTestBench, the first test-based benchmark that uses CADTests to evaluate and guide Text-to-CAD model generation.
NoiseGate: Learning Per-Latent Timestep Schedules as Information Gating in World Action Models
NoiseGate introduces a learnable per-latent timestep schedule as an information-gating policy for World Action Models, improving robot manipulation.
Sensitivity-Based Robust NMPC for Close-Proximity Offshore Wind Turbine Inspection with a Tilted Multirotor
This paper proposes a sensitivity-based robust NMPC for close-proximity offshore wind turbine inspection that prevents safety violations under uncertainty.
CommandSwarm: Safety-Aware Natural Language-to-Behavior-Tree Generation for Robotic Swarms
CommandSwarm enables safety-aware natural language control of robotic swarms by generating validated behavior trees using adapted LLMs.
Offline-Online Hierarchical 3D Global Relocalization With Synthetic LiDAR Sensing and Descriptor-Space Retrieval
This paper introduces an offline-online hierarchical framework for fast 3D global relocalization using synthetic LiDAR and descriptor-space retrieval.
Drifting Field Policy: A One-Step Generative Policy via Wasserstein Gradient Flow
DFP is a new one-step generative policy using Wasserstein gradient flow, achieving state-of-the-art performance on manipulation tasks.
Finite-Time Analysis of MCTS in Continuous POMDP Planning
This paper provides a finite-time analysis for MCTS in POMDPs, introducing Voro-POMCPOW for continuous observation spaces with theoretical guarantees.
PhySPRING: Structure-Preserving Reduction of Physics-Informed Twins via GNN
PhySPRING uses a GNN to efficiently reduce the complexity of physics-informed digital twins, preserving structure for faster, high-fidelity simulations.
Operating Within the Operational Design Domain: Zero-Shot Perception with Vision-Language Models
This paper demonstrates how Vision-Language Models can perform zero-shot perception of Operational Design Domain elements, enhancing safety for autonomous systems.
BrickCraft: Visuomotor Skill Composition with Situated Manual Guidance for Long-Horizon Interlocking Brick Assembly
BrickCraft is a compositional framework enabling robots to assemble complex interlocking brick structures by decomposing tasks into reusable, spatially guided skills.
MemCompiler: Compile, Don't Inject -- State-Conditioned Memory for Embodied Agents
MemCompiler dynamically compiles state-conditioned memory for embodied agents, improving performance and efficiency over static memory injection.
How to utilize failure demo data?: Effective data selection for imitation learning using distribution differences in attention mechanism
This paper proposes a method to effectively use failure demonstration data in imitation learning by learning success-failure discrepancies in attention mechanisms.