Dual Pose-Graph Semantic Localization for Vision-Based Autonomous Drone Racing
David Perez-Saura, Miguel Fernandez-Cortizas, Alvaro J. Gaona, Pascual Campoy
TLDR
A dual pose-graph system improves drone racing localization by fusing odometry and semantic gate detections, significantly reducing drift.
Key contributions
- Introduces a dual pose-graph architecture for robust drone localization in high-speed racing environments.
- Fuses visual-inertial odometry with semantic gate detections to overcome motion blur and feature instability.
- Employs a temporary graph for local gate optimization and a persistent main graph for global consistency.
- Achieves a 56–74% reduction in Absolute Trajectory Error (ATE) over standalone VIO and 10–12% higher accuracy than single-graph baselines.
Why it matters
This paper tackles the challenge of robust drone localization in extreme racing conditions. Its dual pose-graph system significantly improves accuracy and reduces drift compared to existing methods while running in real time onboard. This advancement is crucial for developing reliable high-speed autonomous systems.
Original Abstract
Autonomous drone racing demands robust real-time localization under extreme conditions: high-speed flight, aggressive maneuvers, and payload-constrained platforms that often rely on a single camera for perception. Existing visual SLAM systems, while effective in general scenarios, struggle with motion blur and feature instability inherent to racing dynamics, and do not exploit the structured nature of racing environments. In this work, we present a dual pose-graph architecture that fuses odometry with semantic detections for robust localization. A temporary graph accumulates multiple gate observations between keyframes and optimizes them into a single refined constraint per landmark, which is then promoted to a persistent main graph. This design preserves the information richness of frequent detections while preventing graph growth from degrading real-time performance. The system is designed to be sensor-agnostic, although in this work we validate it using monocular visual-inertial odometry and visual gate detections. Experimental evaluation on the TII-RATM dataset shows a 56% to 74% reduction in ATE compared to standalone VIO, while an ablation study confirms that the dual-graph architecture achieves 10% to 12% higher accuracy than a single-graph baseline at identical computational cost. Deployment in the A2RL competition demonstrated that the system performs real-time onboard localization during flight, reducing the drift of the odometry baseline by up to 4.2 m per lap.
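The core mechanism in the abstract — buffering repeated gate detections in a temporary graph, collapsing them into a single refined constraint per landmark, and promoting that constraint to a persistent main graph — can be sketched as follows. This is an illustrative sketch only: the class and method names are hypothetical, and a component-wise mean stands in for the paper's local pose-graph optimization.

```python
# Hypothetical sketch of the dual pose-graph idea: a temporary graph
# accumulates raw gate observations between keyframes, then collapses
# them into one refined constraint per gate that is promoted to the
# persistent main graph. Not the authors' implementation.
from collections import defaultdict

class DualPoseGraph:
    def __init__(self):
        # temporary graph: gate_id -> list of raw relative-pose detections
        self.temp_observations = defaultdict(list)
        # persistent main graph: (keyframe_id, gate_id, refined_constraint) edges
        self.main_graph = []

    def add_gate_detection(self, gate_id, relative_pose):
        """Buffer a raw semantic detection in the temporary graph."""
        self.temp_observations[gate_id].append(relative_pose)

    def promote_keyframe(self, keyframe_id):
        """At each keyframe, fuse buffered detections into one refined
        constraint per gate (here: component-wise mean as a stand-in for
        least-squares refinement), add it to the main graph, and reset
        the temporary graph so it cannot grow unboundedly."""
        for gate_id, poses in self.temp_observations.items():
            n = len(poses)
            refined = tuple(sum(p[i] for p in poses) / n for i in range(3))
            self.main_graph.append((keyframe_id, gate_id, refined))
        self.temp_observations.clear()
```

Usage: many noisy detections of the same gate become a single main-graph edge, which is how the design keeps the graph small while retaining the information of frequent detections.

```python
g = DualPoseGraph()
g.add_gate_detection("gate_1", (2.0, 0.0, 1.0))
g.add_gate_detection("gate_1", (2.2, 0.1, 1.0))
g.promote_keyframe(0)
# main graph now holds one refined edge for gate_1
```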