Incremental Semantics-Aided Meshing from LiDAR-Inertial Odometry and RGB Direct Label Transfer
Muhammad Affan, Ville Lehtola, George Vosselman
TLDR
A new RGB+LiDAR pipeline uses vision semantics to incrementally generate high-fidelity 3D meshes, improving reconstruction quality over state-of-the-art geometric baselines.
Key contributions
- Proposes an incremental RGB+LiDAR pipeline for high-fidelity 3D mesh reconstruction in complex indoor scenes.
- Leverages a vision foundation model for direct label transfer from RGB frames onto LiDAR-inertial maps (see the projection sketch after this list).
- Uses semantics-aware TSDF fusion to resolve geometric ambiguities and improve mesh quality at boundaries.
- Outperforms state-of-the-art geometric baselines (ImMesh, Voxblox) in reconstruction quality.
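To make the direct label transfer concrete, here is a minimal sketch of how per-frame semantic labels could be projected onto LiDAR map points. It assumes a pinhole camera with known intrinsics `K` and known pose `T_world_cam`; the function name `transfer_labels` and the `-1` unlabelled convention are illustrative, not taken from the paper.

```python
import numpy as np

def transfer_labels(points_world, seg_mask, T_world_cam, K):
    """Assign per-point semantic labels by projecting LiDAR map points
    into one segmented RGB frame (illustrative direct label transfer).

    points_world : (N, 3) LiDAR points in the odometry map frame
    seg_mask     : (H, W) integer label image from the vision model
    T_world_cam  : (4, 4) camera pose in the map frame
    K            : (3, 3) pinhole camera intrinsics
    """
    H, W = seg_mask.shape

    # Transform map points into the camera frame.
    T_cam_world = np.linalg.inv(T_world_cam)
    pts_h = np.hstack([points_world, np.ones((len(points_world), 1))])
    pts_cam = (T_cam_world @ pts_h.T).T[:, :3]

    labels = np.full(len(points_world), -1, dtype=np.int32)  # -1 = unlabelled
    in_front = pts_cam[:, 2] > 0.1  # discard points behind the camera

    # Pinhole projection into pixel coordinates.
    uv = (K @ pts_cam[in_front].T).T
    uv = uv[:, :2] / uv[:, 2:3]
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    visible = (u >= 0) & (u < W) & (v >= 0) & (v < H)

    # Copy the segmentation label of the pixel each visible point hits.
    idx = np.flatnonzero(in_front)[visible]
    labels[idx] = seg_mask[v[visible], u[visible]]
    return labels
```

Keeping the transfer at frame level, as the paper does, means each incoming RGB observation labels map points independently and can be fused incrementally, rather than requiring a global relabelling pass over the whole map.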
Why it matters
The paper addresses a critical challenge in 3D reconstruction: generating high-fidelity meshes of complex indoor environments. By integrating visual semantics with LiDAR data, it significantly improves geometric accuracy, especially at object boundaries. This matters for producing detailed digital assets for XR, digital modeling, and Universal Scene Description (USD) workflows.
Original Abstract
Geometric high-fidelity mesh reconstruction from LiDAR-inertial scans remains challenging in large, complex indoor environments -- such as cultural buildings -- where point cloud sparsity, geometric drift, and fixed fusion parameters produce holes, over-smoothing, and spurious surfaces at structural boundaries. We propose a modular, incremental RGB+LiDAR pipeline that generates incremental semantics-aided high-quality meshes from indoor scans through scan frame-based direct label transfer. A vision foundation model labels each incoming RGB frame; labels are incrementally projected and fused onto a LiDAR-inertial odometry map; and an incremental semantics-aware Truncated Signed Distance Function (TSDF) fusion step produces the final mesh via marching cubes. This frame-level fusion strategy preserves the geometric fidelity of LiDAR while leveraging rich visual semantics to resolve geometric ambiguities at reconstruction boundaries caused by LiDAR point-cloud sparsity and geometric drift. We demonstrate that semantic guidance improves geometric reconstruction quality; quantitative evaluation is therefore performed using geometric metrics on the Oxford Spires dataset, while results from the NTU VIRAL dataset are analyzed qualitatively. The proposed method outperforms state-of-the-art geometric baselines ImMesh and Voxblox, demonstrating the benefit of semantics-aided fusion for geometric mesh quality. The resulting semantically labelled meshes are of value when reconstructing Universal Scene Description (USD) assets, offering a path from indoor LiDAR scanning to XR and digital modeling.
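The abstract describes a semantics-aware TSDF fusion step but not its exact update rule. One plausible reading is that the semantic label modulates the truncation band per voxel observation, so thin boundary-prone structures get tighter bands than large planar surfaces. The sketch below illustrates that idea only; the class table and the `integrate_observation` helper are hypothetical, not the paper's actual scheme.

```python
import numpy as np

# Hypothetical per-class truncation distances in metres: tighter bands for
# thin, boundary-prone classes, wider bands for large planar surfaces.
TRUNC_BY_CLASS = {
    0: 0.10,   # wall / floor
    1: 0.04,   # door or window frame
    -1: 0.08,  # unlabelled fallback
}

def integrate_observation(tsdf, weight, voxel, sdf, label, max_weight=50.0):
    """Fuse one signed-distance observation into a voxel grid using a
    class-dependent truncation band (an illustrative reading of
    'semantics-aware TSDF fusion', not the paper's exact rule)."""
    trunc = TRUNC_BY_CLASS.get(label, TRUNC_BY_CLASS[-1])
    if sdf < -trunc:
        return  # far behind the observed surface: do not integrate
    d = min(1.0, sdf / trunc)  # truncate and normalise to [-1, 1]
    w_old = weight[voxel]
    tsdf[voxel] = (tsdf[voxel] * w_old + d) / (w_old + 1.0)
    weight[voxel] = min(w_old + 1.0, max_weight)

# Minimal usage: a 32^3 grid, one observation at a 'door frame' voxel.
tsdf = np.ones((32, 32, 32), dtype=np.float32)
weight = np.zeros_like(tsdf)
integrate_observation(tsdf, weight, (16, 16, 16), sdf=0.01, label=1)
```

A final marching-cubes pass over such a grid, as named in the abstract, would then extract the mesh; the per-class bands are where the semantics would help sharpen structural boundaries.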