Height-Guided Projection Reparameterization for Camera-LiDAR Occupancy
Yuan Wu, Zhiqiang Yan, Jiawei Lian, Zhengxue Wang, Jian Yang
TLDR
HiPR improves camera-LiDAR 3D occupancy prediction by using a LiDAR-derived BEV height map to adapt each pillar's sampling range, outperforming prior state-of-the-art methods while keeping inference real-time.
Key contributions
- Introduces HiPR, a camera-LiDAR framework for 3D occupancy prediction.
- Employs a BEV height map from LiDAR to adaptively reparameterize projection space.
- Redistributes projected points to geometrically meaningful regions, improving feature aggregation.
- Uses Progressive Height Conditioning to stabilize training with noisy LiDAR-derived heights.
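The height-guided sampling described in the contributions above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the grid layout, the uniform placement of samples below the pillar top, and the fallback to the full range for empty cells are all assumptions made for illustration.

```python
import numpy as np

def bev_max_height(points, x_range, y_range, grid_shape):
    """Rasterize LiDAR points of shape (N, 3) into a BEV map of max z per cell.

    Returns (height, valid); cells without any point hold -inf and are
    flagged False in `valid`. (Hypothetical helper, not from the paper.)
    """
    nx, ny = grid_shape
    ix = np.floor((points[:, 0] - x_range[0]) / (x_range[1] - x_range[0]) * nx).astype(int)
    iy = np.floor((points[:, 1] - y_range[0]) / (y_range[1] - y_range[0]) * ny).astype(int)
    inside = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    height = np.full((nx, ny), -np.inf)
    # Unbuffered in-place maximum, so repeated cell indices are all applied.
    np.maximum.at(height, (ix[inside], iy[inside]), points[inside, 2])
    return height, np.isfinite(height)

def reparameterized_heights(height, valid, z_min, z_max, num_samples):
    """Per-pillar sample heights, spread uniformly over [z_min, pillar top].

    Assumption: invalid (empty) cells fall back to the fixed range
    [z_min, z_max] instead of using a misleading height.
    """
    top = np.where(valid, np.clip(height, z_min, z_max), z_max)
    t = (np.arange(num_samples) + 0.5) / num_samples  # bin centers in (0, 1)
    return z_min + t[None, None, :] * (top - z_min)[..., None]
```

With a fixed projection space every pillar would receive the same `num_samples` heights over `[z_min, z_max]`; here a pillar whose highest LiDAR return is at 2 m spends all of its samples below 2 m.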
Why it matters
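Most 3D occupancy methods lift image features through a fixed projection space, sampling reference points uniformly along each pillar. That sampling copes poorly with scene sparsity and height variation, which leads to ambiguous correspondences and unreliable feature aggregation. By adapting each pillar's sampling range to a LiDAR height prior, HiPR draws features from geometrically meaningful regions instead. The authors report that it outperforms existing state-of-the-art methods while maintaining real-time inference, which matters for autonomous driving.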
Original Abstract
3D occupancy prediction aims to infer dense, voxel-wise scene semantics from sensor observations, where the 2D-to-3D view transformation serves as a crucial step in bridging image features and volumetric representations. Most previous methods rely on a fixed projection space, where 3D reference points are uniformly sampled along pillars. However, such sampling struggles to capture the sparsity and height variations of real-world scenes, leading to ambiguous correspondences and unreliable feature aggregation. To address these challenges, we propose HiPR, a camera-LiDAR occupancy framework with Height-Guided Projection Reparameterization. HiPR first encodes LiDAR into a BEV height map to capture the maximum height of the point cloud. HiPR then adjusts the sampling range of each pillar using the height prior, enabling adaptive reparameterization of the projection space. As a result, the projected points are redistributed into geometrically meaningful regions rather than fixed ranges. Meanwhile, we mask out the invalid parts of the height map to avoid misleading the feature aggregation. In addition, to alleviate the training instability caused by noisy LiDAR-derived heights, we introduce a training-time Progressive Height Conditioning strategy, which gradually transitions the conditioning signal from ground-truth heights to LiDAR heights. Extensive experiments demonstrate that HiPR consistently outperforms existing state-of-the-art methods while maintaining real-time inference. The code and pretrained models can be found at https://github.com/Rayn-Wu/HiPR.
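The abstract says that Progressive Height Conditioning gradually moves the conditioning signal from ground-truth heights to LiDAR heights during training, but it does not give the schedule. The sketch below assumes a simple linear blend; `lidar_weight`, `conditioned_height`, and `transition_frac` are hypothetical names, and the actual strategy in the paper may differ.

```python
import numpy as np

def lidar_weight(step, total_steps, transition_frac=0.5):
    """Weight given to LiDAR heights at a training step.

    Assumed schedule: 0 at step 0, rising linearly to 1 once
    transition_frac of training has elapsed, then held at 1.
    """
    end = max(1, int(transition_frac * total_steps))
    return min(1.0, step / end)

def conditioned_height(gt_height, lidar_height, step, total_steps, transition_frac=0.5):
    """Blend ground-truth and LiDAR BEV height maps for the current step."""
    w = lidar_weight(step, total_steps, transition_frac)
    return (1.0 - w) * gt_height + w * lidar_height
```

Under this assumption the model starts training on clean ground-truth heights and finishes on LiDAR heights alone, matching the inference-time input, where no ground truth is available.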