ArXiv TLDR

Deployment-Aligned Low-Precision Neural Architecture Search for Spaceborne Edge AI

arXiv:2604.24492

Parampuneet Kaur Thind, Vaibhav Katturu, Giacomo Zema, Roberto Del Prete

cs.CV, cs.AI, cs.ET, cs.LG, cs.NE

TLDR

This paper integrates deployment-aligned low-precision (FP16) training directly into hardware-aware NAS, so candidate architectures are optimized under the same numerical constraints they face at deployment, substantially reducing the accuracy lost to low-precision conversion on edge AI devices.

Key contributions

  • Integrates deployment-aligned low-precision training directly into hardware-aware Neural Architecture Search (NAS).
  • Exposes candidate architectures to FP16 numerical constraints during fine-tuning and evaluation phases.
  • Achieves joint optimization of architectural efficiency and numerical robustness for edge AI.
  • Recovers roughly two-thirds of the accuracy lost to post-training FP16 conversion (0.78 → 0.826 mIoU, vs. 0.85 at full precision) on a spaceborne vessel-segmentation task, without increasing model complexity.
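To make the second bullet concrete: the paper does not publish its training code here, but a minimal, stdlib-only sketch of what "exposing candidates to FP16 numerical constraints" can mean is to round every intermediate value to IEEE 754 half precision, so the optimizer sees deployment numerics rather than full-precision ones. The function names below (`to_fp16`, `dot_fp16`) are illustrative, not from the paper:

```python
import struct

def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value,
    using struct's 'e' (binary16) format code."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

def dot_fp16(w, x):
    """Dot product with FP16 rounding after every multiply and accumulate,
    coarsely mimicking the numerics of a low-precision accelerator."""
    acc = 0.0
    for wi, xi in zip(w, x):
        acc = to_fp16(acc + to_fp16(to_fp16(wi) * to_fp16(xi)))
    return acc
```

A loss or evaluation metric computed through a path like this differs slightly from its FP32 counterpart (e.g. `to_fp16(0.1)` is 0.0999755859375, not 0.1); fine-tuning against such a path is what lets the search select architectures that are robust to those rounding effects.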

Why it matters

Most hardware-aware NAS pipelines search under full-precision assumptions and convert to low precision only after the search, so deployed models lose accuracy. This paper instead applies the deployment numerics during the search itself, substantially improving on-device performance. That matters for robust, efficient AI on resource-constrained edge devices, such as those flown on spacecraft.
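The "two-thirds" claim can be checked directly from the mIoU figures reported in the abstract (0.85 full precision on-device, 0.78 after post-training conversion, 0.826 with deployment-aligned training); the variable names below are illustrative:

```python
fp32_miou = 0.85    # on-device mIoU before precision conversion
ptq_miou  = 0.78    # after post-training FP16 conversion
dalp_miou = 0.826   # with deployment-aligned low-precision training

gap = fp32_miou - ptq_miou         # 0.07 mIoU lost to conversion
recovered = dalp_miou - ptq_miou   # 0.046 mIoU regained
print(round(recovered / gap, 2))   # ~0.66, i.e. roughly two-thirds
```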

Original Abstract

Designing deep networks that meet strict latency and accuracy constraints on edge accelerators increasingly relies on hardware-aware optimization, including neural architecture search (NAS) guided by device-level metrics. Yet most hardware-aware NAS pipelines still optimize architectures under full-precision assumptions and apply low-precision adaptation only after the search, leading to a mismatch between optimization-time behavior and deployment-time execution on low-precision hardware that can substantially degrade accuracy. We address this limitation by integrating deployment-aligned low-precision training directly into hardware-aware NAS. Candidate architectures are exposed to FP16 numerical constraints during fine-tuning and evaluation, enabling joint optimization of architectural efficiency and numerical robustness without modifying the search space or evolutionary strategy. We evaluate the proposed framework on vessel segmentation for spaceborne maritime monitoring, targeting the Intel Movidius Myriad X Visual Processing Unit (VPU). While post-training precision conversion reduces on-device performance from 0.85 to 0.78 mIoU, deployment-aligned low-precision training achieves 0.826 mIoU on-device for the same architecture (95,791 parameters), recovering approximately two-thirds of deployment-induced accuracy gap without increasing model complexity. These results demonstrate that incorporating deployment-consistent numerical constraints into hardware-aware NAS substantially improves robustness and alignment between optimization and deployment for resource-constrained edge Artificial Intelligence (AI).
