DA-PTQ: Drift-Aware Post-Training Quantization for Efficient Vision-Language-Action Models
Siyuan Xu, Tianshi Wang, Fengling Li, Lei Zhu, Heng Tao Shen
TLDR
DA-PTQ enables efficient deployment of Vision-Language-Action models on resource-limited robots by mitigating the kinematic drift that quantization induces during sequential control.
Key contributions
- Identifies temporal error accumulation as the cause of kinematic drift in quantized VLAs.
- Introduces Cross-Space Representation Compensation for consistent multimodal-to-action mapping.
- Uses Motion-Driven Mixed-Precision Allocation to minimize trajectory-level motion errors.
- Achieves near full-precision performance with low-bit quantization for efficient VLA deployment.
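The allocation idea in the third bullet can be sketched as a toy greedy procedure. Everything below is hypothetical: the function name, the exponential error model, and the per-layer sensitivities are invented for illustration, since this digest does not spell out the paper's actual objective or algorithm.

```python
# Hypothetical sketch of motion-driven mixed-precision allocation:
# greedily raise the bit-width of the layer whose (assumed, precomputed)
# trajectory-level motion-error sensitivity gives the largest error
# reduction per extra bit, under an average-bit budget.

def allocate_bits(sensitivity, budget_avg_bits, choices=(2, 4, 8)):
    """Greedy allocation: start every layer at the lowest bit-width,
    then promote the layer with the best error reduction per bit."""
    n = len(sensitivity)
    bits = [min(choices)] * n
    total_budget = budget_avg_bits * n

    def err(i, b):
        # Toy error model: sensitivity scaled by quantization step 2^-b.
        return sensitivity[i] * 2.0 ** (-b)

    while sum(bits) < total_budget:
        best, best_gain, best_bits = None, 0.0, None
        for i, b in enumerate(bits):
            higher = [c for c in choices
                      if c > b and sum(bits) - b + c <= total_budget]
            if not higher:
                continue
            nb = min(higher)
            gain = (err(i, b) - err(i, nb)) / (nb - b)
            if gain > best_gain:
                best, best_gain, best_bits = i, gain, nb
        if best is None:
            break
        bits[best] = best_bits
    return bits

# Layers with higher motion-error sensitivity receive more bits.
print(allocate_bits([5.0, 0.5, 3.0, 0.1], budget_avg_bits=4))
```

Under this toy model the most motion-sensitive layer ends up at 8 bits while insensitive layers stay at 2, keeping the average at the 4-bit budget.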
Why it matters
Vision-Language-Action models are critical for embodied AI, but their high memory and computational demands, together with quantization-induced kinematic drift, hinder deployment on resource-limited robots. By addressing this drift directly, DA-PTQ makes low-bit deployment of powerful VLAs practical.
Original Abstract
Vision-Language-Action models (VLAs) have demonstrated strong potential for embodied AI, yet their deployment on resource-limited robots remains challenging due to high memory and computational demands. While Post-Training Quantization (PTQ) provides an efficient solution, directly applying PTQ to VLAs often results in severe performance degradation during sequential control. We identify temporal error accumulation as a key factor, where quantization perturbations at the vision-language-to-action interface are progressively amplified, leading to kinematic drift in executed trajectories. To address this issue, we propose Drift-Aware Post-Training Quantization (DA-PTQ), which formulates quantization as a drift-aware optimization problem over sequential decision processes. DA-PTQ consists of two components: (1) Cross-Space Representation Compensation, which mitigates structured distortions between multimodal representations and action space to improve action consistency, and (2) Motion-Driven Mixed-Precision Allocation, which assigns bit-widths by minimizing trajectory-level motion errors. Extensive experiments show that DA-PTQ significantly reduces kinematic drift and achieves comparable performance to full-precision models under low-bit settings, enabling practical deployment of VLAs on resource-limited robotic platforms.
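The temporal error accumulation the abstract describes can be illustrated with a toy simulation. The quantizer, the action distribution, and every number below are invented for illustration and are not from the paper: per-step quantization error is bounded, but the executed trajectory integrates it, so small perturbations compound into drift.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200                                            # control steps
true_actions = rng.normal(0.0, 0.05, size=(T, 2))  # toy 2-D velocity commands

def quantize(x, bits):
    """Uniform symmetric quantizer over a fixed range (toy model)."""
    scale = 0.2 / (2 ** (bits - 1) - 1)
    return np.round(x / scale) * scale

def drift(bits):
    """Final gap between the quantized and full-precision trajectories."""
    pos_true = np.cumsum(true_actions, axis=0)     # integrate actions
    pos_q = np.cumsum(quantize(true_actions, bits), axis=0)
    return float(np.linalg.norm(pos_q[-1] - pos_true[-1]))

# Coarser quantization yields visibly larger end-of-trajectory drift,
# even though each individual action error stays small.
for b in (8, 4):
    print(f"{b}-bit final drift: {drift(b):.4f}")
```

This is only the failure mode, not the fix; DA-PTQ's compensation and allocation components are aimed at keeping exactly this accumulated trajectory error small.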