Mitigating Error Amplification in Fast Adversarial Training
Mengnan Zhao, Lihe Zhang, Bo Wang, Tianhang Zheng, Hong Zhong, et al.
TLDR
Distribution-aware Dynamic Guidance (DDG) mitigates catastrophic overfitting and improves the robustness-accuracy trade-off in Fast Adversarial Training by dynamically adjusting perturbation budgets and supervision signals.
Key contributions
- Identifies low-confidence samples as the primary cause of catastrophic overfitting and of the robustness-accuracy trade-off.
- Proposes Distribution-aware Dynamic Guidance (DDG) for Fast Adversarial Training.
- DDG dynamically scales the perturbation magnitude with each sample's confidence at the ground-truth class, guiding samples toward consistent decision boundaries (see the sketch after this list).
- DDG adjusts the supervision signal based on each sample's prediction state, preventing overemphasis on incorrect labels.
- A weighted regularization constraint stabilizes gradients under the dynamic guidance.
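To make the perturbation-scaling idea concrete, here is a minimal PyTorch sketch of a single-step (FGSM-style) attack with a per-sample budget. The paper's exact scaling rule is not reproduced here; `confidence_scaled_fgsm`, `eps_max`, and the assumption that the budget grows linearly with ground-truth confidence are all illustrative choices, not the authors' formulation.

```python
import torch
import torch.nn.functional as F

def confidence_scaled_fgsm(model, x, y, eps_max=8 / 255):
    """Single-step adversarial example with a per-sample perturbation budget.

    Illustrative assumption: the budget scales linearly with the model's
    confidence at the ground-truth class, so low-confidence samples receive
    smaller perturbations. The paper's actual mapping may differ.
    """
    with torch.no_grad():
        probs = F.softmax(model(x), dim=1)
        conf = probs.gather(1, y.unsqueeze(1)).squeeze(1)  # p(y_true | x)
    eps = (eps_max * conf).view(-1, 1, 1, 1)               # per-sample budget

    delta = torch.zeros_like(x, requires_grad=True)
    loss = F.cross_entropy(model(x + delta), y)
    grad = torch.autograd.grad(loss, delta)[0]
    delta = (eps * grad.sign()).detach()                   # FGSM-style step
    return (x + delta).clamp(0, 1)
```

In this sketch, samples the model is unsure about are perturbed less aggressively, which matches the paper's diagnosis that low-confidence samples drive catastrophic overfitting when pushed with a full budget.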
Why it matters
This paper tackles two critical issues in Fast Adversarial Training: catastrophic overfitting and the robustness-accuracy trade-off. By dynamically adjusting the training signals each sample receives, it improves robustness while better preserving clean accuracy, making models more reliable against adversarial attacks.
Original Abstract
Fast Adversarial Training (FAT) has proven effective in enhancing model robustness by encouraging networks to learn perturbation-invariant representations. However, FAT often suffers from catastrophic overfitting (CO), where the model overfits to the training attack and fails to generalize to unseen ones. Moreover, robustness-oriented optimization typically leads to notable performance degradation on clean inputs, and such degradation becomes increasingly severe as the perturbation budget grows. In this work, we conduct a comprehensive analysis of how guidance strength affects model performance by modulating perturbation and supervision levels across distinct confidence groups. The findings reveal that low-confidence samples are the primary contributors to CO and the robustness-accuracy trade-off. Building on this insight, we propose a Distribution-aware Dynamic Guidance (DDG) strategy that dynamically adjusts both the perturbation budget and supervision signal. Specifically, DDG scales the perturbation magnitude according to the sample confidence at the ground-truth class, thereby guiding samples toward consistent decision boundaries while mitigating the influence of learning spurious correlations. Simultaneously, it dynamically adjusts the supervision signal based on the prediction state of each sample, preventing overemphasis on incorrect signals. To alleviate potential gradient instability arising from dynamic guidance, we further design a weighted regularization constraint. Extensive experiments on standard benchmarks demonstrate that DDG effectively alleviates both CO and the robustness-accuracy trade-off.
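The abstract's second mechanism, adjusting supervision by prediction state, can also be sketched. Below is one plausible instantiation in PyTorch: for samples the model already misclassifies on clean inputs, the one-hot target is blended with the model's own (detached) prediction so the hard label is not over-emphasized. The helper name `dynamic_supervision_loss` and the blending weight `alpha` are hypothetical; the paper's actual rule is not specified here.

```python
import torch
import torch.nn.functional as F

def dynamic_supervision_loss(logits_adv, logits_clean, y, alpha=0.5):
    """Soft cross-entropy with a prediction-state-dependent target.

    Illustrative assumption: misclassified samples get a softened target
    (a mix of the one-hot label and the clean prediction), while correctly
    predicted samples keep the full one-hot label.
    """
    with torch.no_grad():
        pred = logits_clean.argmax(dim=1)
        wrong = (pred != y).float().unsqueeze(1)       # 1 for misclassified samples
        one_hot = F.one_hot(y, logits_adv.size(1)).float()
        soft = F.softmax(logits_clean, dim=1)
        target = one_hot * (1 - alpha * wrong) + soft * (alpha * wrong)
    log_prob = F.log_softmax(logits_adv, dim=1)
    return -(target * log_prob).sum(dim=1).mean()      # soft cross-entropy
```

Because `target` stays a valid probability distribution for every sample, this drop-in replacement for cross-entropy weakens the pull of an incorrect hard label exactly on the low-confidence samples the paper identifies as problematic.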