Robustness of Vision Foundation Models to Common Perturbations
Hongbin Liu, Zhengyuan Jiang, Cheng Hong, Neil Zhenqiang Gong
TLDR
This paper systematically studies vision foundation model robustness to common image perturbations, finding them generally non-robust, and proposes a fine-tuning solution.
Key contributions
- Systematically studies vision foundation model robustness to common image perturbations.
- Proposes three new robustness metrics with five desired mathematical properties.
- Evaluates six industry-scale VFMs, finding them generally non-robust to common edits.
- Shows perturbations degrade downstream performance and proposes robustness-improving fine-tuning.
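The perturbations studied here are everyday image edits rather than adversarial attacks. As a minimal sketch (not the paper's exact operations), brightness and contrast adjustments can be written as simple pixel transforms on a float image in [0, 1]:

```python
import numpy as np

def adjust_brightness(img: np.ndarray, delta: float) -> np.ndarray:
    """Shift all pixel values by delta, clipping back into [0, 1]."""
    return np.clip(img + delta, 0.0, 1.0)

def adjust_contrast(img: np.ndarray, factor: float) -> np.ndarray:
    """Scale pixel values around the image mean; factor > 1 increases contrast."""
    mean = img.mean()
    return np.clip(mean + factor * (img - mean), 0.0, 1.0)
```

Feeding the original and edited image through the same encoder and comparing the two embeddings is the basic setup the paper's robustness evaluation builds on.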
Why it matters
Vision foundation models are crucial for many applications, but their susceptibility to common image edits can severely impact real-world performance. This work provides critical insights into their limitations, offers new evaluation tools, and proposes a practical solution to build more reliable AI systems.
Original Abstract
A vision foundation model outputs an embedding vector for an image, which can be affected by common editing operations (e.g., JPEG compression, brightness and contrast adjustments). These common perturbations alter embedding vectors and may degrade the performance of downstream tasks that use these embeddings. In this work, we present the first systematic study of foundation models' robustness to such perturbations. We propose three robustness metrics and formulate five desired mathematical properties for these metrics, analyzing which properties each metric satisfies or violates. Using these metrics, we evaluate six industry-scale foundation models (from OpenAI and Meta) across nine common perturbation categories, finding them generally non-robust. We also show that common perturbations degrade downstream application performance (e.g., classification accuracy) and that robustness values can predict the magnitude of these impacts. Finally, we propose a fine-tuning approach that improves robustness without sacrificing utility.
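The digest does not spell out the paper's three metrics, but the underlying idea is embedding stability: how close an image's embedding stays under a perturbation. As a rough illustration only (the function names, the toy linear encoder, and the Gaussian-noise perturbation are all assumptions, not the paper's definitions), one simple stability score is the average cosine similarity between embeddings of original and perturbed images:

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def robustness(embed, images, perturb) -> float:
    """Average cosine similarity between embeddings of original and
    perturbed images; values near 1.0 mean the embedding barely moves."""
    sims = [cosine_similarity(embed(x), embed(perturb(x))) for x in images]
    return float(np.mean(sims))

# Toy stand-ins: a random linear "encoder" and a small Gaussian-noise edit.
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 64))
embed = lambda x: W @ x
perturb = lambda x: x + 0.01 * rng.normal(size=x.shape)
images = [rng.normal(size=64) for _ in range(8)]
score = robustness(embed, images, perturb)
```

A metric like this also suggests why robustness values can predict downstream impact: the further embeddings drift under an edit, the more a classifier built on those embeddings is likely to suffer.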