Improving Diversity in Black-box Few-shot Knowledge Distillation
Tri-Nhan Vo, Dang Nguyen, Kien Do, Sunil Gupta
TLDR
This paper improves black-box few-shot knowledge distillation by using a GAN to generate diverse, high-confidence images, boosting student accuracy.
Key contributions
- Addresses black-box few-shot KD, where the teacher's internals are inaccessible and training data is scarce.
- Proposes a GAN training scheme that adaptively selects generated images the teacher labels with high confidence, promoting diversity.
- Introduces selected images to adversarial learning on-the-fly to expand the distillation set.
- Achieves state-of-the-art results among few-shot KD methods on seven image datasets.
Why it matters
Black-box few-shot KD is a practical but challenging setting: the student is trained with only a few images and without access to the teacher's internals. Recent approaches generate synthetic images but have no active strategy to make them diverse. This method expands the distillation set and improves its diversity, which raises student accuracy. That helps when deploying smaller, efficient models where data and teacher access are limited.
Original Abstract
Knowledge distillation (KD) is a well-known technique to effectively compress a large network (teacher) to a smaller network (student) with little sacrifice in performance. However, most KD methods require a large training set and internal access to the teacher, which are rarely available due to various restrictions. These challenges have originated a more practical setting known as black-box few-shot KD, where the student is trained with few images and a black-box teacher. Recent approaches typically generate additional synthetic images but lack an active strategy to promote their diversity, a crucial factor for student learning. To address these problems, we propose a novel training scheme for generative adversarial networks, where we adaptively select high-confidence images under the teacher's supervision and introduce them to the adversarial learning on-the-fly. Our approach helps expand and improve the diversity of the distillation set, significantly boosting student accuracy. Through extensive experiments, we achieve state-of-the-art results among other few-shot KD methods on seven image datasets. The code is available at https://github.com/votrinhan88/divbfkd.
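The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes the black-box teacher returns class probabilities, and it swaps the paper's adaptive selection rule, which the abstract does not specify, for a fixed confidence threshold. The function names `select_high_confidence` and `expand_pool`, the `threshold` value, and the toy arrays are all hypothetical.

```python
import numpy as np


def select_high_confidence(images, teacher_probs, threshold=0.9):
    """Keep generated images whose top teacher probability meets the threshold.

    Assumption: a fixed threshold stands in for the paper's adaptive rule.
    """
    confidence = teacher_probs.max(axis=1)  # top-class probability per image
    keep = confidence >= threshold
    return images[keep], teacher_probs[keep]


def expand_pool(pool_images, pool_probs, new_images, new_probs):
    """Append the selected images and their teacher outputs to the pool."""
    return (
        np.concatenate([pool_images, new_images], axis=0),
        np.concatenate([pool_probs, new_probs], axis=0),
    )


# Toy data: 4 few-shot images and 3 generated images, 3 classes, 8x8 pixels.
rng = np.random.default_rng(0)
few_shot_images = rng.random((4, 8, 8))
few_shot_probs = np.eye(3)[[0, 1, 2, 0]]

generated_images = rng.random((3, 8, 8))
# Stand-in for querying the black-box teacher on the generated images.
generated_probs = np.array(
    [
        [0.95, 0.03, 0.02],  # confident: kept
        [0.40, 0.35, 0.25],  # uncertain: dropped
        [0.05, 0.92, 0.03],  # confident: kept
    ]
)

selected_images, selected_probs = select_high_confidence(
    generated_images, generated_probs
)
pool_images, pool_probs = expand_pool(
    few_shot_images, few_shot_probs, selected_images, selected_probs
)
print(pool_images.shape, pool_probs.shape)
```

In a full training loop, a step like this would run repeatedly so that the growing pool feeds back into adversarial training and also serves as the distillation set for the student.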