Improving Diversity in Black-box Few-shot Knowledge Distillation

April 28, 2026 · arXiv:2604.25795

Tri-Nhan Vo, Dang Nguyen, Kien Do, Sunil Gupta

cs.CV, cs.LG

TLDR

This paper improves black-box few-shot knowledge distillation by using a GAN to generate diverse, high-confidence images, boosting student accuracy.

Key contributions

Addresses black-box few-shot KD, where the teacher's internals are inaccessible and only a few training images are available.
Proposes a GAN training scheme that adaptively selects high-confidence images under the teacher's supervision.
Introduces the selected images into adversarial learning on the fly to expand and diversify the distillation set.
Achieves state-of-the-art results among few-shot KD methods on seven image datasets, significantly boosting student accuracy.
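
The selection idea in the contributions above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_high_confidence`, the fixed threshold of 0.9, and the toy data are all assumptions made for illustration, and the paper describes its selection as adaptive rather than fixed. The sketch only assumes that the black-box teacher returns a probability vector for each generated image.

```python
def select_high_confidence(images, teacher_probs, threshold=0.9):
    """Keep generated images that the teacher labels with high confidence.

    images        -- list of generated samples (any objects)
    teacher_probs -- list of probability vectors returned by the black-box teacher
    threshold     -- hypothetical fixed cut-off; the paper's rule is adaptive
    """
    selected = []
    for image, probs in zip(images, teacher_probs):
        # The teacher's confidence is its largest class probability.
        if max(probs) >= threshold:
            selected.append((image, probs))
    return selected


# Toy data: four placeholder "images" and made-up teacher outputs over three classes.
generated = ["img_a", "img_b", "img_c", "img_d"]
teacher_outputs = [
    [0.95, 0.03, 0.02],
    [0.40, 0.35, 0.25],
    [0.05, 0.92, 0.03],
    [0.34, 0.33, 0.33],
]

# Start from the few real images (soft labels omitted here) and grow the set.
distillation_set = [("real_1", None), ("real_2", None)]
distillation_set.extend(select_high_confidence(generated, teacher_outputs))
```

In this toy run only `img_a` and `img_c` pass the cut-off, so the distillation set grows from two to four entries. In a training loop, a step like this would run on each batch of generator outputs so that the selected images can take part in subsequent adversarial updates.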

Why it matters

Black-box few-shot KD is a practical but challenging setting. Recent approaches generate synthetic images without an active strategy to promote their diversity, a factor that is crucial for student learning. This paper's method expands and diversifies the distillation set, leading to more accurate student models. That matters for deploying smaller, efficient models in real-world scenarios with limited data and restricted teacher access.

Original Abstract

Knowledge distillation (KD) is a well-known technique to effectively compress a large network (teacher) to a smaller network (student) with little sacrifice in performance. However, most KD methods require a large training set and internal access to the teacher, which are rarely available due to various restrictions. These challenges have originated a more practical setting known as black-box few-shot KD, where the student is trained with few images and a black-box teacher. Recent approaches typically generate additional synthetic images but lack an active strategy to promote their diversity, a crucial factor for student learning. To address these problems, we propose a novel training scheme for generative adversarial networks, where we adaptively select high-confidence images under the teacher's supervision and introduce them to the adversarial learning on-the-fly. Our approach helps expand and improve the diversity of the distillation set, significantly boosting student accuracy. Through extensive experiments, we achieve state-of-the-art results among other few-shot KD methods on seven image datasets. The code is available at https://github.com/votrinhan88/divbfkd.


Related papers