AOI-SSL: Self-Supervised Framework for Efficient Segmentation of Wire-bonded Semiconductors In Optical Inspection
Joaquín Figueira, Rob Van Gastel, Giacomo D'Amicantonio, Zhuoran Liu, Ioan Gabriel Bucur + 2 more
TLDR
AOI-SSL is a self-supervised framework for efficient semantic segmentation of wire-bonded semiconductors, reducing labeled data needs and improving adaptation.
Key contributions
- Introduces AOI-SSL, combining self-supervised pre-training and in-context inference for semiconductor segmentation.
- Masked Autoencoders are most effective for small-data self-supervised pre-training in this industrial domain.
- Proposes patch-level retrieval for mask prediction, enabling near-instant adaptation to single device images.
- Self-supervised pre-training significantly improves segmentation quality over training from scratch or ImageNet.
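The patch-level retrieval idea above can be sketched as a simple nearest-neighbor label transfer over dense encoder embeddings. This is a minimal illustration, not the paper's implementation: the function name, array shapes, and cosine-similarity choice are assumptions, standing in for whatever similarity-based retrieval the authors use.

```python
import numpy as np

def retrieve_patch_masks(query_emb, support_emb, support_labels):
    """Nearest-neighbor patch label transfer via cosine similarity.

    query_emb:      (Nq, D) dense patch embeddings of the query image
    support_emb:    (Ns, D) patch embeddings of labeled support image(s)
    support_labels: (Ns,)   per-patch class labels of the support patches
    """
    # L2-normalize rows so the dot product equals cosine similarity
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    s = support_emb / np.linalg.norm(support_emb, axis=1, keepdims=True)
    sim = q @ s.T                      # (Nq, Ns) similarity matrix
    nearest = sim.argmax(axis=1)       # most similar support patch per query patch
    return support_labels[nearest]     # (Nq,) predicted per-patch labels
```

Because prediction is just a similarity lookup against stored embeddings, adding a new labeled device image requires no gradient updates, which is what makes near-instant adaptation possible.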
Why it matters
Current semiconductor inspection models are device-specific and require costly retraining. AOI-SSL offers a training-efficient solution by minimizing labeled data needs and enabling rapid adaptation to new devices or shifts. This significantly reduces operational overhead in automated optical inspection.
Original Abstract
Segmentation models in automated optical inspection of wire-bonded semiconductors are typically device-specific and must be re-trained when new devices or distribution shifts appear. We introduce AOI-SSL, a training-efficient framework for semantic segmentation of wire-bonded semiconductors that combines small-domain self-supervised pre-training of vision transformers with in-context inference, minimizing the need for labeled examples. We pre-train SOTA self-supervised algorithms on a small industrial inspection dataset and find that Masked Autoencoders are the most effective in this small-data setting, improving downstream segmentation while reducing the labeled fine-tuning effort. We further introduce in-context, patch-level retrieval methods that predict masks directly from dense encoder embeddings with negligible additional training. We show that, in this setting, simple similarity-based retrieval performs on par with the more complex attention-based aggregation currently used in the literature. Furthermore, our experiments demonstrate that self-supervised pre-training significantly improves segmentation quality compared to training from scratch and to ImageNet-pre-trained backbones under a fixed fine-tuning computational budget. Finally, the results reveal that retrieval-based segmentation outperforms fine-tuning when targeting single device images, allowing for near-instant adaptation to difficult samples.
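The Masked Autoencoder pre-training the abstract highlights rests on one core operation: randomly hiding most patch tokens and asking a decoder to reconstruct them. A minimal sketch of that masking step, assuming ViT-style patch tokens (the function name, shapes, and default 75% ratio are illustrative, not taken from the paper):

```python
import numpy as np

def random_masking(patch_tokens, mask_ratio=0.75, rng=None):
    """Keep a random subset of patch tokens, as in MAE-style pre-training.

    patch_tokens: (N, D) array of patch embeddings for one image
    Returns the visible tokens, their indices, and the masked indices.
    """
    rng = rng if rng is not None else np.random.default_rng()
    n = patch_tokens.shape[0]
    n_keep = int(n * (1 - mask_ratio))
    perm = rng.permutation(n)
    keep_idx = np.sort(perm[:n_keep])   # visible patches fed to the encoder
    mask_idx = np.sort(perm[n_keep:])   # patches the decoder must reconstruct
    return patch_tokens[keep_idx], keep_idx, mask_idx
```

Because the reconstruction target comes from the image itself, no labels are needed, which is why this style of pre-training suits a small industrial dataset where annotation is the bottleneck.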