Closing the Domain Gap in Biomedical Imaging by In-Context Control Samples
Ana Sanchez-Fernandez, Thomas Pinetz, Werner Zellinger, Günter Klambauer
TLDR
CS-ARM-BN exploits negative control samples to close the domain gap in biomedical imaging, making deep learning models robust across experimental batches.
Key contributions
- Introduces CS-ARM-BN, a meta-learning method leveraging negative control samples for adaptation.
- Validates on MoA classification with the JUMP-CP dataset, lifting cross-batch accuracy from 0.862 to 0.935 (in-domain: 0.939).
- Outperforms standard ResNets and foundation models, which fail to handle batch effects.
- Stabilizes meta-learning under strong domain shifts using always-available control samples.
Why it matters
Batch effects hinder reproducibility and deep learning in biomedical imaging. This method uses inherent control samples to adapt models, enabling reliable cross-batch performance and practical deployment.
Original Abstract
The central problem in biomedical imaging is batch effects: systematic technical variations unrelated to the biological signal of interest. These batch effects critically undermine experimental reproducibility and are the primary cause of failure of deep learning systems on new experimental batches, preventing their practical use in the real world. Despite years of research, no method has succeeded in closing this performance gap for deep learning models. We propose Control-Stabilized Adaptive Risk Minimization via Batch Normalization (CS-ARM-BN), a meta-learning adaptation method that exploits negative control samples. Such unperturbed reference images are present in every experimental batch by design and serve as stable context for adaptation. We validate our novel method on Mechanism-of-Action (MoA) classification, a crucial task for drug discovery, on the large-scale JUMP-CP dataset. The accuracy of standard ResNets drops from 0.939 $\pm$ 0.005 on the training domain to 0.862 $\pm$ 0.060 on data from new experimental batches. Foundation models, even after Typical Variation Normalization, fail to close this gap. We are the first to show that meta-learning approaches close the domain gap by achieving 0.935 $\pm$ 0.018. If the new experimental batches exhibit strong domain shifts, such as being generated in a different lab, meta-learning approaches can be stabilized with control samples, which are always available in biomedical experiments. Our work shows that batch effects in bioimaging data can be effectively neutralized through principled in-context adaptation, which also makes deep learning models practically usable and efficient.
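The core idea of adapting via batch statistics of control samples can be sketched in a few lines. This is a minimal illustration under assumptions, not the authors' implementation: it re-estimates per-feature normalization statistics (as a batch-norm layer would) from each batch's negative controls and applies them to the treated samples; the function and variable names are hypothetical.

```python
from statistics import mean, pstdev

def normalize_with_controls(controls, treatments, eps=1e-5):
    """Normalize treatment features using per-feature mean/std
    estimated from the batch's negative-control samples only.

    controls, treatments: lists of equal-length feature vectors.
    Returns the treatment vectors shifted/scaled into the controls'
    frame, removing batch-specific technical variation.
    """
    n_feats = len(controls[0])
    # Per-feature statistics from the always-available controls.
    mu = [mean(c[j] for c in controls) for j in range(n_feats)]
    sd = [pstdev([c[j] for c in controls]) for j in range(n_feats)]
    # Standardize each treatment sample with the control statistics.
    return [[(x[j] - mu[j]) / (sd[j] + eps) for j in range(n_feats)]
            for x in treatments]
```

Because the controls are unperturbed by design, their statistics capture only the batch's technical variation, so standardizing against them leaves the biological signal of the treated samples intact.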