Seeing Through the Tool: A Controlled Benchmark for Occlusion Robustness in Foundation Segmentation Models
Nhan Ho, Luu Le, Thanh-Huy Nguyen, Thien Nguyen, Xiaofeng Liu + 1 more
TLDR
OccSAM-Bench evaluates foundation segmentation models' robustness to surgical occlusion, revealing distinct behaviors and guiding model selection for clinical intent.
Key contributions
- Introduces OccSAM-Bench, a benchmark for evaluating SAM models under controlled surgical occlusion.
- Simulates two occlusion types (tool overlay, cutout) with three severity levels on polyp datasets.
- Proposes a novel three-region evaluation protocol for full, visible-only, and invisible targets.
- Reveals two model archetypes: Occluder-Aware models, which segment only visible tissue, and Occluder-Agnostic models, which predict amodally into occluded regions.
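The cutout occlusion described above can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: the function name `apply_cutout`, the square-patch shape, and the severity parameterization (patch area as a fraction of the target's bounding box) are all assumptions; the paper calibrates three severity levels by an unspecified procedure.

```python
import numpy as np

def apply_cutout(image, target_mask, severity, rng=None):
    """Synthesize a cutout occlusion over the target.

    Hypothetical sketch: blanks a square patch whose area is roughly
    `severity` times the target's bounding-box area. Returns the
    occluded image and the boolean occlusion mask.
    """
    rng = np.random.default_rng(rng)
    occluded = image.copy()
    ys, xs = np.nonzero(target_mask)
    if ys.size == 0:  # no target: nothing to occlude
        return occluded, np.zeros_like(target_mask, dtype=bool)
    # Bounding box of the target (e.g., a polyp).
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    h, w = y1 - y0, x1 - x0
    # Patch sides scaled so patch area ~= severity * bbox area.
    side_y = max(1, int(round(h * np.sqrt(severity))))
    side_x = max(1, int(round(w * np.sqrt(severity))))
    # Random top-left corner inside the bounding box.
    py = rng.integers(y0, max(y0 + 1, y1 - side_y + 1))
    px = rng.integers(x0, max(x0 + 1, x1 - side_x + 1))
    occ_mask = np.zeros_like(target_mask, dtype=bool)
    occ_mask[py:py + side_y, px:px + side_x] = True
    occluded[occ_mask] = 0  # blank the occluded pixels
    return occluded, occ_mask
```

A tool-overlay occlusion would follow the same pattern, compositing an instrument sprite over the target instead of blanking pixels.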
Why it matters
This paper addresses the critical, underexplored challenge of occlusion in clinical endoscopy for foundation segmentation models. OccSAM-Bench provides a systematic evaluation framework, revealing distinct model behaviors under occlusion. This guides informed model selection based on specific clinical intent, improving reliability in surgical settings.
Original Abstract
Occlusion, where target structures are partially hidden by surgical instruments or overlapping tissues, remains a critical yet underexplored challenge for foundation segmentation models in clinical endoscopy. We introduce OccSAM-Bench, a benchmark designed to systematically evaluate SAM-family models under controlled, synthesized surgical occlusion. Our framework simulates two occlusion types (i.e., surgical tool overlay and cutout) across three calibrated severity levels on three public polyp datasets. We propose a novel three-region evaluation protocol that decomposes segmentation performance into full, visible-only, and invisible targets. This metric exposes behaviors that standard amodal evaluation obscures, revealing two distinct model archetypes: Occluder-Aware models (SAM, SAM 2, SAM 3, MedSAM3), which prioritize visible tissue delineation and reject instruments, and Occluder-Agnostic models (MedSAM, MedSAM2), which confidently predict into occluded regions. SAM-Med2D aligns with neither and underperforms across all conditions. Ultimately, our results demonstrate that occlusion robustness is not uniform across architectures, and model selection must be driven by specific clinical intent: whether prioritizing conservative visible-tissue segmentation or the amodal inference of hidden anatomy.
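The three-region decomposition in the abstract can be made concrete with a small sketch. The exact metric definitions below are an assumption (IoU restricted to each region); the paper may use Dice or other variants, and `three_region_iou` is a hypothetical name.

```python
import numpy as np

def three_region_iou(pred, gt, occ):
    """Decompose IoU into full, visible-only, and invisible regions.

    pred, gt, occ: boolean arrays of the same shape; `occ` marks
    pixels covered by the synthetic occluder. A sketch of the
    three-region idea, not the paper's exact protocol.
    """
    def iou(p, g):
        inter = np.logical_and(p, g).sum()
        union = np.logical_or(p, g).sum()
        return inter / union if union else float("nan")

    visible = ~occ
    return {
        "full": iou(pred, gt),                         # amodal: whole target
        "visible": iou(pred & visible, gt & visible),  # target outside occluder
        "invisible": iou(pred & occ, gt & occ),        # target under occluder
    }
```

Under this decomposition, an Occluder-Aware model that segments only visible tissue scores high on "visible" but near zero on "invisible", while an Occluder-Agnostic model that predicts amodally scores high on "full" and "invisible"; a single aggregate score would conflate the two behaviors.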