Geometry-Aware State Space Model: A New Paradigm for Whole-Slide Image Representation

arXiv:2605.05164

Enhui Chai, Sicheng Chen, Tianyi Zhang, Chad Wong, Kecheng Huang + 2 more

cs.CV, cs.AI

TLDR

BatMIL introduces a geometry-aware state space model with hybrid hyperbolic-Euclidean representations for improved whole-slide image analysis.

Key contributions

  • Employs a hybrid hyperbolic-Euclidean representation for hierarchical and local WSI features.
  • Utilizes a structured state space model (S4) to efficiently capture long-range patch dependencies.
  • Integrates a chunk-level Mixture-of-Experts (MoE) module for regional WSI heterogeneity.
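
The hybrid representation in the first contribution can be illustrated with a minimal sketch. It assumes the Poincaré ball model with curvature -c and the exponential map at the origin, a common choice for hyperbolic embeddings. The summary does not say which hyperbolic model BatMIL uses or how the two branches are fused, so `expmap0`, `poincare_distance_from_origin`, `hybrid_embedding`, and the concatenation step are all hypothetical names and choices.

```python
import math


def expmap0(v, c=1.0):
    """Map a Euclidean (tangent) vector onto the Poincare ball of curvature -c.

    Assumption: exponential map at the origin; not confirmed as BatMIL's choice.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)  # the origin maps to itself
    sqrt_c = math.sqrt(c)
    scale = math.tanh(sqrt_c * norm) / (sqrt_c * norm)
    return [scale * x for x in v]


def poincare_distance_from_origin(p, c=1.0):
    """Hyperbolic distance between the origin and a point p inside the ball."""
    norm = math.sqrt(sum(x * x for x in p))
    sqrt_c = math.sqrt(c)
    return 2.0 / sqrt_c * math.atanh(sqrt_c * norm)


def hybrid_embedding(feature, c=1.0):
    """Concatenate a hyperbolic view and the raw Euclidean view of one patch.

    Assumption: simple concatenation stands in for whatever fusion the paper uses.
    """
    return expmap0(feature, c) + list(feature)


patch_feature = [0.3, 0.4]  # toy 2-D patch embedding
print(hybrid_embedding(patch_feature))
print(poincare_distance_from_origin(expmap0(patch_feature)))
```

Under this map, a point's hyperbolic distance from the origin is twice the Euclidean norm of the input vector, while the mapped point always stays inside the unit ball. That growth of distance toward the boundary is what makes hyperbolic space a natural fit for tree-like, hierarchical structure.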

Why it matters

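This paper introduces BatMIL, a framework that targets a specific limitation of current WSI analysis: most MIL methods embed patches in a single Euclidean space, which overlooks the hierarchical organization and regional heterogeneity of pathological tissue. BatMIL models hierarchical tissue structure and local morphology in complementary geometric spaces and routes tissue regions to specialized experts. It outperforms state-of-the-art MIL approaches on seven WSI datasets spanning six cancer types, which suggests that geometry-aware representation learning is a promising direction for computational pathology.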

Original Abstract

Accurate analysis of histopathological images is critical for disease diagnosis and treatment planning. Whole-slide images (WSIs), which digitize tissue specimens at gigapixel resolution, are fundamental to this process but require aggregating thousands of patches for slide-level predictions. Multiple Instance Learning (MIL) tackles this challenge with a two-stage paradigm, decoupling tile-level embedding and slide-level prediction. However, most existing methods implicitly embed patch representations in homogeneous Euclidean spaces, overlooking the hierarchical organization and regional heterogeneity of pathological tissues. This limits current models' ability to capture global tissue architecture and fine-grained cellular morphology. To address this limitation, we introduce a hybrid hyperbolic-Euclidean representation that embeds WSI features in dual geometric spaces, enabling complementary modeling of hierarchical tissue structures and local morphological details. Building on this formulation, we develop BatMIL, a WSI classification framework that leverages both geometric spaces. To model long-range dependencies among thousands of patches, we employ a structured state space sequence model (S4) backbone that encodes patch sequences with linear computational complexity. Furthermore, to account for regional heterogeneity, we introduce a chunk-level mixture-of-experts (MoE) module that groups patches into regions and dynamically routes them to specialized subnetworks, improving representational capacity while reducing redundant computation. Extensive experiments on seven WSI datasets spanning six cancer types demonstrate that BatMIL consistently outperforms state-of-the-art MIL approaches in slide-level classification tasks. These results indicate that geometry-aware representation learning offers a promising direction for next-generation computational pathology.
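
Two mechanisms named in the abstract, linear-time sequence encoding and chunk-level expert routing, can be sketched in a few lines. This is a toy illustration under stated assumptions, not the BatMIL implementation. The scan is a plain scalar linear state space recurrence rather than a full S4 layer, and the router uses mean pooling with top-1 selection. The abstract specifies neither choice, and `diagonal_ssm_scan`, `route_chunks`, and `gate_weights` are hypothetical names.

```python
def diagonal_ssm_scan(xs, a, b, c):
    """Run h_t = a * h_{t-1} + b * x_t and y_t = c * h_t over a scalar sequence.

    One pass over the sequence, so cost grows linearly with its length.
    Assumption: scalar state and fixed parameters; real S4 layers are richer.
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys


def route_chunks(patches, chunk_size, gate_weights):
    """Group consecutive patches into chunks and pick one expert per chunk.

    Assumption: each chunk is summarized by its mean feature and sent to the
    expert with the highest linear gate score (top-1 routing).
    """
    assignments = []
    for start in range(0, len(patches), chunk_size):
        chunk = patches[start:start + chunk_size]
        dim = len(chunk[0])
        mean = [sum(p[d] for p in chunk) / len(chunk) for d in range(dim)]
        scores = [sum(w * m for w, m in zip(row, mean)) for row in gate_weights]
        assignments.append(max(range(len(scores)), key=scores.__getitem__))
    return assignments


# Toy data: four 2-D patch features and two experts.
patches = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
gate_weights = [[1.0, 0.0], [0.0, 1.0]]  # one gate row per expert

print(diagonal_ssm_scan([1.0, 0.0, 0.0], a=0.5, b=1.0, c=1.0))
print(route_chunks(patches, chunk_size=2, gate_weights=gate_weights))
```

Routing whole chunks instead of individual patches means the gate runs once per region, which is one plausible reading of how the module reduces redundant computation.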
