ArXiv TLDR

DSVTLA: Deep Swin Vision Transformer-Based Transfer Learning Architecture for Multi-Type Cancer Histopathological Cancer Image Classification

2604.09468

Muazzem Hussain Khan, Tasdid Hasnain, Md. Jamil Khan, Ruhul Amin, Md. Shamim Reza + 2 more

eess.IV, cs.CV

TLDR

DSVTLA, a transfer-learning architecture combining a Swin Vision Transformer with ResNet50, achieves state-of-the-art accuracy on multi-cancer histopathological image classification.

Key contributions

  • Proposes DSVTLA, integrating Swin Transformer with ResNet50 for robust multi-cancer image classification.
  • Validated on a comprehensive dataset covering breast, oral, lung, colon, kidney, and leukemia cancers.
  • Achieved high accuracy: 100% test accuracy on the lung-colon and segmented leukemia datasets, and 99.23% on breast cancer.
  • Outperformed several state-of-the-art CNN and Vision Transformer models in benchmarks.

Why it matters

This paper introduces a highly accurate and robust AI system for multi-cancer diagnosis from histopathological images. Its superior performance across diverse cancer types and imaging conditions offers a strong benchmark. This could significantly aid clinical decision-making and accelerate AI-assisted pathology.

Original Abstract

In this study, we propose a deep Swin Vision Transformer-based transfer learning architecture for robust multi-cancer histopathological image classification. The proposed framework integrates a hierarchical Swin Transformer with ResNet50-based convolutional feature extraction, enabling the model to capture both long-range contextual dependencies and fine-grained local morphological patterns within histopathological images. To validate the efficiency of the proposed architecture, extensive experiments were conducted on a comprehensive multi-cancer dataset covering Breast Cancer, Oral Cancer, Lung and Colon Cancer, Kidney Cancer, and Acute Lymphocytic Leukemia (ALL); both original and segmented images were analyzed to assess model robustness across heterogeneous clinical imaging conditions. Our approach is benchmarked against several state-of-the-art CNN and transformer models, including DenseNet121, DenseNet201, InceptionV3, ResNet50, EfficientNetB3, multiple ViT variants, and Swin Transformer models. All models were trained and validated with a unified pipeline incorporating balanced data preprocessing, transfer learning, and fine-tuning strategies. The experimental results demonstrate that the proposed architecture consistently achieves superior performance, reaching 100% test accuracy on the lung-colon cancer and segmented leukemia datasets, and up to 99.23% accuracy for breast cancer classification. The model also achieves near-perfect precision, recall, and F1 score, indicating highly stable results across diverse cancer types. Overall, the proposed model establishes a highly accurate, interpretable, and robust multi-cancer classification system, setting a strong benchmark for future research and providing a unified comparative assessment useful for designing reliable AI-assisted histopathological diagnosis and clinical decision-making.
