[Online] Vision Transformer vs. ResNet-101: An Explainable Deep Learning Approach for Breast Cancer Detection in Ultrasound Images

ID：82 Submission ID：318 View Protection：ATTENDEE Updated Time：2025-12-21 13:01:15 Hits：476 Online

Start Time：2025-12-30 15:00 （Asia/Amman）

Duration：15min

Session：[S7] Track 7: Pattern Recognition, Computer Vision and Image Processing » [S7-2] Track 7: Pattern Recognition, Computer Vision and Image Processing

Abstract

Breast cancer remains a significant global health concern, and early, accurate diagnosis is essential for improving patient survival rates. This paper presents a comparative analysis of two deep learning architectures, the Convolutional Neural Network (CNN)-based ResNet-101 and the Vision Transformer (ViT), for classifying breast ultrasound images into benign, malignant, and normal categories. To address the common challenge of limited data, we applied a data augmentation strategy that expands a benchmark dataset of 780 images to over 10,000 images for training. Both models were trained on this augmented dataset and achieved test accuracies of 98.64% for the ViT and 97.57% for ResNet-101, so the ViT outperformed ResNet-101 by 1.07 percentage points. Because deep learning models typically operate as black boxes, we use Gradient-weighted Class Activation Mapping (Grad-CAM), an Explainable AI (XAI) technique, to improve model transparency and build clinical trust: it generates visual heatmaps that highlight the regions of each ultrasound image that most influenced the models’ diagnostic decisions. The proposed approach uses GPU-based parallel infrastructure.
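The Grad-CAM step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the activations of one layer and the gradients of the class score with respect to them have already been extracted from a trained model, and the function name `grad_cam` and the toy numbers are hypothetical. In practice a deep learning framework would supply these arrays, and for a ViT the patch tokens would first need to be reshaped into a spatial grid.

```python
from typing import List

Matrix = List[List[float]]


def grad_cam(activations: List[Matrix], gradients: List[Matrix]) -> Matrix:
    """Grad-CAM heatmap from one layer's activations and class-score gradients.

    activations[k][i][j]: value of feature map k at spatial position (i, j).
    gradients[k][i][j]: d(class score) / d(activations[k][i][j]).
    """
    height, width = len(activations[0]), len(activations[0][0])

    # Channel weights: global-average-pool the gradients of each feature map.
    weights = [
        sum(sum(row) for row in grad) / (height * width) for grad in gradients
    ]

    # Weighted sum of the feature maps, followed by ReLU.
    cam = [
        [
            max(0.0, sum(w * fmap[i][j] for w, fmap in zip(weights, activations)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Scale to [0, 1] for display; an all-zero map is left unchanged.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


# Toy input (made-up numbers, not from the paper): two 2x2 feature maps.
toy_activations = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
toy_gradients = [
    [[0.5, 0.5], [0.5, 0.5]],      # averages to +0.5
    [[-1.0, -1.0], [-1.0, -1.0]],  # averages to -1.0
]
heatmap = grad_cam(toy_activations, toy_gradients)
```

In this toy case the second feature map has a negative weight, so the positions where it dominates are zeroed by the ReLU and only regions that support the class remain in the heatmap. Upsampling the result to the image size and overlaying it on the ultrasound image would give the kind of visualization the abstract describes.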

Keywords

Breast Cancer, Deep Learning, ResNet-101, Vision Transformer, Explainable AI, Grad-CAM

Speaker

Lipismita Panigrahi

Assistant Professor SRM University-Amaravati

Submission Author

Manogna Kanukurthi SRM UNIVERSITY, Amaravati

Mahitha Vedampudi SRM UNIVERSITY, Amaravati

Sathwika Nibbaragandla SRM UNIVERSITY, Amaravati

Jahnavi Kamana SRM UNIVERSITY, Amaravati

Lipismita Panigrahi SRM University-Amaravati
