Comparative Analysis of Deep Learning Architectures for Multi-Disease Classification of Single-Label Chest X-rays

arXiv cs.CV / March 17, 2026


Key Points

  • The study systematically compared seven architectures (ConvNeXt-Tiny, DenseNet121/201, ResNet50, ViT-B/16, EfficientNetV2-M, MobileNetV2) for multi-class chest X-ray classification across five disease categories on a balanced dataset of 18,080 images.
  • All models surpassed 90% test accuracy, with ConvNeXt-Tiny achieving the best overall performance (92.31% accuracy, 95.70% AUROC).
  • MobileNetV2 offered the best parameter efficiency (3.5M parameters) with 90.42% accuracy and 94.10% AUROC, and trained in 48 minutes.
  • Tuberculosis and COVID-19 detection achieved near-perfect AUROC (>= 99.97%) across all architectures, while the Normal, Cardiomegaly, and Pneumonia classes remained more challenging due to overlapping radiographic features.
  • Grad-CAM visualizations indicated clinically consistent attention patterns, supporting model interpretability for AI-assisted diagnosis in diverse healthcare settings.
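
The paper does not publish its visualization code, but the Grad-CAM computation it relies on is standard: channel weights come from global-average-pooled gradients, and the heatmap is a ReLU of the weighted sum of activation maps. A minimal NumPy sketch (array shapes and names are illustrative, not from the paper):

```python
import numpy as np

def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Compute a Grad-CAM heatmap from a conv layer's activations and the
    gradients of the target class score w.r.t. those activations.

    activations, gradients: arrays of shape (C, H, W).
    Returns an (H, W) heatmap normalized to [0, 1].
    """
    # Channel weights: global-average-pool the gradients over each map.
    weights = gradients.mean(axis=(1, 2))                       # shape (C,)
    # Weighted sum of activation maps, then ReLU to keep positive evidence.
    cam = np.maximum((weights[:, None, None] * activations).sum(axis=0), 0.0)
    # Normalize for visualization; guard against an all-zero map.
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam

# Toy example: 4 channels over an 8x8 feature map.
rng = np.random.default_rng(0)
acts = rng.random((4, 8, 8))
grads = rng.standard_normal((4, 8, 8))
heatmap = grad_cam(acts, grads)
print(heatmap.shape)  # (8, 8)
```

In practice the heatmap is upsampled to the input resolution and overlaid on the X-ray; "clinically consistent" here means the bright regions coincide with the anatomy relevant to each diagnosis.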

Abstract

Chest X-ray imaging remains the primary diagnostic tool for pulmonary and cardiac disorders worldwide, yet its accuracy is hampered by radiologist shortages and inter-observer variability. This study presents a systematic comparative evaluation of seven deep learning architectures for multi-class chest disease classification: ConvNeXt-Tiny, DenseNet121, DenseNet201, ResNet50, ViT-B/16, EfficientNetV2-M, and MobileNetV2. A balanced dataset of 18,080 chest X-ray images spanning five disease categories (Cardiomegaly, COVID-19, Normal, Pneumonia, and Tuberculosis) was constructed from three public repositories and partitioned at the patient level to prevent data leakage. All models were trained under identical conditions using ImageNet-pretrained weights, standardized preprocessing, and consistent hyperparameters. All seven architectures exceeded 90% test accuracy. ConvNeXt-Tiny achieved the highest performance (92.31% accuracy, 95.70% AUROC), while MobileNetV2 emerged as the most parameter-efficient model (3.5M parameters, 90.42% accuracy, 94.10% AUROC), completing training in 48 minutes. Tuberculosis and COVID-19 classification was near-perfect (AUROC >= 99.97%) across all architectures, while Normal, Cardiomegaly, and Pneumonia presented greater challenges due to overlapping radiographic features. Grad-CAM visualizations confirmed clinically consistent attention patterns across disease categories. These findings demonstrate that high-accuracy multi-disease chest X-ray classification is achievable without excessive computational resources, with important implications for AI-assisted diagnosis in both resource-rich and resource-constrained healthcare settings.
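
Patient-level partitioning, as used in the study, means every image from a given patient lands in exactly one split, so the model cannot score well by memorizing patient-specific features. The paper's splitting code is not provided; a minimal stdlib sketch under that assumption (record fields and ratios are illustrative) might look like:

```python
import random
from collections import defaultdict

def patient_level_split(records, train_frac=0.8, val_frac=0.1, seed=42):
    """Partition (patient_id, image_path) records so that all images from
    one patient fall into exactly one split, preventing data leakage."""
    by_patient = defaultdict(list)
    for pid, path in records:
        by_patient[pid].append(path)

    patients = sorted(by_patient)           # deterministic order before shuffling
    random.Random(seed).shuffle(patients)

    n = len(patients)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    split_pids = {
        "train": patients[:n_train],
        "val": patients[n_train:n_train + n_val],
        "test": patients[n_train + n_val:],
    }
    # Expand each split's patient list back into image paths.
    return {name: [p for pid in pids for p in by_patient[pid]]
            for name, pids in split_pids.items()}

# Toy usage: 10 patients with 2 images each.
recs = [(f"p{i}", f"p{i}_img{j}.png") for i in range(10) for j in range(2)]
splits = patient_level_split(recs)
print({k: len(v) for k, v in splits.items()})  # {'train': 16, 'val': 2, 'test': 2}
```

Splitting at the patient level rather than the image level is what makes the reported test accuracies meaningful: a naive image-level shuffle would place near-duplicate views of the same patient in both train and test sets.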