ECG-Lens: Benchmarking ML & DL Models on PTB-XL Dataset

arXiv cs.LG / 4/20/2026

Key Points

  • The paper presents a benchmark study comparing three traditional ML models (Decision Tree, Random Forest, Logistic Regression) with three deep learning architectures (a simple CNN, an LSTM, and a more complex CNN named ECG-Lens) for ECG signal classification on PTB-XL.
  • It trains DL models directly on raw 12-lead ECG signals from PTB-XL, aiming to let networks automatically learn discriminative features relevant to different cardiac conditions.
  • Stationary Wavelet Transform (SWT) data augmentation is used to improve performance by enriching training diversity while retaining key ECG characteristics.
  • Across multiple evaluation metrics (accuracy, precision, recall, F1-score, ROC-AUC), ECG-Lens achieves the best results, reaching about 80% accuracy and 90% ROC-AUC.
  • The authors conclude that complex CNN-based deep learning models can substantially outperform traditional ML approaches on raw 12-lead ECG data, and they present the benchmark as guidance for selecting automated ECG classifiers and planning condition-specific model development.
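To make the reported metrics concrete, here is a minimal sketch of how accuracy, precision, recall, F1-score, and ROC-AUC can be computed for a single binary label (for example, "normal" versus "not normal"). The function name `binary_metrics`, the 0.5 decision threshold, and the toy labels and scores are illustrative assumptions; the summary does not say how the authors aggregate metrics across PTB-XL classes, so this is not their evaluation code.

```python
def binary_metrics(y_true, scores, threshold=0.5):
    """Compute common classification metrics for binary labels (0/1).

    y_true: list of ground-truth labels.
    scores: list of predicted probabilities for the positive class.
    threshold: assumed cut-off used to turn scores into hard predictions.
    """
    preds = [1 if s >= threshold else 0 for s in scores]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # ROC-AUC as the probability that a random positive is scored above
    # a random negative (ties count as half). Quadratic in the number of
    # samples, which is fine for a small illustration.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if pos and neg:
        wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
                   for p in pos for n in neg)
        roc_auc = wins / (len(pos) * len(neg))
    else:
        roc_auc = float("nan")  # undefined when only one class is present

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "roc_auc": roc_auc}


# Toy data (made up for illustration, not from PTB-XL).
labels = [1, 0, 1, 1, 0, 0, 1, 0]
probs = [0.9, 0.2, 0.65, 0.4, 0.55, 0.1, 0.8, 0.3]
print(binary_metrics(labels, probs))
```

On this toy data, accuracy, precision, recall, and F1 all come to 0.75 while ROC-AUC is 0.9375. The gap arises because ROC-AUC scores the ranking of predictions rather than their thresholded values, which is one way an accuracy figure and a noticeably higher ROC-AUC can coexist.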

Abstract

Automated classification of electrocardiogram (ECG) signals is a useful tool for diagnosing and monitoring cardiovascular diseases. This study compares three traditional machine learning algorithms (Decision Tree Classifier, Random Forest Classifier, and Logistic Regression) and three deep learning models (Simple Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Complex CNN (ECG-Lens)) for the classification of ECG signals from the PTB-XL dataset, which contains 12-lead recordings from normal patients and patients with various cardiac conditions. The DL models were trained on raw ECG signals, allowing them to automatically extract discriminative features. Data augmentation using the Stationary Wavelet Transform (SWT) was applied to enhance model performance, increase the diversity of training samples, and preserve the essential characteristics of the ECG signals. The models were evaluated using multiple metrics, including accuracy, precision, recall, F1-score, and ROC-AUC. The ECG-Lens model achieved the highest performance, with 80% classification accuracy and a 90% ROC-AUC. These findings demonstrate that deep learning architectures, particularly complex CNNs, substantially outperform traditional ML methods on raw 12-lead ECG data, and provide a practical benchmark for selecting automated ECG classification models and identifying directions for condition-specific model development.
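The abstract does not describe how the SWT is used for augmentation, so the following is only one plausible sketch. It decomposes each lead with a single-level Haar stationary (undecimated) wavelet transform, rescales the detail coefficients by a random factor, and reconstructs the signal. The Haar wavelet, the single level, the periodic boundary handling, the scaling range, and all function names (`haar_swt_level1`, `haar_iswt_level1`, `swt_augment`, `augment_record`) are assumptions made for illustration, not details from the paper.

```python
import math
import random

SQRT2 = math.sqrt(2.0)


def haar_swt_level1(x):
    """Single-level stationary Haar transform with periodic boundaries.

    Unlike the decimated DWT, no downsampling is done, so both outputs
    have the same length as the input.
    """
    n = len(x)
    approx = [(x[i] + x[(i + 1) % n]) / SQRT2 for i in range(n)]
    detail = [(x[i] - x[(i + 1) % n]) / SQRT2 for i in range(n)]
    return approx, detail


def haar_iswt_level1(approx, detail):
    """Invert haar_swt_level1 by averaging the two redundant estimates."""
    n = len(approx)
    out = []
    for i in range(n):
        # Estimate of x[i] from the pair (i, i+1).
        from_right = (approx[i] + detail[i]) / SQRT2
        # Estimate of x[i] from the pair (i-1, i); index -1 wraps around.
        from_left = (approx[i - 1] - detail[i - 1]) / SQRT2
        out.append(0.5 * (from_right + from_left))
    return out


def swt_augment(lead, detail_scale_range=(0.8, 1.2), rng=None):
    """Return a perturbed copy of one ECG lead (assumed augmentation rule).

    The approximation (slow morphology) is kept as-is and only the
    detail (fast variation) coefficients are rescaled.
    """
    if rng is None:
        rng = random.Random()
    approx, detail = haar_swt_level1(lead)
    scale = rng.uniform(*detail_scale_range)
    return haar_iswt_level1(approx, [scale * d for d in detail])


def augment_record(record, **kwargs):
    """Apply swt_augment independently to each lead of a multi-lead record."""
    return [swt_augment(lead, **kwargs) for lead in record]


# Synthetic stand-in for a 12-lead recording (not real ECG data).
lead = [math.sin(2 * math.pi * i / 50) for i in range(100)]
record = [lead] * 12
augmented = augment_record(record, rng=random.Random(0))
print(len(augmented), len(augmented[0]))
```

Because the transform is undecimated, the augmented signal keeps the original length and sample alignment, so it can be fed to the same network input without resampling. In practice a library such as PyWavelets (`pywt.swt` and `pywt.iswt`) would likely be used with a smoother wavelet and several levels; the hand-written Haar version here only keeps the example dependency-free.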