Learning ECG Image Representations via Dual Physiological-Aware Alignments

arXiv cs.LG / 4/3/2026


Key Points

  • The paper introduces ECG-Scan, a self-supervised framework that learns clinically generalized ECG representations from ECG images when raw signal recordings are unavailable.
  • ECG-Scan relies on dual physiological-aware alignments, the first of which is multimodal contrastive alignment between ECG image representations and gold-standard signal-text representations.
  • It incorporates domain knowledge via “soft-lead constraints” to regularize reconstruction and improve consistency across ECG leads.
  • Benchmarking across multiple datasets and downstream tasks shows the image-based model outperforms existing image baselines and reduces the performance gap relative to signal-based analysis.
  • The authors position the approach as a way to leverage large-scale legacy ECG image data to broaden access to automated cardiovascular diagnostics, especially in resource-constrained settings.
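
The contrastive alignment named in the key points can be illustrated with a minimal sketch. The summary does not give the paper's actual loss, so this assumes a symmetric InfoNCE objective of the kind commonly used for image-text alignment. The function name `symmetric_info_nce`, the temperature of 0.07, and the use of plain Python lists as embeddings are all illustrative assumptions, not details from ECG-Scan.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    peak = max(row)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in row))
    return [x - log_sum for x in row]


def symmetric_info_nce(image_emb, target_emb, temperature=0.07):
    """Assumed symmetric InfoNCE loss (hypothetical stand-in for the paper's objective).

    image_emb[k] and target_emb[k] form a positive pair (an ECG image and its
    signal-text representation); every other pairing in the batch is a negative.
    """
    n = len(image_emb)
    img = [l2_normalize(v) for v in image_emb]
    tgt = [l2_normalize(v) for v in target_emb]
    # logits[r][c]: scaled cosine similarity between image r and target c.
    logits = [
        [sum(a * b for a, b in zip(i, t)) / temperature for t in tgt]
        for i in img
    ]
    # Image-to-target direction: each row should peak on its diagonal entry.
    loss_i2t = -sum(log_softmax(logits[k])[k] for k in range(n)) / n
    # Target-to-image direction: same check over the columns.
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]
    loss_t2i = -sum(log_softmax(columns[k])[k] for k in range(n)) / n
    return 0.5 * (loss_i2t + loss_t2i)
```

Under these assumptions, a batch whose image embeddings point in the same directions as their paired signal-text embeddings yields a loss near zero, while a batch with swapped pairs yields a large loss.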

Abstract

Electrocardiograms (ECGs) are among the most widely used diagnostic tools for cardiovascular diseases, and a large amount of ECG data worldwide appears only in image form. However, most existing automated ECG analysis methods rely on access to raw signal recordings, limiting their applicability in real-world and resource-constrained settings. In this paper, we present ECG-Scan, a self-supervised framework for learning clinically generalized representations from ECG images through dual physiological-aware alignments. First, our approach optimizes image representation learning using multimodal contrastive alignment between the image modality and the gold-standard signal-text modalities. Second, we integrate domain knowledge via soft-lead constraints, which regularize the reconstruction process and improve consistency across signal leads. Extensive benchmarking across multiple datasets and downstream tasks demonstrates that our image-based model achieves superior performance compared to existing image baselines and notably narrows the gap between ECG image and signal analysis. These results highlight the potential of self-supervised image modeling to unlock large-scale legacy ECG data and broaden access to automated cardiovascular diagnostics.
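
The abstract does not define the soft-lead constraints, so the following is only one plausible reading, sketched for illustration. It assumes the constraint is a soft penalty that pushes reconstructed limb leads toward Einthoven's law (lead II = lead I + lead III), added to an ordinary reconstruction error. The function names, the dictionary layout keyed by lead name, and the weight of 0.1 are hypothetical and not taken from the paper.

```python
def mean_squared_error(pred, target):
    """Average squared difference between two equal-length sample sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def einthoven_penalty(leads):
    """Mean squared violation of Einthoven's law, II = I + III, per sample.

    `leads` maps lead names ("I", "II", "III") to lists of voltage samples.
    A physiologically consistent reconstruction gives a penalty of zero.
    """
    residuals = [
        ii - (i + iii)
        for i, ii, iii in zip(leads["I"], leads["II"], leads["III"])
    ]
    return sum(r * r for r in residuals) / len(residuals)


def reconstruction_loss_with_soft_lead(pred, target, weight=0.1):
    """Reconstruction MSE plus a weighted inter-lead consistency penalty.

    The penalty is "soft": it is added to the loss with an assumed weight
    rather than enforced as a hard equality on the reconstructed leads.
    """
    mse = sum(
        mean_squared_error(pred[name], target[name]) for name in target
    ) / len(target)
    return mse + weight * einthoven_penalty(pred)
```

In this sketch, a reconstruction that matches its target but breaks the relationship between leads is still penalized, which is one way a regularizer could improve consistency across leads.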