AI Generalisation Gap In Comorbid Sleep Disorder Staging

arXiv cs.LG / 2026/3/26


Key Points

  • The study finds that deep learning models for single-channel EEG sleep staging generalize poorly from healthy subjects to clinical populations with disrupted sleep, and it uses Grad-CAM to interpret the failures.
  • It introduces iSLEEPS, a new, clinically annotated ischemic stroke dataset (intended for public release), and evaluates an SE-ResNet plus bidirectional LSTM pipeline for automatic staging.
  • Attention visualizations, supported by clinical expert feedback, indicate the model often relies on EEG regions that are physiologically uninformative in patient data.
  • Statistical and computational analyses show significant differences in sleep architecture between healthy and ischemic stroke cohorts, implying the need for subject-aware or disease-specific approaches.
  • The authors argue that clinical validation is necessary before deploying EEG sleep staging models in real-world, comorbid settings such as stroke.
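
To make the Grad-CAM point concrete, here is a minimal sketch of how a 1D class activation map can be computed over an EEG epoch. It is not the authors' implementation: the function name `grad_cam_1d` and the toy numbers are hypothetical, and the activations and gradients are assumed to have already been extracted from the last convolutional layer of some staging model.

```python
from typing import List


def grad_cam_1d(activations: List[List[float]],
                gradients: List[List[float]]) -> List[float]:
    """Grad-CAM over time for one EEG epoch.

    activations[k][t]: feature map k at time step t (assumed given).
    gradients[k][t]:   d(class score) / d(activations[k][t]) (assumed given).
    Returns one relevance value per time step, scaled to [0, 1].
    """
    n_steps = len(activations[0])
    # Channel weights: average each channel's gradient over time.
    weights = [sum(g) / len(g) for g in gradients]
    # Weighted sum of feature maps, followed by ReLU to keep positive evidence.
    cam = [
        max(0.0, sum(w * a[t] for w, a in zip(weights, activations)))
        for t in range(n_steps)
    ]
    peak = max(cam)
    # Normalize so the most relevant time step has value 1 (skip if all zero).
    return [c / peak for c in cam] if peak > 0 else cam


# Toy values for illustration only, not data from the paper.
toy_activations = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]]
toy_gradients = [[1.0, 1.0, 1.0], [-1.0, -1.0, -1.0]]
toy_cam = grad_cam_1d(toy_activations, toy_gradients)
```

A clinician can then overlay the resulting map on the raw EEG trace and check whether the highlighted segments coincide with stage-defining events or with regions that carry no physiological meaning.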

Abstract

Accurate sleep staging is essential for diagnosing obstructive sleep apnea (OSA) and hypopnea in stroke patients. Although polysomnography (PSG) is reliable, it is costly, labor-intensive, and manually scored. While deep learning enables automated EEG-based sleep staging in healthy subjects, our analysis shows poor generalization to clinical populations with disrupted sleep. Using Grad-CAM interpretations, we systematically demonstrate this limitation. We introduce iSLEEPS, a newly clinically annotated ischemic stroke dataset (to be publicly released), and evaluate an SE-ResNet plus bidirectional LSTM model for single-channel EEG sleep staging. As expected, cross-domain performance between healthy and diseased subjects is poor. Attention visualizations, supported by clinical expert feedback, show the model focuses on physiologically uninformative EEG regions in patient data. Statistical and computational analyses further confirm significant sleep architecture differences between healthy and ischemic stroke cohorts, highlighting the need for subject-aware or disease-specific models with clinical validation before deployment. A summary of the paper and the code is available at https://himalayansaswatabose.github.io/iSLEEPS_Explainability.github.io/
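
One way to quantify a cross-domain gap of this kind is to compare an agreement score on held-out healthy recordings with the same score on patient recordings. The sketch below uses Cohen's kappa, which is common in sleep staging, but the abstract does not say which metrics the authors report, so the metric choice, the function names `cohen_kappa` and `generalization_gap`, and the toy labels are all assumptions made for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Chance-corrected agreement between scored and predicted sleep stages."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    # Observed agreement: fraction of epochs where the labels match.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Expected agreement if both label sequences were independent.
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when only one class occurs
    return (observed - expected) / (1.0 - expected)


def generalization_gap(kappa_in_domain: float, kappa_cross_domain: float) -> float:
    """Positive values mean the model agrees less with scorers on the new cohort."""
    return kappa_in_domain - kappa_cross_domain


# Toy labels for illustration only, not results from the paper.
healthy_true = ["W", "N1", "N2", "N3", "REM", "N2"]
healthy_pred = ["W", "N1", "N2", "N3", "REM", "N2"]
patient_true = ["W", "W", "N2", "N2"]
patient_pred = ["W", "N2", "W", "N2"]

gap = generalization_gap(
    cohen_kappa(healthy_true, healthy_pred),
    cohen_kappa(patient_true, patient_pred),
)
```

Kappa is used here instead of raw accuracy because stage distributions differ between cohorts, and a chance-corrected score is less sensitive to one dominant stage inflating the result.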