A Multimodal Pre-trained Network for Integrated EEG-Video Seizure Detection

arXiv cs.CV / 4/30/2026

Key Points

The paper introduces EEGVFusion, a multimodal seizure-detection framework that integrates synchronized EEG and video to address weaknesses of single-modality approaches.
EEGVFusion combines self-supervised EEG representation learning, spatio-temporal video encoding, optimal-transport (OT) alignment, and bidirectional cross-attention to fuse neural and behavioral evidence.
The authors curated an expert-annotated EEG-video dataset covering 93 sessions from 15 mice to train and evaluate the model.
On a random-session split, EEGVFusion reports very high Balanced Accuracy (0.9957) with perfect event sensitivity and a low event false-alarm rate (0.6250 FP/h).
In a held-out-subject test (Subject 110), it maintains strong performance (Balanced Accuracy 0.9718) and substantially lowers Event FAR versus an EEG-only baseline (2.7250 to 0.4833 FP/h), with ablations indicating EEG pre-training and OT alignment reduce false alarms without harming sensitivity.
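The metrics quoted above can be illustrated with a minimal sketch. The summary does not say how the authors compute them, so the definitions below are assumptions: Balanced Accuracy is taken as the mean of window-level sensitivity and specificity, a predicted event counts as a true detection if it overlaps any annotated seizure interval, and Event FAR is the number of non-overlapping predicted events per recorded hour. The function names and the toy data are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def balanced_accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Mean of sensitivity and specificity over binary window labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return 0.5 * (sensitivity + specificity)


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share any time."""
    return a[0] < b[1] and b[0] < a[1]


def event_metrics(
    true_events: List[Interval],
    pred_events: List[Interval],
    duration_h: float,
) -> Tuple[float, float]:
    """Return (event sensitivity, event false alarms per hour).

    Assumed matching rule: any overlap between a predicted and an
    annotated event counts as a detection.
    """
    if not true_events or duration_h <= 0:
        raise ValueError("need at least one true event and a positive duration")
    detected = sum(
        1 for t in true_events if any(overlaps(t, p) for p in pred_events)
    )
    false_alarms = sum(
        1 for p in pred_events if not any(overlaps(p, t) for t in true_events)
    )
    return detected / len(true_events), false_alarms / duration_h


# Toy example (made-up data, not from the paper).
window_true = [1, 1, 0, 0, 0, 0]
window_pred = [1, 0, 0, 0, 0, 1]
seizures = [(10.0, 20.0), (100.0, 130.0)]
detections = [(12.0, 18.0), (105.0, 110.0), (500.0, 505.0)]

toy_bacc = balanced_accuracy(window_true, window_pred)
toy_sens, toy_far = event_metrics(seizures, detections, duration_h=2.0)
```

In this toy case the window-level Balanced Accuracy is 0.625, both annotated seizures are detected (event sensitivity 1.0), and the single unmatched detection over two hours gives an Event FAR of 0.5 FP/h. Reporting event-level numbers alongside window-level ones matters because a model can score well per window while still producing many short spurious alarms.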

Abstract

Reliable seizure detection in mouse models is essential for preclinical epilepsy research, yet manual review of synchronized video-EEG recordings is labor-intensive, and single-modality systems fail for complementary reasons: video-based methods are easily confounded by benign behaviors, whereas EEG-based methods are vulnerable to ictal motion artifacts. We present EEGVFusion, a multimodal framework that combines self-supervised EEG representation learning, spatio-temporal video encoding, optimal-transport (OT) alignment, and bidirectional cross-attention to integrate neural and behavioral evidence. We also curate an expert-annotated dataset of synchronized EEG and video recordings comprising 93 sessions from 15 mice for training and evaluation. In the random-session split, EEGVFusion achieved a Balanced Accuracy of 0.9957 with perfect event sensitivity and an Event false-alarm rate (FAR) of 0.6250 FP/h, indicating strong seizure detection performance with a low false-alarm burden. In a single held-out-subject evaluation with Subject 110 reserved for testing, EEGVFusion achieved a Balanced Accuracy of 0.9718 and reduced Event FAR from 2.7250 FP/h for the EEG-only counterpart to 0.4833 FP/h while preserving perfect event sensitivity. Targeted ablations further showed that EEG pre-training and OT alignment each help reduce false alarms without lowering event sensitivity.
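To make the optimal-transport alignment step more concrete, here is a minimal sketch of entropic OT (Sinkhorn iterations) between two sets of token embeddings. It is not the authors' implementation: the abstract does not specify the cost function, marginals, or solver, so the cosine-distance cost, uniform marginals, regularization strength `eps`, token counts, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def cosine_cost(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Pairwise cosine distance between rows of x (n, d) and y (m, d)."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    y = y / np.linalg.norm(y, axis=1, keepdims=True)
    return 1.0 - x @ y.T


def sinkhorn_plan(cost: np.ndarray, eps: float = 0.1, n_iters: int = 200) -> np.ndarray:
    """Entropic-OT transport plan with uniform marginals (assumed)."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)  # mass on each EEG token
    b = np.full(m, 1.0 / m)  # mass on each video token
    kernel = np.exp(-cost / eps)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)
    return u[:, None] * kernel * v[None, :]


# Hypothetical embeddings: 6 EEG tokens and 4 video tokens, 16 dimensions each.
rng = np.random.default_rng(0)
eeg_tokens = rng.normal(size=(6, 16))
video_tokens = rng.normal(size=(4, 16))

cost = cosine_cost(eeg_tokens, video_tokens)
plan = sinkhorn_plan(cost)

# Transport cost under the plan; a quantity like this could serve as an
# alignment loss that pulls matched EEG and video tokens together.
ot_loss = float((plan * cost).sum())
```

Each entry `plan[i, j]` says how much of EEG token `i` is softly matched to video token `j`. Unlike a hard one-to-one pairing, the plan tolerates different token counts and imperfect temporal correspondence between the two streams, which is one plausible reason to prefer OT over fixed index-wise alignment.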