CognitionCapturerPro: Towards High-Fidelity Visual Decoding from EEG/MEG via Multi-modal Information and Asymmetric Alignment

arXiv cs.AI / 3/16/2026

Key Points

  • CognitionCapturerPro proposes an enhanced framework that fuses EEG with multi-modal priors (images, text, depth, and edges) via collaborative training.
  • It introduces an uncertainty-weighted similarity scoring mechanism to quantify modality-specific fidelity and a fusion encoder for integrating shared representations.
  • The approach employs a simplified alignment module and a pre-trained diffusion model to boost visual reconstruction from EEG.
  • On the THINGS-EEG dataset, it outperforms the original CognitionCapturer, with Top-1 and Top-5 retrieval gains of 25.9% and 10.6%, respectively.
  • The authors provide code at the linked GitHub repository for reproducibility.
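The paper's summary does not detail the uncertainty-weighted similarity scoring, but the idea of weighting each modality's similarity by a learned confidence can be sketched as follows. All names here (`uncertainty_weighted_score`, the log-variance weighting in the style of learned task uncertainty) are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn.functional as F

def uncertainty_weighted_score(eeg_emb, modality_embs, log_vars):
    """Hypothetical sketch: combine per-modality cosine similarities,
    weighting each by a learned precision (exp of negative log-variance),
    with the log-variance acting as a regularizing penalty.

    eeg_emb:       (B, D) EEG embedding batch
    modality_embs: dict of name -> (B, D) prior embeddings
                   (e.g. image, text, depth, edge)
    log_vars:      dict of name -> scalar learned log-variance per modality
    """
    total = 0.0
    for name, emb in modality_embs.items():
        sim = F.cosine_similarity(eeg_emb, emb, dim=-1)   # (B,) per-sample similarity
        precision = torch.exp(-log_vars[name])            # higher precision = more trusted modality
        total = total + precision * sim - log_vars[name]  # uncertainty-weighted contribution
    return total
```

A modality whose learned log-variance is high contributes less to the fused score, which is one plausible reading of "quantify modality-specific fidelity."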

Abstract

Visual stimuli reconstruction from EEG remains challenging due to fidelity loss and representation shift. We propose CognitionCapturerPro, an enhanced framework that integrates EEG with multi-modal priors (images, text, depth, and edges) via collaborative training. Our core contributions include an uncertainty-weighted similarity scoring mechanism to quantify modality-specific fidelity and a fusion encoder for integrating shared representations. By employing a simplified alignment module and a pre-trained diffusion model, our method significantly outperforms the original CognitionCapturer on the THINGS-EEG dataset, improving Top-1 and Top-5 retrieval accuracy by 25.9% and 10.6%, respectively. Code is available at: https://github.com/XiaoZhangYES/CognitionCapturerPro.
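The reported Top-1 and Top-5 gains refer to retrieval accuracy: for each EEG embedding, candidate images are ranked by similarity, and a trial counts as a hit if the true image lands in the top k. A minimal sketch of this standard metric (not the paper's evaluation code; function and variable names are assumptions):

```python
import numpy as np

def top_k_accuracy(eeg_embs, image_embs, k):
    """Top-k retrieval accuracy: for each EEG embedding, rank all candidate
    image embeddings by cosine similarity and check whether the matching
    image (same row index) appears among the k most similar."""
    # Row-normalize so the dot product equals cosine similarity
    e = eeg_embs / np.linalg.norm(eeg_embs, axis=1, keepdims=True)
    v = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    sims = e @ v.T                                  # (N, N) similarity matrix
    topk = np.argsort(-sims, axis=1)[:, :k]         # indices of k best candidates per query
    hits = (topk == np.arange(len(sims))[:, None]).any(axis=1)
    return hits.mean()
```

Under this metric, "improving Top-1 by 25.9%" means the correct image is ranked first for a substantially larger fraction of EEG trials than with the original CognitionCapturer.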