EEG2Vision: A Multimodal EEG-Based Framework for 2D Visual Reconstruction in Cognitive Neuroscience

arXiv cs.CV / 4/10/2026


Key Points

  • The paper introduces EEG2Vision, a modular end-to-end framework that reconstructs 2D images from non-invasive EEG under realistic, low-density electrode setups.
  • EEG-to-image reconstruction is built on an EEG-conditioned diffusion approach, and a prompt-guided post-reconstruction boosting stage is added to refine geometry and perceptual coherence.
  • The boosting mechanism uses a multimodal large language model to extract semantic descriptions and then applies image-to-image diffusion to improve visual quality while keeping EEG-grounded structure.
  • Results show that reducing EEG channels sharply hurts semantic decoding accuracy (e.g., 50-way Top-1 accuracy drops from 89% to 38%), while perceptual reconstruction quality degrades only slightly (e.g., FID from 76.77 to 80.51).
  • The boosting stage delivers consistent perceptual improvements, including up to 9.71% IS gains in low-channel settings, and a user study indicates participants prefer boosted reconstructions, supporting the feasibility of real-time brain-to-image applications outside the lab.
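The boosting stage described above (MLLM caption extraction followed by image-to-image diffusion) can be sketched as a generic two-step refinement. Note that `describe_image` and `img2img` here are hypothetical placeholders standing in for any multimodal captioner and any image-to-image diffusion model, not the paper's actual components:

```python
from typing import Callable

def boost_reconstruction(eeg_image,
                         describe_image: Callable,
                         img2img: Callable,
                         strength: float = 0.5):
    """Prompt-guided post-reconstruction boosting (sketch).

    eeg_image:      the initial EEG-conditioned diffusion reconstruction.
    describe_image: a multimodal captioner mapping an image to a text prompt.
    img2img:        an image-to-image diffusion call taking
                    (prompt, image, strength).

    A moderate `strength` keeps the EEG-grounded layout of the input
    image while the extracted prompt guides the refinement of geometry
    and perceptual coherence.
    """
    # Step 1: extract a semantic description of the rough reconstruction.
    prompt = describe_image(eeg_image)
    # Step 2: refine the image conditioned on that description.
    return img2img(prompt=prompt, image=eeg_image, strength=strength)
```

In practice `img2img` would wrap a diffusion model's image-to-image entry point; the key design choice is the `strength` parameter, which trades off fidelity to the EEG-derived image against the prompt-driven refinement.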

Abstract

Reconstructing visual stimuli from non-invasive electroencephalography (EEG) remains challenging due to its low spatial resolution and high noise, particularly under realistic low-density electrode configurations. To address this, we present EEG2Vision, a modular, end-to-end EEG-to-image framework that systematically evaluates reconstruction performance across different EEG resolutions (128, 64, 32, and 24 channels) and enhances visual quality through a prompt-guided post-reconstruction boosting mechanism. Starting from EEG-conditioned diffusion reconstruction, the boosting stage uses a multimodal large language model to extract semantic descriptions and leverages image-to-image diffusion to refine geometry and perceptual coherence while preserving EEG-grounded structure. Our experiments show that semantic decoding accuracy degrades significantly with channel reduction (e.g., 50-way Top-1 Acc from 89% to 38%), while reconstruction quality decreases only slightly (e.g., FID from 76.77 to 80.51). The proposed boosting consistently improves perceptual metrics across all configurations, achieving up to 9.71% IS gains in low-channel settings. A user study confirms a clear perceptual preference for boosted reconstructions. The proposed approach significantly boosts the feasibility of real-time brain-to-image applications using low-resolution EEG devices, potentially unlocking such applications outside laboratory settings.
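The FID numbers quoted in the abstract (76.77 vs. 80.51) measure the Fréchet distance between Gaussians fitted to Inception features of real and reconstructed images. As a reminder of what that metric computes, here is a minimal sketch of the distance itself, assuming the Inception-v3 feature extraction has already been done:

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Fréchet (FID-style) distance between Gaussians fitted to two
    feature sets of shape (N, D); lower means the two distributions
    match more closely.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # tr((cov_a @ cov_b)^{1/2}) via the eigenvalues of the product,
    # which are real and non-negative for PSD factors.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff
                 + np.trace(cov_a) + np.trace(cov_b)
                 - 2.0 * tr_sqrt)
```

Identical feature sets give a distance near zero, and the value grows as the reconstructed distribution drifts from the real one, which is why the small 76.77 → 80.51 shift under channel reduction indicates only mild perceptual degradation.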