Riding Brainwaves in LLM Space: Understanding Activation Patterns Using Individual Neural Signatures

arXiv cs.CL / 3/24/2026


Key Points

  • The study investigates whether frozen LLM hidden representations (tested with Qwen 2.5 7B and LLaMA 3.1 8B) contain person-specific “activation directions” that predict individual EEG responses to naturalistic sentence reading.
  • Using word-level EEG from 30 participants (ZuCo corpus), the researchers train a separate linear probe per person and find that person-specific probes outperform a single population-level probe on every EEG feature tested, with a roughly ninefold gain for high-gamma power (rho = 0.183 vs. 0.020).
  • A negative control, fixation count, shows no person-specific advantage (p = 0.360); because fixation count reflects word length and frequency rather than individual cognition, this suggests the effect is not driven by such word-level confounds.
  • The identified individual neural directions are temporally stable, largely non-transferable across individuals, and remain predictive even after removing the shared population component.
  • The person-specific signal is strongest in the LLM’s deep layers (peaking around Layer 24 of 28) and appears to provide a geometric foundation for EEG-driven personalization.
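
The comparison between person-specific and population probes in the points above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the data are synthetic two-dimensional "hidden states" rather than Qwen or LLaMA activations, the probe is ridge-regularised least squares fitted by gradient descent, and rho is taken to be a Spearman rank correlation, which the summary does not state. The helper names (`fit_linear_probe`, `make_person`, `score`) are hypothetical and do not come from the paper.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fit_linear_probe(X, y, l2=1e-3, lr=0.05, steps=500):
    """Ridge-regularised least squares, fitted by plain gradient descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        grad = [l2 * wj for wj in w]
        for xi, yi in zip(X, y):
            err = dot(w, xi) - yi
            for j in range(d):
                grad[j] += err * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def ranks(values):
    # Simple ranking without tie handling; fine for continuous synthetic data.
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    for rank, i in enumerate(order):
        r[i] = float(rank)
    return r

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [x - ma for x in a]
    db = [x - mb for x in b]
    return dot(da, db) / (dot(da, da) * dot(db, db)) ** 0.5

def spearman_rho(a, b):
    return pearson(ranks(a), ranks(b))

def make_person(rng, weights, n, noise=0.1):
    """Synthetic stand-in: features play the role of hidden states, y of EEG power."""
    X = [[rng.gauss(0, 1) for _ in weights] for _ in range(n)]
    y = [dot(weights, x) + rng.gauss(0, noise) for x in X]
    return X, y

def score(w, X, y):
    return spearman_rho([dot(w, x) for x in X], y)

rng = random.Random(0)
# Assumed toy setup: a shared first direction, opposite person-specific second direction.
true_dirs = {"A": [1.0, 1.0], "B": [1.0, -1.0]}
train = {p: make_person(rng, w, 200) for p, w in true_dirs.items()}
test = {p: make_person(rng, w, 100) for p, w in true_dirs.items()}

# One probe per person, plus one probe fitted on everyone's pooled data.
person_probes = {p: fit_linear_probe(X, y) for p, (X, y) in train.items()}
pooled_X = [x for X, _ in train.values() for x in X]
pooled_y = [v for _, y in train.values() for v in y]
population_probe = fit_linear_probe(pooled_X, pooled_y)

person_rho = {p: score(person_probes[p], *test[p]) for p in test}
population_rho = {p: score(population_probe, *test[p]) for p in test}

for p in test:
    print(p, "person-specific:", round(person_rho[p], 3),
          "population:", round(population_rho[p], 3))
```

In this toy setup the pooled probe can only recover the shared direction, because the two person-specific components cancel when the data are pooled, so each person's own probe scores higher on held-out data. The gap is far larger than in the real EEG results because the synthetic noise level is low.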

Abstract

Consumer-grade EEG is entering everyday devices, from earbuds to headbands, raising the question of whether language models can be adapted to individual neural responses. We test this by asking whether frozen LLM representations encode person-specific EEG signals, directions in activation space that predict one person's brain activity but not another's. Using word-level EEG from 30 participants reading naturalistic sentences (ZuCo corpus), we train a separate linear probe for each person, mapping hidden states from a frozen Qwen 2.5 7B to that individual's EEG power. Person-specific probes outperform a single population probe on every EEG feature tested; for high-gamma power, the person-specific probe achieves rho = 0.183, a ninefold improvement over the population probe (rho = 0.020, p < 10^-4). A negative control, fixation count, shows no person-specific advantage (p = 0.360); fixation count reflects word length and frequency rather than individual cognition. The individual directions are temporally stable (split-half cosine = 0.824), non-transferable across people (self rho = 0.369 vs. other rho = 0.143, p < 10^-19), and distinct from the shared population signal: person-specific probes retain predictive power after the population component is removed. The person-specific signal concentrates in the model's deep layers, rising consistently with depth and peaking at Layer 24 of 28. The results are consistent across architectures (LLaMA 3.1 8B) and survive word-level confound controls. Frozen language models contain stable, person-specific neural directions in their deep layers, providing a geometric foundation for EEG-driven personalization.
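
Two of the checks mentioned in the abstract, split-half stability and removal of the population component, come down to simple vector operations on probe weights. The sketch below is illustrative only. The weight vectors are made-up numbers, not values from the paper, and the abstract does not say how the population component was removed; orthogonal projection is assumed here as one common choice. The function names `cosine` and `remove_component` are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    """Cosine similarity between two probe weight vectors."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))

def remove_component(w, direction):
    """Project w onto the orthogonal complement of `direction`."""
    scale = dot(w, direction) / dot(direction, direction)
    return [wi - scale * di for wi, di in zip(w, direction)]

# Hypothetical weights for one person's probe, fitted on two halves of the data.
first_half = [0.9, 0.4, -0.2]
second_half = [1.0, 0.5, -0.1]
# Hypothetical population-level probe direction.
population = [1.0, 0.0, 0.0]

# Split-half stability: how closely the two independently fitted directions align.
stability = cosine(first_half, second_half)

# Person-specific residual: what is left after the shared direction is projected out.
residual = remove_component(first_half, population)

print("split-half cosine:", round(stability, 3))
print("residual direction:", residual)
print("residual . population:", dot(residual, population))
```

A residual that still predicts a person's EEG on held-out words would correspond to the abstract's claim that person-specific probes retain predictive power beyond the shared signal; this sketch only shows the geometry, not that evaluation.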