Computational Lesions in Multilingual Language Models Separate Shared and Language-specific Brain Alignment

arXiv cs.CL / 4/14/2026

Key Points

  • The study addresses how multilingual language processing is represented in the brain by using six multilingual LLMs as controllable proxies for neural mechanisms.
  • Researchers introduce targeted “computational lesions” by zeroing small parameter subsets that are either important across languages or especially important for one language, then compare how well intact and lesioned models predict human fMRI responses.
  • Whole-brain encoding correlation drops by 60.32% relative to intact models when a compact shared core is lesioned, pointing to a causal role for shared parameters in brain alignment.
  • Language-specific lesions preserve cross-language separation in embedding space but selectively reduce brain predictivity for the matched native language, suggesting embedded specializations.
  • The results support a “shared backbone with language-specialized components” framework and propose a causal approach for multilingual brain–model alignment research.
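The lesioning idea in the points above can be illustrated with a minimal sketch. The summary does not say how parameter importance is measured or how the subsets are chosen, so everything below is an assumption made for illustration: the per-language `importance` scores, the top-fraction selection rule, and the helper names `top_fraction_mask`, `lesion_masks`, and `apply_lesion` are all hypothetical.

```python
import numpy as np

def top_fraction_mask(scores, fraction):
    """Boolean mask marking the top `fraction` of entries by score."""
    k = max(1, int(round(fraction * scores.size)))
    threshold = np.partition(scores.ravel(), -k)[-k]
    return scores >= threshold

def lesion_masks(importance, fraction=0.01):
    """Split important parameters into a shared set and per-language sets.

    importance: dict mapping language -> array of per-parameter importance
    scores (hypothetical input; the scoring method is not given in the source).
    """
    per_lang = {lang: top_fraction_mask(s, fraction) for lang, s in importance.items()}
    # Shared core: parameters that rank as important for every language.
    shared = np.logical_and.reduce(list(per_lang.values()))
    # Language-specific: important for one language and for no other.
    specific = {}
    for lang, mask in per_lang.items():
        others = np.logical_or.reduce([m for l, m in per_lang.items() if l != lang])
        specific[lang] = mask & ~others
    return shared, specific

def apply_lesion(weights, mask):
    """Return a copy of `weights` with the masked parameters set to zero."""
    lesioned = weights.copy()
    lesioned[mask] = 0.0
    return lesioned
```

In this sketch a shared lesion would be `apply_lesion(weights, shared)` and an English-specific lesion `apply_lesion(weights, specific["en"])`; the intact weights are left untouched so both variants can be compared against the same baseline.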

Abstract

How the brain supports language across different languages is a basic question in neuroscience and a useful test for multilingual artificial intelligence. Neuroimaging has identified language-responsive brain regions across languages, but it cannot by itself show whether the underlying processing is shared or language-specific. Here we use six multilingual large language models (LLMs) as controllable systems and create targeted “computational lesions” by zeroing small parameter sets that are important across languages or especially important for one language. We then compare intact and lesioned models in predicting functional magnetic resonance imaging (fMRI) responses during 100 minutes of naturalistic story listening in native English, Chinese and French (112 participants). Lesioning a compact shared core reduces whole-brain encoding correlation by 60.32% relative to intact models, whereas language-specific lesions preserve cross-language separation in embedding space but selectively weaken brain predictivity for the matched native language. These results support a shared backbone with embedded specializations and provide a causal framework for studying multilingual brain-model alignment.
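To make "encoding correlation" and "reduction relative to intact models" concrete, here is a minimal sketch of one common way such a comparison can be set up. The abstract does not describe the encoding model, so the ridge regression, the voxel-wise Pearson correlation averaged across voxels, and the function names `fit_ridge`, `voxelwise_pearson`, `encoding_score`, and `relative_drop_percent` are assumptions for illustration, not the authors' pipeline.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge weights mapping features X (time x dim)
    to responses Y (time x voxels)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)

def voxelwise_pearson(Y_true, Y_pred):
    """Pearson correlation between measured and predicted time courses,
    computed separately for each voxel (column). Assumes no column is constant."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / denom

def encoding_score(X_train, Y_train, X_test, Y_test, alpha=1.0):
    """Mean held-out correlation across voxels for one set of model features."""
    W = fit_ridge(X_train, Y_train, alpha)
    return float(voxelwise_pearson(Y_test, X_test @ W).mean())

def relative_drop_percent(intact_score, lesioned_score):
    """Percentage reduction of the lesioned score relative to the intact score."""
    return 100.0 * (intact_score - lesioned_score) / intact_score
```

Under these assumptions, `X` would hold features extracted from the intact or the lesioned model for the story stimuli, and `relative_drop_percent` would be applied to the two resulting scores. How the reported figure is aggregated across voxels, participants, languages, and the six models is not stated in the abstract.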