Large Language Models Align with the Human Brain during Creative Thinking

arXiv cs.CL / 4/7/2026


Key Points

  • The study uses fMRI data from 170 participants performing the Alternate Uses Task to examine how large language model (LLM) representations align with human brain activity during creative thinking, using Representational Similarity Analysis (RSA).
  • Brain-LLM alignment is found to scale with LLM size (notably in the default mode network) and with idea originality (in both the default mode and frontoparietal networks), with the strongest alignment effects occurring early in the creative process.
  • Different post-training objectives produce distinct, functionally selective changes in alignment: a creativity-optimized Llama variant preserves alignment with high-creativity neural responses while weakening alignment with low-creativity ones.
  • A model fine-tuned on human behavior increases alignment with both high- and low-creativity neural responses, whereas a reasoning-trained variant shifts alignment away from the creative neural geometry toward more analytical processing patterns.

Abstract

Creative thinking is a fundamental aspect of human cognition, and divergent thinking, the capacity to generate novel and varied ideas, is widely regarded as its core generative engine. Large language models (LLMs) have recently demonstrated impressive performance on divergent thinking tests, and prior work has shown that models with higher task performance tend to be more aligned with human brain activity. However, existing brain-LLM alignment studies have focused on passive, non-creative tasks. Here, we explore brain alignment during creative thinking using fMRI data from 170 participants performing the Alternate Uses Task (AUT). We extract representations from LLMs varying in size (270M-72B parameters) and measure alignment to brain responses via Representational Similarity Analysis (RSA), targeting the creativity-related default mode and frontoparietal networks. We find that brain-LLM alignment scales with model size (default mode network only) and idea originality (both networks), with effects strongest early in the creative process. We further show that post-training objectives shape alignment in functionally selective ways: a creativity-optimized Llama-3.1-8B-Instruct preserves alignment with high-creativity neural responses while reducing alignment with low-creativity ones; a model fine-tuned on human behavior elevates alignment with both; and a reasoning-trained variant shows the opposite pattern, suggesting chain-of-thought training steers representations away from creative neural geometry toward analytical processing. These results demonstrate that post-training objectives selectively reshape LLM representations relative to the neural geometry of human creative thought.
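For intuition, the RSA comparison described above can be sketched in a few lines: build a representational dissimilarity matrix (RDM) from the brain response patterns, another from the LLM embeddings for the same items, and rank-correlate the two. This is a minimal illustration with synthetic data; the function name, array shapes, and distance choices are assumptions, not the authors' pipeline.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rsa_alignment(brain_patterns, model_embeddings):
    """Representational Similarity Analysis (sketch).

    brain_patterns:   (n_items, n_voxels) neural response pattern per idea
    model_embeddings: (n_items, n_dims)   LLM embedding for the same ideas
    Returns the Spearman correlation between the two RDMs.
    """
    # Condensed upper-triangle RDMs using correlation distance (1 - Pearson r)
    brain_rdm = pdist(brain_patterns, metric="correlation")
    model_rdm = pdist(model_embeddings, metric="correlation")
    # Rank correlation between the two representational geometries
    rho, _ = spearmanr(brain_rdm, model_rdm)
    return rho

# Toy usage with random data (shapes are arbitrary)
rng = np.random.default_rng(0)
score = rsa_alignment(rng.normal(size=(20, 100)),
                      rng.normal(size=(20, 64)))
```

Note that only the relative geometry of the two spaces is compared, so the brain and model representations may have different dimensionalities, as they do here.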