Personality Requires Struggle: Three Regimes of the Baldwin Effect in Neuroevolved Chess Agents

arXiv cs.AI / 4/7/2026

Key Points

  • The paper tests whether lifetime (Hebbian) learning can increase behavioral diversity over evolutionary time in neuroevolved chess agents, rather than always reducing it as prior Baldwin-effect theory suggests.
  • Across 10 seeds per Hebbian condition, a variance crossover appears: Hebbian ON agents start with lower cross-seed behavioral variance than Hebbian OFF agents but surpass them at generation 34, so plasticity first compresses diversity and then expands it over evolutionary time.
  • The authors find structured, reproducible behavioral divergence between agents (ICC > 0.8): 62% disagreement on moves for identical positions, plus distinct opening repertoires, piece preferences, and game lengths, each associated with an interpretable signal-chain configuration.
  • Three evolutionary “regimes” emerge based on opponent type: an exploration regime (Hebbian ON vs heterogeneous opponents), a lottery regime (Hebbian OFF with elitism lock-in), and a transparent regime (same-model opponents with “brain self-erasure”).
  • A key implication is that self-play systems may systematically suppress behavioral diversity (“personality”) by eliminating the opponent heterogeneity that it requires, which the authors frame as a falsifiable prediction for future experiments.
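
The variance crossover in the key points can be illustrated with a minimal sketch. It assumes each run logs one scalar behavioral metric per generation and that the crossover is the first generation at which Hebbian ON variance exceeds Hebbian OFF variance; the function names and the toy numbers are hypothetical and are not taken from the paper.

```python
from statistics import variance


def cross_seed_variance(runs):
    """Sample variance across seeds for each generation.

    runs[s][g] is a scalar behavioral metric for seed s at generation g
    (an assumed logging format, not the paper's).
    """
    return [variance(per_generation) for per_generation in zip(*runs)]


def crossover_generation(on_runs, off_runs):
    """Index of the first generation where ON variance exceeds OFF variance.

    Returns None if ON variance never exceeds OFF variance.
    """
    var_on = cross_seed_variance(on_runs)
    var_off = cross_seed_variance(off_runs)
    for generation, (v_on, v_off) in enumerate(zip(var_on, var_off)):
        if v_on > v_off:
            return generation
    return None


# Toy data invented for illustration: 3 seeds x 4 generations per condition.
on_runs = [
    [1.0, 1.0, 1.0, 0.0],
    [1.1, 1.2, 2.0, 3.0],
    [0.9, 0.8, 0.0, 6.0],
]
off_runs = [
    [0.5, 0.5, 0.5, 0.5],
    [1.5, 1.5, 1.5, 1.5],
    [1.0, 1.0, 1.0, 1.0],
]

print(crossover_generation(on_runs, off_runs))
```

In this toy data the ON runs begin tightly clustered and spread out later, while the OFF runs keep a constant spread, which mirrors the compress-then-expand pattern the summary describes.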

Abstract

Can lifetime learning expand behavioral diversity over evolutionary time, rather than collapsing it? Prior theory predicts that plasticity reduces variance by buffering organisms against environmental noise. We test this in a competitive domain: chess agents with eight NEAT-evolved neural modules, Hebbian within-game plasticity, and a desirability-domain signal chain with imagination. Across 10 seeds per Hebbian condition, a variance crossover emerges: Hebbian ON starts with lower cross-seed variance than OFF, then surpasses it at generation 34. The crossover trend is monotonic (ρ = 0.91, p < 10^-6): plasticity's effect on behavioral variance reverses over evolutionary time, initially compressing diversity (consistent with prior predictions) and then expanding it as evolved Perception differences are amplified through imagination, a feedback loop that mutation alone cannot sustain. The result is structured behavioral divergence: evolved agents select different moves on the same positions (62% disagreement) and develop distinct opening repertoires, piece preferences, and game lengths. These are not different sampling policies; they are reproducible behavioral signatures (ICC > 0.8) with interpretable signal chain configurations. Three regimes appear depending on opponent type: exploration (Hebbian ON, heterogeneous opponent), lottery (Hebbian OFF, elitism lock-in), and transparent (same-model opponent, brain self-erasure). The transparent regime generates a falsifiable prediction: self-play systems may systematically suppress behavioral diversity by eliminating the heterogeneity that personality requires.

Keywords: Baldwin Effect, neuroevolution, NEAT, Hebbian learning, chess, cognitive architecture, personality emergence, imagination
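
One plausible reading of the move-disagreement figure is the fraction of shared positions on which two agents choose different moves. The sketch below assumes that pairwise definition, which the abstract does not spell out; the function name and the sample moves are hypothetical.

```python
def disagreement_rate(moves_a, moves_b):
    """Fraction of positions on which two agents pick different moves.

    moves_a[i] and moves_b[i] are the moves chosen by each agent for the
    same position i (an assumed pairing, not the paper's stated protocol).
    """
    if len(moves_a) != len(moves_b):
        raise ValueError("both agents must be evaluated on the same positions")
    if not moves_a:
        raise ValueError("at least one position is required")
    differing = sum(a != b for a, b in zip(moves_a, moves_b))
    return differing / len(moves_a)


# Toy example invented for illustration: four shared positions.
agent_a = ["e4", "Nf3", "d4", "c4"]
agent_b = ["e4", "d4", "d4", "g3"]

print(disagreement_rate(agent_a, agent_b))
```

A population-level figure would need some aggregation over agent pairs, for example a mean over all pairs; that choice is also an assumption here.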
