Identity as Attractor: Geometric Evidence for Persistent Agent Architecture in LLM Activation Space
arXiv cs.AI / 4/15/2026
Key Points
- The paper investigates whether a persistent agent “cognitive_core” identity document (identity prompt) induces attractor-like dynamics in an LLM’s activation space.
- Using controlled comparisons on Llama 3.1 8B Instruct (original core vs paraphrases vs structurally matched controls), mean-pooled hidden states at layers 8, 16, and 24 show paraphrases converge to a significantly tighter cluster than controls.
- Replication on Gemma 2 9B supports cross-architecture generalizability, suggesting the effect is not limited to a single model family.
- Ablation results indicate the phenomenon is driven mainly by semantic content rather than structural matching, though structural completeness is still needed to reach the attractor region.
- An exploratory test shows that merely reading a scientific description of the agent shifts activations toward the attractor more than reading a sham preprint does, implying a difference between “knowing about” an identity and “operating as” that identity.
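
The comparison described in the key points (mean-pool each prompt's hidden states, then check whether paraphrases sit closer together than controls) can be illustrated with a minimal sketch. This is not the paper's code: the summary does not state which distance metric or tightness statistic the authors use, so cosine distance to the original core's pooled vector is an assumption here, and the function names and the tiny 2-dimensional vectors are hypothetical toy values chosen only to make the example runnable.

```python
import math
from typing import List, Sequence

Vector = List[float]


def mean_pool(hidden_states: Sequence[Sequence[float]], mask: Sequence[int]) -> Vector:
    """Average the token vectors of one layer, skipping positions where mask == 0."""
    kept = [vec for vec, m in zip(hidden_states, mask) if m]
    if not kept:
        raise ValueError("mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity. Assumed metric; the summary does not name one."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def mean_distance_to(reference: Sequence[float], vectors: Sequence[Sequence[float]]) -> float:
    """Average cosine distance from each vector to the reference vector."""
    return sum(cosine_distance(reference, v) for v in vectors) / len(vectors)


# Toy stand-ins for pooled activations at a single layer (hypothetical values).
# The third token is padding and is excluded by the mask.
core = mean_pool([[1.0, 0.0], [1.0, 0.0], [9.0, 9.0]], mask=[1, 1, 0])
paraphrases = [[0.9, 0.1], [1.0, 0.2]]
controls = [[0.1, 1.0], [-0.5, 0.8]]

paraphrase_spread = mean_distance_to(core, paraphrases)
control_spread = mean_distance_to(core, controls)
print(f"paraphrases: {paraphrase_spread:.3f}  controls: {control_spread:.3f}")
```

In a real run, the per-token vectors would come from a model's hidden states at the chosen layers (8, 16, and 24 in the summary), and a significance test over many prompts would replace the single comparison of two averages shown here.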