Contextual Agentic Memory is a Memo, Not True Memory

arXiv cs.AI / 5/1/2026

💬 OpinionIdeas & Deep AnalysisModels & Research

Key Points

  • The paper argues that many “agentic memory” approaches (vector stores, RAG, scratchpads, and context-window management) perform retrieval rather than true memory.
  • It claims that confusing lookup with memory leads to concrete limitations in agent capability, including no effective long-term learning, a provable generalization ceiling on compositional novelty, and inability to overcome these issues merely by increasing context size or improving retrieval quality.
  • The research further warns that persistent memory systems are structurally vulnerable to “memory poisoning,” where injected content can propagate across future sessions.
  • Using Complementary Learning Systems (neuroscience) as an analogy, the authors propose that biological intelligence works by combining fast exemplar storage (hippocampus) with slow consolidation of abstract knowledge (neocortex), while current AI agents implement only the fast part.
  • The paper formalizes these limitations, evaluates alternative viewpoints, and ends with a coexistence proposal and calls for builders and benchmark designers in the memory community to act.

Abstract

Current agentic memory systems (vector stores, retrieval-augmented generation, scratchpads, and context-window management) do not implement memory: they implement lookup. We argue that treating lookup as memory is a category error with provable consequences for agent capability, long-term learning, and security. Retrieval generalizes by similarity to stored cases; weight-based memory generalizes by applying abstract rules to inputs never seen before. Conflating the two produces agents that accumulate notes indefinitely without developing expertise, face a provable generalization ceiling on compositionally novel tasks that no increase in context size or retrieval quality can overcome, and are structurally vulnerable to persistent memory poisoning as injected content propagates across all future sessions. Drawing on Complementary Learning Systems theory from neuroscience, we show that biological intelligence solved this problem by pairing fast hippocampal exemplar storage with slow neocortical weight consolidation, and that current AI agents implement only the first half. We formalize these limitations, address four alternative views, and close with a co-existence proposal and a call to action for system builders, benchmark designers, and the memory community.