
Metacog: Proprioception, Not Yet Another Memory MCP: A Different Approach to Cross-Session Learning Reinforcement in AI Agents

Reddit r/artificial / 3/21/2026

💬 Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • The post argues that current AI coding agents rely on memory plugins that perform passive retrieval, calling this the Passive Librarian Problem, because the agent must know what it forgot to trigger a memory pull.
  • It introduces a 'nervous system' approach that replaces passive recall with real-time proprioceptive signals and a reinforcement-tracking model that rewards rules for preventing failures rather than decaying them when those failures stop occurring.
  • It roots the idea in the Extended Mind theory (Clark & Chalmers), treating hooks, state buffers, and reinforcement logs as extensions of the agent's cognition rather than external tools.
  • It cites Experiential Reinforcement Learning results (Zhao et al., 2025) showing reflections on failure trajectories during training can boost task success by up to 81% compared to standard prompting.
  • It demonstrates a minimal implementation: two Claude Code hooks with zero dependencies, aiming to enable cross-session learning without traditional memory storage.

TL;DR: Everyone's building memory plugins for AI coding agents. I'm not sure that stale, past memory of tasks executed is the right way forward for this application. Intelligence has metacognition, the ability to think about how you're thinking.

Source (or read on): github.com/houtini-ai/metacog

So, I built a nervous system instead. Two Claude Code hooks, zero dependencies. The key insight: treating the agent's context window like a filing cabinet doesn't work, because the agent has to know what it forgot in order to ask for it. I replaced passive recall with real-time proprioceptive signals and a reinforcement tracking model that rewards rules when they work rather than punishing them when the failures they prevent stop occurring.

The Problem with Agent Memory

The current wave of memory solutions for AI coding agents (Claude-Mem, Memsearch, Agent Memory MCP, Cognee, SuperMemory) all follow the same architecture: capture session data, compress it, store it in SQLite or a vector store, retrieve relevant fragments on the next session, inject them into the context window.

This is the Passive Librarian Problem. The memory system waits for the agent to decide to search, pulls text, and injects it. But the agent has to know what it forgot in order to query for it. That's a paradox. And empirically, the agent reads the retrieved memories, acknowledges them, and walks into the same failure three tool calls later.

This isn't a retrieval quality issue. It's an architectural one. Memory plugins treat the context window like a filing cabinet. But cognition - even in LLM agents - doesn't work that way.

Theoretical Foundation

The Extended Mind Thesis

Clark and Chalmers (1998) argued that cognition doesn't happen exclusively inside the brain - it happens in the loop between a cognitive system and its environment. A notebook isn't just storage; when tightly coupled with a cognitive process, it becomes part of the cognitive system itself.

Paper: Clark, A. & Chalmers, D. (1998). "The Extended Mind." Analysis, 58(1), 7–19. doi:10.1093/analys/58.1.7

Applied to LLM agents: the hooks, the state buffer, the reinforcement log - these aren't external tools the agent consults. They're extensions of the agent's cognitive process, firing in the loop between action and observation. The agent doesn't "decide to check" its proprioception any more than you decide to check your sense of balance.

Experiential Reinforcement Learning

Zhao et al. (2025) demonstrated that agents which reflect on their own failure trajectories at training time improve task success by up to 81% compared to agents with standard prompting. The mechanism: structured self-reflection on what went wrong and why, not just replay of what happened.

Paper: Zhao et al. (2025). "Experiential Co-Learning of Software-Developing Agents." arXiv:2312.17025

I took this insight and moved it from training time to runtime. But naive implementation hit a critical problem (see: The Seesaw Problem below).

Metacognitive Monitoring in LLM Agents

Recent work on metacognition for LLMs distinguishes between monitoring (assessing one's own cognitive state) and control (adjusting behaviour based on that assessment). Most agent frameworks implement neither.

Paper: Weng et al. (2024). "Metacognitive Monitoring and Control in Large Language Model Agents." arXiv:2407.16867

Paper: Xu et al. (2024). "CLMC for LLM Agents: Bridging the Gap Between Cognitive Models and Agent Architectures." arXiv:2406.10155

Our approach implements both. The proprioceptive layer is monitoring. The nociceptive layer is control. Neither requires the agent to "decide" to be metacognitive - it happens automatically in the hook execution path.

Architecture: Two Hooks, Three Layers

Layer 1: Proprioception (PostToolUse hook, always-on)

Five sensors fire after every tool call. When values are within baseline, they produce zero output and cost zero tokens. When something deviates, a short signal gets injected via stderr into the agent's context. Not a command - just awareness.

| Sense | What it detects |
| --- | --- |
| O2 | Token velocity - context is being consumed unsustainably |
| Chronos | Wall-clock time and step count since last user interaction |
| Nociception | Consecutive similar errors - the agent is stuck but hasn't recognised it |
| Spatial | Blast radius - the modified file is imported by N other files |
| Vestibular | Action diversity - the agent is repeating the same actions without triggering errors |
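A single sensor from this table can be sketched in a few lines. This is a minimal sketch, not the project's actual code: the shape of the rolling action window and the crude prefix-based similarity test are assumptions.

```javascript
// Nociception sensor sketch. `window` is assumed to be a rolling list of
// recent tool-call records, each optionally carrying an `error` string.

// Count trailing consecutive errors whose messages look similar
// (here: identical first 40 characters - a deliberately crude test).
function consecutiveSimilarErrors(window) {
  let count = 0;
  let sig = null;
  for (let i = window.length - 1; i >= 0; i--) {
    const action = window[i];
    if (!action.error) break;
    const s = action.error.slice(0, 40);
    if (sig === null) sig = s;
    if (s !== sig) break;
    count++;
  }
  return count;
}

// Silent within baseline; returns a short awareness signal on deviation.
function nociceptionSignal(window, threshold = 3) {
  const n = consecutiveSimilarErrors(window);
  if (n < threshold) return null; // zero output, zero tokens
  return `[metacog] ${n} consecutive similar errors - you may be stuck.`;
}
```

In a real PostToolUse hook the non-null signal would be written to stderr, per the injection mechanism the post describes.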

This is inspired by biological proprioception - the sense that tells you where your body is in space without looking. Agents have no equivalent. They can't see their own context filling up, can't feel time passing, can't detect that they're going in circles.

Layer 2: Nociception (escalating intervention)

When Layer 1 thresholds go critical (e.g., 4+ consecutive similar errors), the system escalates:

  1. Socratic - "State the assumption you're operating on. What would falsify it?"
  2. Directive - explicit instructions to change approach
  3. User flag - tells the agent to stop and check in with the human

This is the pain response. It's designed to be disruptive. If the agent has hit four similar errors in a row, politeness isn't productive.
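The ladder above might be selected like this. Everything beyond the stated "4+ consecutive similar errors" trigger - the higher thresholds and the exact wording - is illustrative, not taken from the implementation:

```javascript
// Escalation sketch: map a consecutive-error count to an intervention level.
// Thresholds 6 and 8 are hypothetical; the post only fixes the first at 4.
function escalate(consecutiveSimilarErrors) {
  if (consecutiveSimilarErrors < 4) return null; // Layer 1 signals suffice
  if (consecutiveSimilarErrors < 6) {
    return { level: 'socratic',
             msg: 'State the assumption you are operating on. What would falsify it?' };
  }
  if (consecutiveSimilarErrors < 8) {
    return { level: 'directive',
             msg: 'Stop repeating this approach. Change strategy before the next tool call.' };
  }
  return { level: 'user-flag',
           msg: 'Repeated failures. Pause and check in with the user before continuing.' };
}
```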

Layer 3: Reinforcement Tracking (UserPromptSubmit hook, cross-session)

This is where the approach fundamentally diverges from memory.

The Seesaw Problem

When we first implemented cross-session learning, we used standard time-decay for rule confidence. Pattern fires > create rule > inject rule next session > rule prevents failure > no detections > confidence decays > rule pruned > failure returns > rule recreated > confidence climbs > rule prevents failure > decays > pruned > ...

The better the rule works, the faster the system kills it. That's not learning. That's an oscillation.

This isn't a tuning problem. Any time-decay model that reduces confidence based on absence of the triggering event will punish successful prevention. The fundamental assumption - "no recent activity means irrelevant" - is wrong when the lack of activity is caused by the rule itself.
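A minimal sketch makes the failure mode concrete, assuming a standard half-life decay (the half-life value is illustrative):

```javascript
// Standard time-decay: confidence halves every `halfLife` quiet periods.
// A rule that works perfectly never fires, so it decays fastest.
function timeDecay(confidence, periodsSinceLastDetection, halfLife = 2) {
  return confidence * Math.pow(0.5, periodsSinceLastDetection / halfLife);
}

// A fully effective rule (zero detections) after 6 quiet periods:
// timeDecay(1.0, 6) -> 0.125, below a typical pruning threshold.
```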

Reinforcement Tracking: Inverting the Decay Model

Our solution: treat the absence of failure as evidence of effectiveness.

When the nervous system detects a failure pattern during a session, it records a detection - the failure happened. But when a known pattern doesn't fire during a session where its rule was active, the system records a suppression - the rule was present and the failure was absent.

Both count as evidence. Both increase confidence.
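A sketch of the inverted update rule. The field names and the evidence weights (+0.1, +0.05) are hypothetical; only the "both events increase confidence" behaviour comes from the post:

```javascript
// Reinforcement tracking sketch: detections and suppressions are both
// positive evidence, so a working rule gains confidence instead of decaying.
function reinforce(rule, event) {
  const r = { ...rule, lastActivity: event.timestamp };
  if (event.type === 'detection') {
    r.detections = (r.detections || 0) + 1;
    r.confidence = Math.min(1, r.confidence + 0.1);  // failure still happens: rule is relevant
  } else if (event.type === 'suppression') {
    r.suppressions = (r.suppressions || 0) + 1;
    r.confidence = Math.min(1, r.confidence + 0.05); // rule active, failure absent: rule is working
  }
  return r;
}
```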

```
Session starts
  > compile digest (global + project-scoped learnings)
  > inject as system-reminder
  > write marker: which pattern IDs are active this session

Session runs
  > PostToolUse hook fires after every tool call
  > rolling 20-item action window
  > proprioceptive signals when abnormal
  > no learning happens here (pure monitoring)

Next session
  > read previous session's active patterns marker
  > run detectors against previous session state
  > pattern fired? > emit DETECTION (failure happened)
  > pattern silent + was active? > emit SUPPRESSION (rule worked)
  > persist both to JSONL log
```

Only truly dormant rules - patterns with zero activity (no detections and no suppressions) for 60+ days - decay. And even then, slowly. Pruning happens at 120 days for low-evidence rules.
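The dormancy and pruning windows translate directly into code. This assumes `lastActivity` is bumped on every detection or suppression, and the low-evidence threshold is a hypothetical value:

```javascript
// Dormancy check sketch: only rules with zero activity for 60+ days decay;
// low-evidence rules are pruned at 120 days, matching the stated windows.
const DAY = 24 * 60 * 60 * 1000;

function ageInDays(rule, now) {
  return (now - rule.lastActivity) / DAY;
}

function shouldDecay(rule, now) {
  // lastActivity is assumed to update on every detection or suppression,
  // so a large age means truly dormant (no activity of either kind).
  return ageInDays(rule, now) >= 60;
}

function shouldPrune(rule, now) {
  const evidence = (rule.detections || 0) + (rule.suppressions || 0);
  const lowEvidence = evidence < 3; // threshold is illustrative
  return lowEvidence && ageInDays(rule, now) >= 120;
}
```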

Per-Project Scoping

Learnings live at two levels:

  • Global (~/.claude/metacog-learnings.jsonl) - patterns that generalise across projects
  • Project (<project>/.claude/metacog-learnings.jsonl) - patterns specific to one codebase

At compilation time, both merge. Project-scoped entries take precedence. A pattern that only manifests in one repo builds evidence specifically for that repo, without contaminating the global set.
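The merge is a straightforward last-writer-wins map keyed on pattern id. The record shapes here are assumed from the post, not taken from the implementation:

```javascript
// Compilation-time merge sketch: project-scoped entries override global
// entries that share a pattern id.
function compileDigest(globalRules, projectRules) {
  const byId = new Map();
  for (const rule of globalRules) byId.set(rule.id, rule);
  for (const rule of projectRules) byId.set(rule.id, rule); // project precedence
  return [...byId.values()];
}
```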

How This Differs from Memory

| Dimension | Memory Plugins | Metacog |
| --- | --- | --- |
| Trigger | Agent queries for relevant memories | Automatic - fires on every tool call |
| Content | What happened (activity logs) | What went wrong and what prevents it |
| Retrieval | Agent must know what to search for | No retrieval - signals are pushed |
| Token cost | Always (injected memories consume tokens) | Zero when normal (signals only on deviation) |
| Cross-session | Replay of past events | Confidence-weighted behavioural rules |
| Decay model | Time-based (punishes success) | Reinforcement-based (rewards success) |
| Scope | Generic (same for all projects) | Project-scoped (learns per-codebase patterns) |

Memory plugins answer: "what did the agent do before?" Metacog answers: "what's going wrong right now, and what's worked to prevent it?"

Related Work

  • Process-state buffers - the idea that agents should maintain awareness of their operational state, not just task state. Our proprioceptive layer implements this directly. See: Sumers et al. (2024). "Cognitive Architectures for Language Agents." arXiv:2309.02427

  • Reflexion - Shinn et al. (2023) showed that self-reflection on failure trajectories improves agent performance. Our reinforcement tracking extends this by tracking prevention (suppressions), not just occurrence (detections). arXiv:2303.11366

  • Voyager - Wang et al. (2023) built a skill library for Minecraft agents that grows over time. Our approach is complementary but inverted: we track failure prevention rules, not success recipes. arXiv:2305.16291

  • Generative Agents - Park et al. (2023) implemented memory retrieval with recency, importance, and relevance scoring. Still fundamentally passive - the agent must decide to retrieve. arXiv:2304.03442

Implementation

Two Claude Code hooks: ~400 lines of JavaScript.

```bash
npx @houtini/metacog --install
```

The hooks install into ~/.claude/settings.json (global) or .claude/settings.json (per-project with --project). Metacog runs silently - you only see output when something is abnormal.

Source: github.com/houtini-ai/metacog

submitted by /u/richardbaxter