MemRouter: Memory-as-Embedding Routing for Long-Term Conversational Agents
arXiv cs.AI / 5/4/2026
Key Points
- The paper introduces MemRouter, a “write-side” memory routing approach for long-term conversational agents that decides which turns to store without running autoregressive memory-management generation on every turn.
- MemRouter uses an embedding-based routing policy: it encodes each turn together with recent context, projects the embeddings through a frozen LLM backbone, and predicts whether to admit the turn to external memory using lightweight classification heads, with only 12M trainable parameters.
- In matched-harness experiments on LoCoMo, with the retrieval pipeline, prompts, and Q&A backbone (Qwen2.5-7B) held constant, MemRouter raises overall F1 from 45.6 to 52.0 (a 6.4-point gain) over an LLM-based memory manager, with non-overlapping 95% confidence intervals.
- MemRouter also substantially reduces memory-management latency, cutting p50 from 970 ms to 58 ms (roughly 17 times lower), while additional ablations show that learned admission provides the largest gains, followed by category-specific prompting and retrieval.
- The work supports a modular design for long-horizon conversational QA, suggesting that memory admission can be optimized via a small supervised router while answer generation stays as a separate downstream component.
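
To make the admission idea above concrete, here is a minimal toy sketch of embedding-based routing. It is not the paper's implementation: the frozen LLM backbone is replaced by a small fixed `tanh` projection (`FROZEN_W`), the classification heads by a single logistic head, and the 3-dimensional "turn embeddings" and their labels in `EXAMPLES` are invented for illustration. All names are hypothetical. The sketch only shows the shape of the approach: the backbone is never updated, only the small head is trained, and each admission decision is a single forward pass rather than a generation step.

```python
import math

# Hypothetical stand-in for the frozen LLM backbone: a fixed projection that
# training never updates (tuples, so it cannot be mutated in place).
FROZEN_W = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.5, -0.5, 0.0))


def backbone_features(turn_vec):
    """Project a turn embedding through the frozen weights (one pass, no decoding)."""
    return [math.tanh(sum(w * x for w, x in zip(row, turn_vec))) for row in FROZEN_W]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def admit_probability(turn_vec, head_w, head_b):
    """Lightweight head: logistic regression over the frozen backbone features."""
    feats = backbone_features(turn_vec)
    return sigmoid(sum(w * f for w, f in zip(head_w, feats)) + head_b)


def train_head(examples, epochs=200, lr=0.5):
    """Train only the head with plain SGD on log loss; FROZEN_W is left untouched."""
    head_w, head_b = [0.0] * len(FROZEN_W), 0.0
    for _ in range(epochs):
        for turn_vec, label in examples:
            feats = backbone_features(turn_vec)
            err = sigmoid(sum(w * f for w, f in zip(head_w, feats)) + head_b) - label
            head_w = [w - lr * err * f for w, f in zip(head_w, feats)]
            head_b -= lr * err
    return head_w, head_b


# Invented toy data: (turn embedding, label), where 1 means "worth storing".
EXAMPLES = [
    ([1.0, 0.2, -0.1], 1),
    ([0.8, -0.3, 0.4], 1),
    ([-1.0, 0.1, 0.3], 0),
    ([-0.7, -0.2, -0.5], 0),
]

head_w, head_b = train_head(EXAMPLES)


def should_admit(turn_vec, threshold=0.5):
    """Write-side routing decision for one turn."""
    return admit_probability(turn_vec, head_w, head_b) >= threshold


# Only admitted turns are written to the external memory store.
memory = [vec for vec, _ in EXAMPLES if should_admit(vec)]
```

In a real system, the answer-generation model would read from `memory` through a separate retrieval step, which is the modular split the last key point describes.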