Drawing on Memory: Dual-Trace Encoding Improves Cross-Session Recall in LLM Agents

arXiv cs.AI / April 15, 2026


Key Points

  • The paper argues that persistent-memory LLM agents often store information as flat facts, limiting temporal reasoning, change tracking, and cross-session aggregation.
  • It proposes “dual-trace encoding,” where each stored fact is paired with a concrete scene trace (a narrative reconstruction of when and under what context the information was learned) to make memories more distinctive.
  • Experiments on the LongMemEval-S benchmark (4,575 sessions, 100 recall questions) show dual-trace outperforms a fact-only control, achieving 73.7% vs 53.5% overall accuracy (+20.2 pp, statistically significant).
  • The improvement is concentrated in temporal reasoning (+40 pp), knowledge-update tracking (+25 pp), and multi-session aggregation (+30 pp), with no gain for single-session retrieval, aligning with encoding specificity theory.
  • Token-level analysis indicates the accuracy gains come without additional token cost, and the authors outline an approach to adapt the method to coding agents with preliminary pilot results.
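The dual-trace idea described above can be sketched as a simple data structure: each memory pairs the flat fact with a narrative scene trace, and both are exposed to retrieval so the trace's contextual details make the memory more distinctive. This is an illustrative sketch, not the authors' implementation; all field and method names here are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class DualTraceMemory:
    """Hypothetical dual-trace record: a fact plus its scene trace."""
    fact: str           # flat factual record, as in a fact-only store
    scene_trace: str    # narrative reconstruction of when/how it was learned
    session_id: str     # session in which the fact was encoded
    encoded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def to_retrieval_text(self) -> str:
        """Concatenate both traces so retrieval sees the contextual details."""
        return f"{self.fact}\n[scene] {self.scene_trace}"

mem = DualTraceMemory(
    fact="User's cat is named Miso.",
    scene_trace=(
        "Learned during an evening chat while the user was planning a vet "
        "visit; they mentioned Miso in passing mid-conversation."
    ),
    session_id="s-042",
)
print(mem.to_retrieval_text())
```

Committing to concrete details in `scene_trace` at encoding time is what the paper argues makes each memory distinctive enough to support temporal reasoning and change tracking later.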

Abstract

LLM agents with persistent memory store information as flat factual records, providing little context for temporal reasoning, change tracking, or cross-session aggregation. Inspired by the drawing effect [3], we introduce dual-trace memory encoding. In this method, each stored fact is paired with a concrete scene trace, a narrative reconstruction of the moment and context in which the information was learned. The agent is forced to commit to specific contextual details during encoding, creating richer, more distinctive memory traces. Using the LongMemEval-S benchmark (4,575 sessions, 100 recall questions), we compare dual-trace encoding against a fact-only control with matched coverage and format over 99 shared questions. Dual-trace achieves 73.7% overall accuracy versus 53.5%, a +20.2 percentage point (pp) gain (95% CI: [+12.1, +29.3], bootstrap p < 0.0001). Gains concentrate in temporal reasoning (+40pp), knowledge-update tracking (+25pp), and multi-session aggregation (+30pp), with no benefit for single-session retrieval, consistent with encoding specificity theory [8]. Token analysis shows dual-trace encoding achieves this gain at no additional token cost. We additionally sketch an architectural design for adapting dual-trace encoding to coding agents, with preliminary pilot validation.
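The abstract's paired comparison over 99 shared questions with a bootstrap confidence interval can be sketched as follows. This is a generic percentile bootstrap on the accuracy difference, not the authors' evaluation code, and the per-question outcomes below are synthetic stand-ins, not the paper's data.

```python
import random

def bootstrap_ci(dual, fact, n_boot=10_000, seed=0):
    """Percentile 95% CI for mean(dual) - mean(fact), resampling questions.

    `dual` and `fact` are paired per-question outcomes (1 = correct,
    0 = incorrect) for the two conditions on the same question set.
    """
    rng = random.Random(seed)
    n = len(dual)
    diffs = []
    for _ in range(n_boot):
        # Resample question indices with replacement, keeping pairs aligned.
        idx = [rng.randrange(n) for _ in range(n)]
        d = sum(dual[i] for i in idx) / n - sum(fact[i] for i in idx) / n
        diffs.append(d)
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot)]

# Synthetic outcomes for 99 shared questions, roughly matching the
# reported accuracy levels (illustration only).
rng = random.Random(1)
dual = [1 if rng.random() < 0.74 else 0 for _ in range(99)]
fact = [1 if rng.random() < 0.54 else 0 for _ in range(99)]
lo, hi = bootstrap_ci(dual, fact)
print(f"95% CI for accuracy gain: [{lo:+.3f}, {hi:+.3f}]")
```

Resampling whole questions (rather than conditions independently) preserves the pairing between the two systems, which is what makes the interval a test of the per-question difference.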