Trajectory-Informed Memory Generation for Self-Improving Agent Systems

arXiv cs.AI / 3/12/2026

Key Points

The paper presents a four-component framework (Trajectory Intelligence Extractor, Decision Attribution Analyzer, Contextual Learning Generator, Adaptive Memory Retrieval System) that extracts actionable learnings from agent execution trajectories and applies them to future tasks.
It emphasizes structured learnings with provenance and contextual, task-specific retrieval rather than storing generic conversational facts.
The Contextual Learning Generator produces three tip types: strategy tips from successful patterns, recovery tips from failure handling, and optimization tips from inefficient but successful executions. The Adaptive Memory Retrieval System injects the relevant tips into agent prompts based on multi-dimensional similarity.
Evaluation on the AppWorld benchmark shows consistent improvements: gains of up to 14.3 percentage points in scenario goal completion on held-out tasks, and 28.5 percentage points on complex tasks (a 149% relative increase).
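
The last point mixes an absolute gain (percentage points) with a relative one (percent). The sketch below shows how the two relate. The summary does not report the baseline rate, so the value computed here is only what the two figures imply, assuming both are measured against the same baseline and that 149% is a rounded value. The function name `implied_baseline` is illustrative.

```python
def implied_baseline(pp_gain: float, relative_increase: float) -> float:
    """Baseline rate (in percent) implied by an absolute gain in
    percentage points and a relative increase given as a fraction.

    relative_increase = pp_gain / baseline, so baseline = pp_gain / relative_increase.
    """
    return pp_gain / relative_increase


# Figures quoted for complex tasks: +28.5 pp, a 149% relative increase.
# The baseline below is derived, not reported in the source.
baseline = implied_baseline(28.5, 1.49)
improved = baseline + 28.5
print(f"{baseline:.1f}% -> {improved:.1f}%")  # prints "19.1% -> 47.6%"
```

Under those assumptions, scenario goal completion on complex tasks would rise from roughly 19% to roughly 48%.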

Abstract

LLM-powered agents face a persistent challenge: learning from their execution experiences to improve future performance. While agents can successfully complete many tasks, they often repeat inefficient patterns, fail to recover from similar errors, and miss opportunities to apply successful strategies from past executions. We present a novel framework for automatically extracting actionable learnings from agent execution trajectories and utilizing them to improve future performance through contextual memory retrieval. Our approach comprises four components: (1) a Trajectory Intelligence Extractor that performs semantic analysis of agent reasoning patterns, (2) a Decision Attribution Analyzer that identifies which decisions and reasoning steps led to failures, recoveries, or inefficiencies, (3) a Contextual Learning Generator that produces three types of guidance -- strategy tips from successful patterns, recovery tips from failure handling, and optimization tips from inefficient but successful executions, and (4) an Adaptive Memory Retrieval System that injects relevant learnings into agent prompts based on multi-dimensional similarity. Unlike existing memory systems that store generic conversational facts, our framework understands execution patterns, extracts structured learnings with provenance, and retrieves guidance tailored to specific task contexts. Evaluation on the AppWorld benchmark demonstrates consistent improvements, with up to 14.3 percentage point gains in scenario goal completion on held-out tasks and particularly strong benefits on complex tasks (28.5 pp scenario goal improvement, a 149% relative increase).
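
To make the idea of structured learnings with provenance and similarity-based retrieval concrete, here is a minimal sketch. It is not the paper's implementation: the abstract does not specify the tip schema, the similarity dimensions, or how prompts are assembled. The `Tip` fields, the two dimensions (`apps`, `keywords`), the Jaccard measure, the equal weights, and the example tips are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tip:
    kind: str               # "strategy", "recovery", or "optimization"
    text: str               # the guidance itself
    source_trajectory: str  # provenance: hypothetical trajectory ID
    apps: frozenset         # assumed similarity dimension: apps involved
    keywords: frozenset     # assumed similarity dimension: task keywords


def jaccard(a: frozenset, b: frozenset) -> float:
    """Set overlap in [0, 1]; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score(tip: Tip, apps: frozenset, keywords: frozenset) -> float:
    """Weighted sum over the two assumed dimensions (weights are arbitrary)."""
    return 0.5 * jaccard(tip.apps, apps) + 0.5 * jaccard(tip.keywords, keywords)


def retrieve(tips, apps, keywords, k=2):
    """Return up to k tips with a nonzero score, best match first."""
    apps, keywords = frozenset(apps), frozenset(keywords)
    scored = [(score(t, apps, keywords), t) for t in tips]
    scored = [(s, t) for s, t in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in scored[:k]]


def build_prompt(task: str, tips) -> str:
    """Prepend retrieved tips, with kind and provenance, to the task text."""
    lines = [f"- [{t.kind}] {t.text} (from {t.source_trajectory})" for t in tips]
    return "Relevant learnings:\n" + "\n".join(lines) + f"\n\nTask: {task}"


# Made-up tips for illustration only.
TIPS = [
    Tip("strategy", "List existing playlists before adding songs.",
        "traj-001", frozenset({"music"}), frozenset({"playlist", "add"})),
    Tip("recovery", "On an auth error, log in again and retry.",
        "traj-002", frozenset({"payments"}), frozenset({"payment", "login"})),
    Tip("optimization", "Request songs in pages instead of one call per song.",
        "traj-003", frozenset({"music"}), frozenset({"songs"})),
]

selected = retrieve(TIPS, {"music"}, {"playlist", "songs"}, k=2)
print(build_prompt("Add my liked songs to a new playlist.", selected))
```

In this toy run the payments tip is filtered out because it shares nothing with the task, while the two music tips are injected together with their source trajectory IDs, which is the role provenance plays in the framework's description.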