Experiential Reflective Learning for Self-Improving LLM Agents

arXiv cs.AI / March 27, 2026


Key Points

  • The paper introduces Experiential Reflective Learning (ERL), a self-improvement framework for LLM agents that adapts to specialized environments by extracting actionable lessons from past task experiences.
  • ERL works by reflecting on task trajectories and outcomes to generate transferable heuristics, then retrieving the most relevant heuristics at test time and injecting them into the agent’s context to guide execution.
  • On the Gaia2 benchmark, ERL raises success rate by 7.8% over a ReAct baseline, with the biggest improvements in task completion reliability.
  • The study’s ablations show that selective retrieval is crucial for performance and that using heuristics provides more transferable abstractions than few-shot trajectory prompting.
  • Overall, the authors argue that extracting heuristics from single-attempt experience enables effective agent self-improvement without relearning each task from scratch.

Abstract

Recent advances in large language models (LLMs) have enabled the development of autonomous agents capable of complex reasoning and multi-step problem solving. However, these agents struggle to adapt to specialized environments and do not leverage past interactions, approaching each new task from scratch regardless of their accumulated experience. We introduce Experiential Reflective Learning (ERL), a simple self-improvement framework that enables rapid environment adaptation through experiential learning. ERL reflects on task trajectories and outcomes to generate heuristics, capturing actionable lessons that transfer across tasks. At test time, relevant heuristics are retrieved based on the current task and injected into the agent's context to guide execution. On the Gaia2 benchmark, ERL improves success rate by 7.8% over a ReAct baseline, with large gains in task completion reliability, and outperforms prior experiential learning methods. Through systematic ablations, we find that selective retrieval is essential and that heuristics provide more transferable abstractions than few-shot trajectory prompting. These results demonstrate that reflecting on single-attempt experiences to extract transferable heuristics enables effective agent self-improvement.
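The pipeline described above has two phases: after each task, reflect on the trajectory and outcome to distill a textual heuristic; at test time, retrieve the most relevant stored heuristics and inject them into the agent's prompt. The paper does not specify an implementation, so the sketch below is a toy illustration of that loop: the `HeuristicStore` class, its token-overlap retrieval, and the example lessons are all assumptions for illustration (a real system would use an LLM for reflection and embedding similarity for retrieval).

```python
from dataclasses import dataclass, field


@dataclass
class HeuristicStore:
    """Toy memory of (task description, lesson) pairs.

    Stand-in for ERL's heuristic memory; retrieval here uses simple
    token overlap instead of embedding similarity.
    """
    heuristics: list[tuple[str, str]] = field(default_factory=list)

    def add(self, task_desc: str, lesson: str) -> None:
        # In ERL, `lesson` would come from LLM reflection on a trajectory.
        self.heuristics.append((task_desc, lesson))

    def retrieve(self, task_desc: str, k: int = 2) -> list[str]:
        # Score each stored heuristic by token overlap with the new task,
        # keep the top-k with a nonzero score (selective retrieval).
        query = set(task_desc.lower().split())
        scored = [
            (len(query & set(desc.lower().split())), lesson)
            for desc, lesson in self.heuristics
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [lesson for score, lesson in scored[:k] if score > 0]


def build_prompt(task: str, store: HeuristicStore) -> str:
    """Inject retrieved lessons into the agent's context for this task."""
    lessons = store.retrieve(task)
    if not lessons:
        return f"Task: {task}"
    header = "\n".join(f"- {lesson}" for lesson in lessons)
    return f"Relevant lessons:\n{header}\n\nTask: {task}"


# Hypothetical lessons distilled from earlier task attempts.
store = HeuristicStore()
store.add("book a flight with the travel API",
          "Always confirm the booking ID returned by the API before finishing.")
store.add("reply to an email thread",
          "Quote the original message when replying so context is preserved.")

prompt = build_prompt("book a return flight via the travel API", store)
```

Here only the flight-booking lesson overlaps with the new task, so only it is injected; the email lesson is filtered out, mirroring the paper's finding that selective retrieval, rather than injecting everything, is what makes the heuristics useful.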