Generalizable Self-Evolving Memory for Automatic Prompt Optimization

arXiv cs.CL / 3/24/2026


Key Points

  • The paper introduces MemAPO, which reframes automatic prompt optimization as a process that can generalize across different queries rather than fitting a single fixed prompt to one task.
  • MemAPO uses a dual-memory system: it stores reusable strategy templates distilled from successful reasoning trajectories and structured error patterns capturing recurring failure modes.
  • For a new prompt, the framework retrieves both relevant strategies and known failure patterns to compose an improved prompt that encourages effective reasoning while avoiding past mistakes.
  • Through iterative self-reflection and memory editing, MemAPO continuously updates its memory so optimization performance can improve over time without restarting from scratch per task.
  • In experiments on diverse benchmarks, MemAPO consistently outperforms representative prompt-optimization baselines while substantially reducing optimization cost.

Abstract

Automatic prompt optimization is a promising approach for adapting large language models (LLMs) to downstream tasks, yet existing methods typically search for a specific prompt specialized to a fixed task. This paradigm limits generalization across heterogeneous queries and prevents models from accumulating reusable prompting knowledge over time. In this paper, we propose MemAPO, a memory-driven framework that reconceptualizes prompt optimization as generalizable and self-evolving experience accumulation. MemAPO maintains a dual-memory mechanism that distills successful reasoning trajectories into reusable strategy templates while organizing incorrect generations into structured error patterns that capture recurrent failure modes. Given a new prompt, the framework retrieves both relevant strategies and failure patterns to compose prompts that promote effective reasoning while discouraging known mistakes. Through iterative self-reflection and memory editing, MemAPO continuously updates its memory, enabling prompt optimization to improve over time rather than restarting from scratch for each task. Experiments on diverse benchmarks show that MemAPO consistently outperforms representative prompt optimization baselines while substantially reducing optimization cost.
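
The dual-memory idea described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the class and function names (`DualMemory`, `Entry`, `update`, `retrieve`, `compose`) are hypothetical, the word-overlap score merely stands in for whatever retriever MemAPO actually uses, and the self-reflection step is replaced by a lesson string supplied by the caller instead of being generated by an LLM.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def _overlap(a: str, b: str) -> float:
    """Jaccard similarity over lowercase word sets (assumed stand-in retriever)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


@dataclass
class Entry:
    trigger: str  # the query that produced this entry
    advice: str   # a strategy template or a description of a failure mode


@dataclass
class DualMemory:
    """Hypothetical two-part memory: strategies and error patterns."""
    strategies: list[Entry] = field(default_factory=list)
    error_patterns: list[Entry] = field(default_factory=list)

    def update(self, query: str, lesson: str, correct: bool) -> None:
        # Successful attempts feed the strategy memory; failures feed
        # the error-pattern memory. The lesson text is assumed to come
        # from a reflection step that is outside this sketch.
        target = self.strategies if correct else self.error_patterns
        target.append(Entry(trigger=query, advice=lesson))

    def retrieve(self, query: str, k: int = 1) -> tuple[list[Entry], list[Entry]]:
        # Rank each memory by similarity to the new query and keep the top k.
        def top(entries: list[Entry]) -> list[Entry]:
            ranked = sorted(entries, key=lambda e: _overlap(query, e.trigger), reverse=True)
            return ranked[:k]
        return top(self.strategies), top(self.error_patterns)

    def compose(self, query: str, k: int = 1) -> str:
        # Build a prompt that promotes retrieved strategies and warns
        # against retrieved failure modes.
        strategies, errors = self.retrieve(query, k)
        lines = [f"Question: {query}"]
        lines += [f"Strategy: {e.advice}" for e in strategies]
        lines += [f"Avoid: {e.advice}" for e in errors]
        return "\n".join(lines)


memory = DualMemory()
memory.update("solve a linear equation for x", "isolate the variable step by step", correct=True)
memory.update("translate a sentence to French", "keep the register of the source", correct=True)
memory.update("solve a quadratic equation", "do not drop the negative root", correct=False)
print(memory.compose("solve the equation 2x + 3 = 7 for x"))
```

Because the memory persists across calls to `update`, later queries benefit from earlier ones without a per-task search, which is the property the abstract emphasizes. A real system would presumably also edit or merge existing entries; this sketch only appends.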