SLEA-RL: Step-Level Experience Augmented Reinforcement Learning for Multi-Turn Agentic Training
arXiv cs.LG / 3/20/2026
Key Points
- SLEA-RL introduces step-level experience augmentation for multi-turn LLM agents by retrieving experiences at each decision step conditioned on the current observation.
- It combines three components: step-level observation clustering for efficient, structure-preserving retrieval; a self-evolving experience library that uses score-based admission and rate-limited extraction to distill strategies and failure patterns; and policy optimization with step-level credit assignment for fine-grained advantage estimation across episodes.
- The library evolves alongside the policy via semantic analysis rather than gradient updates, so stored experiences can adapt continually without any retraining.
- Experiments on long-horizon multi-turn benchmarks show that SLEA-RL outperforms a range of RL baselines.
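The first component, retrieval conditioned on the current observation, can be sketched roughly as follows. This is an illustrative toy, not the paper's implementation: the class name, the plain k-means clustering, and the nearest-centroid lookup are all assumptions standing in for whatever embedding and clustering machinery SLEA-RL actually uses.

```python
# Illustrative sketch (not the paper's API): cluster past step observations,
# then retrieve the experiences attached to the nearest cluster at each
# decision step, conditioning retrieval on the current observation.
import numpy as np

class ExperienceLibrary:
    def __init__(self, n_clusters=2, seed=0):
        self.centroids = None
        self.buckets = []          # one list of experience strings per cluster
        self.n_clusters = n_clusters
        self.rng = np.random.default_rng(seed)

    def build(self, observations, experiences, iters=10):
        X = np.asarray(observations, dtype=float)
        init = self.rng.choice(len(X), self.n_clusters, replace=False)
        self.centroids = X[init].copy()
        for _ in range(iters):  # plain k-means on observation vectors
            dists = np.linalg.norm(X[:, None] - self.centroids[None], axis=2)
            assign = np.argmin(dists, axis=1)
            for k in range(self.n_clusters):
                if np.any(assign == k):  # keep old centroid if cluster empties
                    self.centroids[k] = X[assign == k].mean(axis=0)
        self.buckets = [[experiences[i] for i in range(len(X)) if assign[i] == k]
                        for k in range(self.n_clusters)]

    def retrieve(self, obs):
        # Nearest centroid to the current observation wins.
        k = int(np.argmin(np.linalg.norm(self.centroids - np.asarray(obs, float), axis=1)))
        return self.buckets[k]
```

Clustering keeps retrieval cheap at decision time: each step costs one distance computation per centroid rather than a scan over every stored experience.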
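The library's score-based admission with rate-limited extraction could look something like the sketch below. The function name, the fixed threshold, and the per-round cap are hypothetical; the paper's actual scoring and distillation procedure is not specified here.

```python
# Hypothetical sketch of score-based admission with rate-limited extraction:
# a candidate experience enters the library only if its score clears a
# threshold, and at most max_per_round admissions happen per training round.
def admit(candidates, threshold=0.7, max_per_round=2):
    # candidates: list of (score, experience) pairs from the latest rollouts
    passed = sorted((c for c in candidates if c[0] >= threshold),
                    key=lambda c: c[0], reverse=True)
    return [exp for _, exp in passed[:max_per_round]]
```

The rate limit keeps the library from being flooded by one lucky batch of rollouts, so its contents drift gradually as the policy improves.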
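For the third component, a minimal form of step-level credit assignment is to give each step an advantage equal to its discounted return-to-go minus a cross-episode baseline at that step index. This is a generic estimator chosen for illustration, not necessarily the one SLEA-RL uses.

```python
# Minimal step-level advantage sketch (illustrative, not the paper's exact
# estimator): advantage at step t = discounted return-to-go at t, minus the
# mean return-to-go at t across a group of same-length episodes.
def step_level_advantages(episode_rewards, gamma=0.99):
    returns = []
    for rews in episode_rewards:
        g, rtg = 0.0, []
        for r in reversed(rews):          # accumulate return-to-go backwards
            g = r + gamma * g
            rtg.append(g)
        returns.append(list(reversed(rtg)))
    T = len(returns[0])
    baselines = [sum(ep[t] for ep in returns) / len(returns) for t in range(T)]
    return [[ep[t] - baselines[t] for t in range(T)] for ep in returns]
```

Unlike a single episode-level advantage, this assigns each decision step its own signed signal, so steps that helped and steps that hurt within the same episode are pushed in different directions.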
Related Articles

Interactive Web Visualization of GPT-2
Reddit r/artificial
[R] Causal self-attention as a probabilistic model over embeddings
Reddit r/MachineLearning
The 5 software development trends that actually matter in 2026 (and what they mean for your startup)
Dev.to
iPhone 17 Pro Running a 400B LLM: What It Really Means
Dev.to
[R] V-JEPA 2 has no pixel decoder, so how do you inspect what it learned? We attached a VQ probe to the frozen encoder and found statistically significant physical structure
Reddit r/artificial