SLEA-RL: Step-Level Experience Augmented Reinforcement Learning for Multi-Turn Agentic Training
arXiv cs.LG · March 20, 2026
📰 News · Models & Research
Key Points
- SLEA-RL introduces step-level experience augmentation for multi-turn LLM agents by retrieving experiences at each decision step conditioned on the current observation.
- It combines three components: step-level observation clustering for efficient, structure-preserving retrieval; a self-evolving experience library that uses score-based admission and rate-limited extraction to distill strategies and failure patterns; and policy optimization with step-level credit assignment for fine-grained advantage estimation across episodes.
- The library evolves alongside the policy via semantic analysis rather than gradient updates, so stored experiences adapt continually without any parameter training.
- Experiments on long-horizon multi-turn benchmarks show that SLEA-RL outperforms a range of RL baselines.
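To make the first component concrete, here is a minimal sketch of step-level experience retrieval via observation clustering. The class name, the fixed centroids, and the plain Euclidean nearest-centroid lookup are assumptions for illustration, not the paper's implementation; the idea is simply that each decision step's observation embedding is routed to its nearest cluster, and only experiences in that cluster are ranked for retrieval.

```python
# Hypothetical sketch (not the paper's code): experiences are bucketed by
# observation cluster so per-step retrieval only searches the relevant bucket.
import numpy as np

class ClusteredExperienceIndex:
    def __init__(self, centroids):
        # centroids: (K, d) array of cluster centers over observation embeddings
        self.centroids = np.asarray(centroids, dtype=float)
        self.buckets = {k: [] for k in range(len(self.centroids))}

    def _nearest_cluster(self, obs_vec):
        # assign an observation embedding to its closest centroid
        dists = np.linalg.norm(self.centroids - np.asarray(obs_vec, dtype=float), axis=1)
        return int(np.argmin(dists))

    def add(self, obs_vec, experience):
        # store an experience under the cluster of the observation it came from
        vec = np.asarray(obs_vec, dtype=float)
        self.buckets[self._nearest_cluster(vec)].append((vec, experience))

    def retrieve(self, obs_vec, top_k=2):
        # rank only the matching bucket by distance to the current observation
        vec = np.asarray(obs_vec, dtype=float)
        bucket = self.buckets[self._nearest_cluster(vec)]
        ranked = sorted(bucket, key=lambda item: np.linalg.norm(item[0] - vec))
        return [exp for _, exp in ranked[:top_k]]
```

Restricting the search to one bucket is what makes retrieval cheap at every step while preserving the cluster structure of the observation space.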
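The second component, score-based admission with rate-limited extraction, can be sketched as a simple gate: a candidate experience enters the library only if its score clears a threshold, and extraction is capped per window of steps. The thresholds, window size, and step-counter-based rate limit below are illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch (not the paper's code): an experience library that
# admits candidates by score and rate-limits how often extraction happens.
class ExperienceLibrary:
    def __init__(self, admit_threshold=0.5, max_per_window=2, window=10):
        self.admit_threshold = admit_threshold  # minimum score to admit
        self.max_per_window = max_per_window    # extraction budget per window
        self.window = window                    # window length in steps
        self.entries = []                       # admitted experiences
        self._recent = []                       # steps of recent extractions

    def maybe_admit(self, step, candidate, score):
        # 1) score-based admission: discard low-value candidates outright
        if score < self.admit_threshold:
            return False
        # 2) rate limit: at most max_per_window extractions per `window` steps
        self._recent = [t for t in self._recent if step - t < self.window]
        if len(self._recent) >= self.max_per_window:
            return False
        self._recent.append(step)
        self.entries.append(candidate)
        return True
```

The two checks play different roles: the score gate keeps the library high-value, while the rate limit keeps extraction (an expensive semantic-analysis call in the paper's setting) from running at every step.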
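For the third component, the summary does not give SLEA-RL's exact estimator, but step-level credit assignment is commonly realized with per-step advantages in the style of generalized advantage estimation (GAE). The sketch below assumes that formulation, with a zero terminal value as the bootstrap.

```python
# GAE-style per-step advantages: a standard formulation assumed here for
# illustration, not necessarily the paper's exact estimator.
def step_level_advantages(rewards, values, gamma=0.99, lam=0.95):
    # rewards: per-step rewards; values: per-step value estimates (same length)
    values = list(values) + [0.0]  # bootstrap with a zero terminal value
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # TD error at step t, then exponentially weighted accumulation
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```

With gamma = lam = 1 and a zero value baseline this reduces to reward-to-go, e.g. rewards [1, 1, 1] yield advantages [3, 2, 1]; lowering lam shifts credit toward the steps nearest each reward, which is the fine-grained, per-step signal the bullet describes.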