ARROW: Augmented Replay for RObust World models
arXiv cs.LG / 3/13/2026
Key Points
- ARROW is a model-based continual reinforcement learning algorithm that extends DreamerV3 with a memory-efficient, distribution-matching replay buffer to mitigate catastrophic forgetting.
- It uses two complementary buffers: a short-term buffer for recent experiences and a long-term buffer that preserves diversity across tasks via distribution-matching sampling.
- Evaluation on Atari (tasks without shared structure) and Procgen CoinRun variants (tasks with shared structure) shows ARROW reduces forgetting compared to baselines with the same replay buffer size, while maintaining forward transfer.
- The approach draws inspiration from neuroscience, where the brain replays experiences to a predictive world model rather than directly to the policy.
- The results highlight the potential of model-based RL with bio-inspired replay for continual learning and warrant further research.
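The two-buffer replay scheme described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the class name `DualReplayBuffer`, the per-task reservoir sampling used to approximate distribution matching, and the 50/50 recent/long-term sampling split are all hypothetical choices for the sketch.

```python
import random
from collections import deque, defaultdict

class DualReplayBuffer:
    """Sketch of a two-buffer replay scheme (hypothetical, not ARROW's
    actual implementation): a FIFO short-term buffer for recent
    experience, plus a long-term buffer that keeps an equal-sized,
    reservoir-sampled pool of transitions per task."""

    def __init__(self, short_capacity, long_capacity, seed=0):
        self.short = deque(maxlen=short_capacity)   # recent experience (FIFO)
        self.long = defaultdict(list)               # task_id -> reservoir
        self.long_capacity = long_capacity          # total long-term budget
        self.seen = defaultdict(int)                # transitions seen per task
        self.rng = random.Random(seed)

    def _per_task_quota(self):
        # Split the long-term budget evenly across tasks seen so far.
        return max(1, self.long_capacity // max(1, len(self.long)))

    def add(self, task_id, transition):
        self.short.append(transition)
        self.seen[task_id] += 1
        reservoir = self.long[task_id]
        quota = self._per_task_quota()
        # Shrink over-quota reservoirs so the total stays within budget
        # when a newly seen task claims its share.
        for res in self.long.values():
            del res[quota:]
        if len(reservoir) < quota:
            reservoir.append(transition)
        else:
            # Reservoir sampling: each of the task's transitions is kept
            # with (approximately) equal probability.
            j = self.rng.randrange(self.seen[task_id])
            if j < quota:
                reservoir[j] = transition

    def sample(self, batch_size, recent_fraction=0.5):
        # Mix recent transitions with a task-balanced long-term pool.
        n_recent = int(batch_size * recent_fraction)
        batch = [self.rng.choice(self.short) for _ in range(n_recent)]
        pool = [t for res in self.long.values() for t in res]
        batch += [self.rng.choice(pool) for _ in range(batch_size - n_recent)]
        return batch
```

The design intent this sketch tries to capture is that the short-term buffer keeps training on-distribution for the current task (supporting forward transfer), while the task-balanced long-term pool keeps old tasks represented in every batch, which is what mitigates catastrophic forgetting.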