Gated Memory Policy
arXiv cs.AI / 4/22/2026
Key Points
- Robotic manipulation tasks can be Markovian or non-Markovian, and naively extending observation history can cause large performance drops due to distribution shift and overfitting.
- The proposed Gated Memory Policy (GMP) learns both when to recall historical context (via a learned memory gate) and what information to store and retrieve (via a lightweight cross-attention module).
- GMP improves robustness by adding diffusion noise to historical actions, which reduces sensitivity to noisy or inaccurate histories during both training and inference.
- On the non-Markovian MemMimic benchmark, GMP reports a 30.1% average success-rate improvement over long-history baselines, while still performing competitively on Markovian tasks in RoboMimic.
- The authors provide code, data, and deployment instructions via the project website.
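
The three mechanisms named in the key points (a learned gate, cross-attention over stored history, and noised historical actions) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the parameter dictionary, the scalar sigmoid gate, the single attention head, the residual combination, and the `alpha_bar` noise level are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def noise_history_actions(actions, alpha_bar, rng):
    # Assumed forward-diffusion style corruption of past actions:
    # sqrt(alpha_bar) * a + sqrt(1 - alpha_bar) * eps, with eps ~ N(0, I).
    eps = rng.standard_normal(actions.shape)
    return np.sqrt(alpha_bar) * actions + np.sqrt(1.0 - alpha_bar) * eps

def gated_memory_readout(query, memory, params):
    # query:  (d,)   feature of the current observation (hypothetical encoder output)
    # memory: (T, d) features of T stored history steps
    q = query @ params["Wq"]                        # (d,)
    k = memory @ params["Wk"]                       # (T, d)
    v = memory @ params["Wv"]                       # (T, d)
    attn = softmax(k @ q / np.sqrt(q.shape[-1]))    # (T,) weights over history
    retrieved = attn @ v                            # (d,) what is recalled
    # Assumed scalar gate in (0, 1): decides when to recall.
    logit = query @ params["w_gate"] + params["b_gate"]
    gate = 1.0 / (1.0 + np.exp(-logit))
    # Residual combination: a gate near 0 falls back to the current observation.
    return query + gate * retrieved, gate, attn

# Usage with random, untrained parameters.
rng = np.random.default_rng(0)
d, T = 4, 5
params = {
    "Wq": rng.standard_normal((d, d)),
    "Wk": rng.standard_normal((d, d)),
    "Wv": rng.standard_normal((d, d)),
    "w_gate": rng.standard_normal(d),
    "b_gate": 0.0,
}
past_actions = rng.standard_normal((T, d))
memory = noise_history_actions(past_actions, alpha_bar=0.9, rng=rng)
context, gate, attn = gated_memory_readout(rng.standard_normal(d), memory, params)
```

When the gate is near zero the readout reduces to the current-observation feature alone, which is the behavior one would want on Markovian tasks. When it is near one, the attended history is added in full.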


