
Lifelong Imitation Learning with Multimodal Latent Replay and Incremental Adjustment

arXiv cs.CV / 3/12/2026


Key Points

  • It introduces a lifelong imitation learning framework that enables continual policy refinement across sequential tasks under realistic memory and data constraints.
  • Unlike conventional experience replay, the method operates entirely in a multimodal latent space that stores compact representations of visual, linguistic, and robot state information to support future learning.
  • It adds an incremental feature adjustment mechanism with an angular margin constraint to stabilize adaptation and preserve inter-task distinctiveness of task embeddings.
  • The approach establishes a new state of the art on the LIBERO benchmarks, reporting 10-17 point gains in AUC and up to 65% less forgetting compared to previous methods, with ablation studies confirming the effectiveness of each component.
  • The authors release the code at https://github.com/yfqi/lifelong_mlr_ifa.
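The latent replay idea in the second bullet can be pictured as a small per-task buffer that stores compact fused latents (and the actions they condition) rather than raw images, language, and robot states. The sketch below is illustrative only: the class name, eviction policy, and stored fields are assumptions, not the paper's implementation.

```python
import random
import numpy as np

class LatentReplayBuffer:
    """Illustrative latent replay buffer: stores compact latent vectors
    per past task and samples them alongside new-task data during
    training, so old behavior is rehearsed without keeping raw data."""

    def __init__(self, capacity_per_task):
        self.capacity = capacity_per_task
        self.store = {}  # task_id -> list of (latent, action) pairs

    def add(self, task_id, latent, action):
        buf = self.store.setdefault(task_id, [])
        if len(buf) < self.capacity:
            buf.append((latent, action))
        else:
            # Random eviction keeps memory bounded per task.
            buf[random.randrange(self.capacity)] = (latent, action)

    def sample(self, batch_size):
        # Draw a mixed batch across all previously seen tasks.
        pool = [pair for buf in self.store.values() for pair in buf]
        return random.sample(pool, min(batch_size, len(pool)))
```

The key property this illustrates is the memory constraint from the paper's setting: the buffer footprint is fixed per task regardless of how much raw trajectory data each task generated.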

Abstract

We introduce a lifelong imitation learning framework that enables continual policy refinement across sequential tasks under realistic memory and data constraints. Our approach departs from conventional experience replay by operating entirely in a multimodal latent space, where compact representations of visual, linguistic, and robot state information are stored and reused to support future learning. To further stabilize adaptation, we introduce an incremental feature adjustment mechanism that regularizes the evolution of task embeddings through an angular margin constraint, preserving inter-task distinctiveness. Our method establishes a new state of the art on the LIBERO benchmarks, achieving 10-17 point gains in AUC and up to 65% less forgetting compared to previous leading methods. Ablation studies confirm the effectiveness of each component, showing consistent gains over alternative strategies. The code is available at: https://github.com/yfqi/lifelong_mlr_ifa.
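The angular margin constraint described in the abstract can be illustrated as a hinge penalty that keeps a new task's embedding at least a fixed angle away from stored task embeddings, which is one common way to preserve inter-task distinctiveness. The function name, hinge form, and default margin below are assumptions for illustration, not the authors' exact formulation.

```python
import numpy as np

def angular_margin_penalty(new_emb, old_embs, margin=0.5):
    """Hypothetical sketch: penalize the new task embedding when its
    angle to any stored task embedding falls below `margin` radians.

    new_emb:  (d,) embedding of the task being learned
    old_embs: (k, d) embeddings of previously learned tasks
    """
    # Normalize so dot products are cosine similarities.
    new = new_emb / np.linalg.norm(new_emb)
    old = old_embs / np.linalg.norm(old_embs, axis=1, keepdims=True)
    cos_sim = old @ new
    # Hinge: zero penalty when the angle exceeds the margin,
    # positive penalty as embeddings drift too close together.
    return np.clip(cos_sim - np.cos(margin), 0.0, None).sum()
```

Intuitively, the penalty is zero for well-separated (e.g. orthogonal) embeddings and grows as a new embedding collapses toward an old one, which is the failure mode that erodes task distinctiveness during continual adaptation.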