Robust Embodied Perception in Dynamic Environments via Disentangled Weight Fusion

arXiv cs.CV / 4/3/2026


Key Points

  • The paper introduces an exemplar-free, domain-ID-free incremental learning framework for embodied perception systems that must adapt to dynamic physical environments with distribution drift.
  • It proposes a disentangled representation mechanism that suppresses non-essential environmental style interference, guiding the model toward semantic features shared across scenes.
  • To enable continual adaptation without storing past data, a weight fusion strategy combines old- and new-environment knowledge in parameter space, reducing catastrophic forgetting.
  • Experiments on multiple benchmark datasets report significantly less catastrophic forgetting and higher accuracy than existing state-of-the-art approaches in the fully domain-ID-free, exemplar-free setting.
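As an illustration of the style-suppression idea in the second bullet, one common way to remove environment-specific "style" from convolutional features is to normalize away per-sample, per-channel statistics (as in instance normalization). The function name and tensor shapes below are assumptions for the sketch, not the paper's exact disentanglement mechanism:

```python
import numpy as np

def suppress_style(x, eps=1e-5):
    """Remove per-sample, per-channel feature statistics, a common
    proxy for environmental 'style', keeping the normalized content.

    x: feature maps of shape (N, C, H, W).
    Illustrative only -- the paper's disentangled representation
    mechanism is not specified at this level of detail.
    """
    mu = x.mean(axis=(2, 3), keepdims=True)     # style: channel means
    sigma = x.std(axis=(2, 3), keepdims=True)   # style: channel stds
    return (x - mu) / (sigma + eps)             # style-free content
```

After this normalization, two scenes that differ only in global appearance statistics (e.g. lighting) map to similar feature distributions, which is the intuition behind extracting scene-shared semantics.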

Abstract

Embodied perception systems face severe dynamic distribution drift as they continuously interact with open physical environments. However, existing domain-incremental learning methods often rely on domain IDs being supplied at test time, which limits their practicality in unknown interaction scenarios. Moreover, models tend to overfit to scene-specific perceptual noise, leading to poor generalization and catastrophic forgetting. To address these limitations, we propose a domain-ID-free and exemplar-free incremental learning framework for embodied multimedia systems, aimed at robust continual environment adaptation. The method designs a disentangled representation mechanism that removes non-essential environmental style interference and guides the model to extract intrinsic semantic features shared across scenes, thereby reducing perceptual uncertainty and improving generalization. We further employ a weight fusion strategy that dynamically integrates old- and new-environment knowledge in parameter space, so that the model adapts to new distributions without storing historical data while maximally retaining its discriminative ability in old environments. Extensive experiments on multiple standard benchmark datasets show that the proposed method significantly reduces catastrophic forgetting in a completely exemplar-free and domain-ID-free setting, and its accuracy surpasses existing state-of-the-art methods.
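The weight fusion strategy described in the abstract operates in parameter space. A minimal sketch of such a fusion is a convex combination of the old- and new-environment checkpoints; the fixed `alpha`, the function name, and the dict-of-arrays representation are all assumptions here, since the paper's actual fusion rule may be adaptive:

```python
import numpy as np

def fuse_weights(old_params, new_params, alpha=0.5):
    """Fuse two sets of model parameters in parameter space.

    old_params, new_params: dicts mapping parameter names to arrays
    (e.g. two checkpoints of the same architecture).
    alpha: weight on the new-environment parameters; a fixed scalar
    is an illustrative assumption, not the paper's fusion rule.
    """
    assert old_params.keys() == new_params.keys()
    return {name: alpha * new_params[name] + (1.0 - alpha) * old_params[name]
            for name in old_params}
```

The appeal of this family of methods for exemplar-free continual learning is that retaining old-environment knowledge requires keeping only the previous parameters, never any past data.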
