MaskAdapt: Learning Flexible Motion Adaptation via Mask-Invariant Prior for Physics-Based Characters

arXiv cs.CV / 4/1/2026

Key Points

  • MaskAdapt is a two-stage framework for flexible motion adaptation in physics-based humanoid control using a mask-invariant motion prior and a residual policy for targeted updates.
  • The base policy is trained with stochastic body-part masking plus regularization to keep action distributions consistent despite missing observations, improving stability under partial observability.
  • A second-stage residual policy is trained on top of a frozen base controller to change only the selected body parts while preserving behaviors in unmodified regions.
  • The paper demonstrates versatility via motion composition (mask-controlled multi-part adaptation within one sequence) and text-driven partial goal tracking using kinematic targets derived from a pre-trained text-conditioned motion generator.
  • Experiments indicate MaskAdapt achieves stronger robustness and more effective targeted motion adaptation than prior approaches.
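The first-stage idea summarized above (hide random body parts, then penalize any change in the action distribution) can be illustrated with a minimal sketch. The summary does not give the exact form of the regularizer, so the KL divergence between diagonal Gaussians used here is an assumption, as are the body-part layout `BODY_PARTS`, the zero-filling of masked entries, the drop probability, and the stand-in `toy_policy`; none of these names come from the paper.

```python
import math
import random

# Hypothetical layout: each body part owns a slice of a 16-dim observation.
BODY_PARTS = {
    "torso": (0, 4),
    "left_arm": (4, 8),
    "right_arm": (8, 12),
    "legs": (12, 16),
}


def sample_mask(rng, drop_prob=0.3):
    """Pick the set of body parts to hide (drop_prob is an assumed value)."""
    return {name for name in BODY_PARTS if rng.random() < drop_prob}


def apply_mask(obs, masked_parts):
    """Zero the observation entries of masked parts (zero-fill is an assumption)."""
    out = list(obs)
    for name in masked_parts:
        start, end = BODY_PARTS[name]
        for i in range(start, end):
            out[i] = 0.0
    return out


def gaussian_kl(mu_p, std_p, mu_q, std_q):
    """KL(p || q) for diagonal Gaussians, summed over action dimensions."""
    total = 0.0
    for mp, sp, mq, sq in zip(mu_p, std_p, mu_q, std_q):
        total += math.log(sq / sp) + (sp ** 2 + (mp - mq) ** 2) / (2 * sq ** 2) - 0.5
    return total


def toy_policy(obs):
    """Stand-in policy: 2-dim Gaussian action with a fixed standard deviation."""
    mean = [0.1 * sum(obs[:8]), 0.1 * sum(obs[8:])]
    std = [0.5, 0.5]
    return mean, std


def consistency_loss(obs, masked_parts):
    """Penalty for action-distribution drift between full and masked inputs."""
    mu_full, std_full = toy_policy(obs)
    mu_mask, std_mask = toy_policy(apply_mask(obs, masked_parts))
    return gaussian_kl(mu_full, std_full, mu_mask, std_mask)
```

In a training loop, a term like `consistency_loss(obs, sample_mask(rng))` would be added to the usual policy objective with some weight, so that the base policy learns to act the same way whether or not a body part is observed.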

Abstract

We present MaskAdapt, a framework for flexible motion adaptation in physics-based humanoid control. The framework follows a two-stage residual learning paradigm. In the first stage, we train a mask-invariant base policy using stochastic body-part masking and a regularization term that enforces consistent action distributions across masking conditions. This yields a robust motion prior that remains stable under missing observations, anticipating later adaptation in those regions. In the second stage, a residual policy is trained atop the frozen base controller to modify only the targeted body parts while preserving the original behaviors elsewhere. We demonstrate the versatility of this design through two applications: (i) motion composition, where varying masks enable multi-part adaptation within a single sequence, and (ii) text-driven partial goal tracking, where designated body parts follow kinematic targets provided by a pre-trained text-conditioned autoregressive motion generator. Through experiments, MaskAdapt demonstrates strong robustness and adaptability, producing diverse behaviors under masked observations and delivering superior targeted motion adaptation compared to prior work.
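One simple way to picture the second stage is as a gated sum: the frozen base controller proposes an action, and the residual policy's correction is applied only on the action dimensions of the selected body parts. The sketch below assumes a hard binary gate and a hypothetical action layout `ACTION_PARTS`; the abstract does not say whether the paper confines the residual by gating, by its training objective, or by some other mechanism, so this should be read as an illustration rather than the authors' method.

```python
# Hypothetical layout: each body part owns a slice of an 8-dim action vector.
ACTION_PARTS = {
    "torso": (0, 2),
    "left_arm": (2, 4),
    "right_arm": (4, 6),
    "legs": (6, 8),
}


def part_mask(selected, action_dim=8):
    """Binary gate: 1.0 on action dims of the selected parts, 0.0 elsewhere."""
    mask = [0.0] * action_dim
    for name in selected:
        start, end = ACTION_PARTS[name]
        for i in range(start, end):
            mask[i] = 1.0
    return mask


def compose_action(base_action, residual_action, mask):
    """Add the residual to the frozen base action only where the gate is open."""
    return [b + m * r for b, r, m in zip(base_action, residual_action, mask)]


def compose_sequence(base_actions, residual_actions, schedule):
    """Motion composition sketch: a different set of target parts per step."""
    return [
        compose_action(b, r, part_mask(parts))
        for b, r, parts in zip(base_actions, residual_actions, schedule)
    ]
```

For example, a `schedule` such as `[{"right_arm"}, {"right_arm"}, {"legs"}]` would adapt the right arm for two steps and then the legs, which is one reading of "varying masks enable multi-part adaptation within a single sequence". In the text-driven setting, the residual policy would be rewarded for tracking the generator's kinematic targets on the designated parts only.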