DockAnywhere: Data-Efficient Visuomotor Policy Learning for Mobile Manipulation via Novel Demonstration Generation
arXiv cs.RO / April 17, 2026
Key Points
- DockAnywhere introduces a data-efficient demonstration generation framework to improve viewpoint generalization for mobile manipulation when docking points vary in real environments.
- It lifts a single demonstration into many feasible docking configurations by separating docking-dependent base motions from contact-centric manipulation skills that remain invariant across viewpoints (the second sketch below illustrates this split).
- The method samples docking proposals under feasibility constraints and generates the corresponding trajectories with structure-preserving augmentation (the first two sketches below illustrate sampling and augmentation).
- It synthesizes consistent visual observations across viewpoints by using 3D point-cloud representations and point-level spatial editing to keep observations aligned with actions (see the third sketch below).
- Experiments on ManiSkill and real-world platforms show substantially higher policy success rates and strong generalization to novel viewpoints induced by docking points not seen during training.
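
The sampling step lends itself to a short illustration. Below is a minimal sketch of feasibility-constrained docking-pose sampling, assuming planar SE(2) base poses (x, y, yaw) around a fixed target object; `is_reachable` and `in_collision` are hypothetical placeholders for whatever kinematic and scene checks the paper's actual pipeline uses.

```python
import numpy as np

def is_reachable(pose, target_xy, arm_reach=0.9):
    """Placeholder: accept poses whose target lies within a nominal arm-reach radius."""
    return np.linalg.norm(pose[:2] - np.asarray(target_xy)) <= arm_reach

def in_collision(pose):
    """Placeholder collision check; a real system would query the scene geometry."""
    return False

def sample_docking_poses(target_xy, n_proposals=100, r_min=0.5, r_max=1.0, seed=0):
    """Rejection-sample (x, y, yaw) base poses that face the target and pass both checks."""
    rng = np.random.default_rng(seed)
    proposals = []
    while len(proposals) < n_proposals:
        r = rng.uniform(r_min, r_max)       # standoff distance from the target
        theta = rng.uniform(-np.pi, np.pi)  # approach direction around the target
        x = target_xy[0] + r * np.cos(theta)
        y = target_xy[1] + r * np.sin(theta)
        yaw = theta + np.pi                 # orient the base toward the target
        pose = np.array([x, y, yaw])
        if is_reachable(pose, target_xy) and not in_collision(pose):
            proposals.append(pose)
    return np.stack(proposals)
```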

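The structure-preserving augmentation can be pictured as a split-and-recombine step: regenerate the docking-dependent base motion for each sampled pose, and replay the contact-centric arm segment unchanged in the object frame. The sketch below again assumes planar poses and uses a hypothetical `plan_base_path` planner; it illustrates the idea rather than the paper's implementation.

```python
import numpy as np

def se2_matrix(pose):
    """Homogeneous 3x3 matrix for an (x, y, yaw) pose."""
    x, y, yaw = pose
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]])

def plan_base_path(start_pose, goal_pose, n_steps=50):
    """Placeholder straight-line base plan; a real system would plan around obstacles."""
    return np.linspace(np.asarray(start_pose), np.asarray(goal_pose), n_steps)

def augment_demo(ee_traj_obj, object_pose, docking_pose, start_pose):
    """Build one augmented demo: a new base motion plus the replayed arm segment."""
    # Docking-dependent part: regenerate the base motion for this docking pose.
    base_path = plan_base_path(start_pose, docking_pose)
    # Invariant part: arm waypoints are stored in the object frame and reused
    # verbatim; only their world-frame expression is recomputed.
    T_obj = se2_matrix(object_pose)
    ee_world = np.stack([T_obj @ np.append(p, 1.0) for p in ee_traj_obj])[:, :2]
    return base_path, ee_world
```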

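Finally, the point-level spatial editing can be approximated by re-expressing the same world-frame points in each sampled docking frame, so that synthesized observations stay geometrically consistent with the edited actions. The sketch assumes a planar base displacement for simplicity; the paper's editing operates on full 3D point clouds.

```python
import numpy as np

def edit_point_cloud(points_world, base_pose):
    """Re-express Nx3 world-frame points in the frame of a new (x, y, yaw) base pose."""
    x, y, yaw = base_pose
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, -s], [s, c]])
    pts = np.asarray(points_world, dtype=float).copy()
    # Rotate and translate the horizontal coordinates into the base frame;
    # height (z) is unaffected by a planar base displacement.
    pts[:, :2] = (pts[:, :2] - np.array([x, y])) @ R
    return pts
```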