A Principled Approach for Creating High-fidelity Synthetic Demonstrations for Imitation Learning

arXiv cs.RO / 5/5/2026


Key Points

  • The paper proposes a principled synthetic-demonstration method for imitation learning that preserves the expert trajectory, rather than generating new motion with sampling-based planners or trajectory optimization, which can drift far from the demonstrated path.
  • It treats the expert trajectory as a strong prior: the motion is modeled with Dynamic Movement Primitives (DMPs) and retargeted to new goals, object configurations, and viewpoints within a reconstructed 3DGS scene, maintaining phase consistency and shape structure by construction (a minimal DMP sketch follows this list).
  • To handle clutter and collisions safely, the authors introduce an analytic obstacle-aware DMP formulation that directly leverages the continuous density field produced by 3D Gaussian Splatting, enabling collision avoidance with minimal perturbation of the nominal expert motion.
  • Experiments on a Spot mobile manipulator across three manipulation tasks of increasing sensitivity to trajectory fidelity show that the proposed approach reduces trajectory deviation and collision rates and improves task success, particularly when training diffusion-based visuomotor policies.
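
As context for the retargeting step above, the following is a minimal sketch of a discrete DMP in the standard Ijspeert-style formulation: one transformation system per Cartesian dimension driven by a shared canonical phase, fit to a single demonstration and replayed toward a new goal. This is not the authors' implementation; the class name and all parameter values are illustrative assumptions.

```python
import numpy as np

class DMP:
    """Minimal discrete Dynamic Movement Primitive (Ijspeert-style sketch).

    Fit to one demonstration, then replayed toward a new goal while
    preserving the demonstrated shape and phasing. Parameter values
    are illustrative, not the paper's.
    """

    def __init__(self, n_basis=30, alpha_z=25.0, beta_z=6.25, alpha_x=4.0):
        self.n_basis, self.alpha_z = n_basis, alpha_z
        self.beta_z, self.alpha_x = beta_z, alpha_x
        self.c = np.exp(-alpha_x * np.linspace(0.0, 1.0, n_basis))  # basis centers in phase
        self.h = 1.0 / np.gradient(self.c) ** 2                     # basis widths

    def fit(self, y_demo, dt):
        """Learn forcing-term weights from one demonstration of shape (T, D)."""
        T, _ = y_demo.shape
        self.tau = (T - 1) * dt
        self.y0, self.g = y_demo[0].copy(), y_demo[-1].copy()
        yd = np.gradient(y_demo, dt, axis=0)
        ydd = np.gradient(yd, dt, axis=0)
        x = np.exp(-self.alpha_x * np.arange(T) * dt / self.tau)    # canonical phase x(t)
        f_target = (self.tau ** 2 * ydd
                    - self.alpha_z * (self.beta_z * (self.g - y_demo) - self.tau * yd))
        scale = np.where(np.abs(self.g - self.y0) > 1e-6, self.g - self.y0, 1.0)
        f_target = f_target / scale
        psi = np.exp(-self.h[None, :] * (x[:, None] - self.c[None, :]) ** 2)  # (T, K)
        # Locally weighted regression: one weight per basis function per dimension
        num = np.einsum('tk,t,td->kd', psi, x, f_target)
        den = (psi * (x ** 2)[:, None]).sum(axis=0)[:, None] + 1e-10
        self.w = num / den

    def rollout(self, g_new, dt, coupling=None):
        """Replay the learned shape toward a new goal `g_new`.

        `coupling`, if given, maps position y -> extra acceleration, e.g.
        the obstacle-repulsion term sketched after the abstract below.
        """
        g_new = np.asarray(g_new, dtype=float)
        y, z, x = self.y0.copy(), np.zeros_like(self.y0), 1.0
        traj = [y.copy()]
        while x > 1e-3:
            psi = np.exp(-self.h * (x - self.c) ** 2)
            f = (psi @ self.w) / (psi.sum() + 1e-10) * x * (g_new - self.y0)
            zdot = self.alpha_z * (self.beta_z * (g_new - y) - z) + f
            if coupling is not None:
                zdot = zdot + coupling(y)
            z = z + zdot * dt / self.tau
            y = y + z * dt / self.tau
            x += -self.alpha_x * x * dt / self.tau
            traj.append(y.copy())
        return np.array(traj)
```

Because retargeting only changes the goal while the learned forcing term is replayed against the same canonical phase, the synthesized motion keeps the demonstration's temporal phasing and spatial shape by construction, which is the property the paper argues planner-based synthesis discards.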

Abstract

Recent advances in 3D Gaussian Splatting (3DGS) have enabled visually realistic demonstration generation from a single expert trajectory and a short multi-view scan. However, existing 3DGS-based synthesis pipelines typically generate new motions using sampling-based planners or trajectory optimization, which often deviate substantially from the expert's demonstrated path. While such deviations may be acceptable for tasks insensitive to motion shape, they discard subtle spatial and temporal structure that is critical for contact-rich and shape-sensitive manipulation, causing increased demonstration diversity to harm downstream policy learning. We argue that demonstration synthesis should treat the expert trajectory as a strong prior. Building on this principle, we propose a framework that synthesizes diverse task demonstrations while explicitly preserving expert motion structure. We model the expert trajectory using Dynamic Movement Primitives (DMPs) and retarget it to new goals, object configurations, and viewpoints within a reconstructed 3DGS scene, yielding phase-consistent, shape-preserving motion by construction. To safely realize this expert-preserving diversity in cluttered scenes, we introduce an analytic obstacle-aware DMP formulation that operates directly on the continuous density field induced by the 3DGS representation. This enables collision avoidance while minimally perturbing the nominal expert motion, unifying photorealistic rendering and geometric reasoning without additional scene representations. We evaluate our approach on a Spot mobile manipulator across three manipulation tasks with increasing sensitivity to trajectory fidelity. Compared to planner- and optimization-based synthesis, our method produces trajectories with lower deviation and collision rates and yields higher task success when training diffusion-based visuomotor policies.
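
The abstract's obstacle-aware formulation operates on the continuous density field that the 3DGS reconstruction already provides. The paper's exact coupling term is not reproduced in this summary; the sketch below shows one natural reading under stated assumptions: the scene density is a weighted Gaussian mixture (as in 3DGS), its gradient is available in closed form, and the negative gradient is injected as a repulsive acceleration through the `coupling` hook of the DMP sketch above. The repulsion form and the gain `gamma` are assumptions, not the authors' formulation.

```python
import numpy as np

def gaussian_density_repulsion(y, means, inv_covs, weights, gamma=1.0):
    """Analytic repulsive term from a 3D Gaussian-mixture density field.

    rho(y) = sum_k w_k * exp(-0.5 * (y - mu_k)^T S_k^{-1} (y - mu_k))
    Returns -gamma * grad(rho)(y): an acceleration pushing the state away
    from high-density (occupied) regions. The gradient is closed-form, so
    no sampling or auxiliary collision geometry is needed. (The repulsion
    form and gain `gamma` are illustrative assumptions.)
    """
    diffs = y[None, :] - means                                   # (K, 3)
    mahal = np.einsum('ki,kij,kj->k', diffs, inv_covs, diffs)    # squared Mahalanobis
    dens = weights * np.exp(-0.5 * mahal)                        # per-Gaussian density
    # Gradient of component k: -dens_k * S_k^{-1} (y - mu_k)
    grad = -(dens[:, None] * np.einsum('kij,kj->ki', inv_covs, diffs)).sum(axis=0)
    return -gamma * grad
```

A hypothetical end-to-end call, with placeholder splat parameters standing in for a real 3DGS reconstruction:

```python
# Toy data throughout; a real pipeline would read splats from the 3DGS scene.
means = np.random.rand(500, 3)                                   # splat centers
inv_covs = np.repeat(np.eye(3)[None], 500, axis=0) / 0.02 ** 2   # isotropic covariances
weights = np.ones(500)                                           # e.g. splat opacities

expert_traj = np.linspace([0.0, 0.0, 0.0], [0.5, 0.2, 0.3], 200)  # toy demonstration
dmp = DMP()
dmp.fit(expert_traj, dt=0.02)
safe_traj = dmp.rollout(
    g_new=[0.45, 0.15, 0.30], dt=0.02,
    coupling=lambda y: gaussian_density_repulsion(y, means, inv_covs, weights,
                                                  gamma=5.0))
```

Because the repulsion scales with the local density, it decays to zero away from occupied regions, so the nominal expert motion is perturbed only where a collision is actually imminent, consistent with the minimal-perturbation goal stated in the abstract.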