ROPA: Synthetic Robot Pose Generation for RGB-D Bimanual Data Augmentation
arXiv cs.RO / 4/6/2026
Key Points
- The paper introduces ROPA, an offline imitation-learning data augmentation method that fine-tunes Stable Diffusion to synthesize third-person (eye-to-hand) RGB and RGB-D observations at novel robot poses for bimanual manipulation (see the first sketch after this list).
- ROPA also generates matching joint-space action labels and uses constrained optimization with gripper-to-object contact constraints to keep the synthesized robot–object interaction physically consistent (see the second sketch after this list).
- Experiments on 5 simulation tasks and 3 real-world bimanual tasks (2,625 simulated and 300 real-world trials) show that ROPA outperforms baseline and ablation methods.
- The work targets a key scalability gap: improving pose/scene coverage for RGB-D imitation learning without the expensive process of collecting diverse, precise real demonstrations.
- The authors provide a project website with code and resources for the proposed augmentation approach.
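The summary does not describe how the fine-tuned diffusion model is invoked at augmentation time. As one plausible shape of the observation-synthesis step, the sketch below uses Hugging Face diffusers to re-render a third-person frame from a coarse render of the robot at a newly sampled pose; the checkpoint path, conditioning image, prompt, and strength value are assumptions, and the paper's RGB-D (depth) handling is not reproduced here.

```python
# Hypothetical sketch of the observation-synthesis step: re-rendering a
# third-person RGB frame at a novel robot pose with a fine-tuned Stable
# Diffusion checkpoint via Hugging Face diffusers. All paths, the prompt,
# and the conditioning scheme are illustrative assumptions, not the
# paper's actual pipeline.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# Assumed: a Stable Diffusion checkpoint fine-tuned on the robot's
# eye-to-hand camera views (path is illustrative).
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "path/to/ropa-finetuned-sd", torch_dtype=torch.float16
).to("cuda")

# Assumed conditioning: a coarse render of the scene with the robot placed
# at the newly sampled joint configuration (e.g. from a simulator).
coarse_render = Image.open("coarse_render_novel_pose.png").convert("RGB")

# A low strength keeps the scene layout from the render while the
# fine-tuned model fills in realistic appearance.
synthetic_rgb = pipe(
    prompt="bimanual robot manipulation, third-person camera view",
    image=coarse_render,
    strength=0.4,
).images[0]
synthetic_rgb.save("augmented_observation.png")
```

The paper's constrained optimization is likewise not spelled out in this summary. As a rough illustration of the idea in the second bullet, the sketch below perturbs a demonstrated joint configuration and projects it back onto a gripper-to-object contact constraint with SciPy; the planar 3-link arm, tolerance, and function names are toy assumptions, not the method's actual formulation.

```python
# Minimal sketch of contact-constrained pose sampling, assuming a toy
# planar 3-link arm. Kinematics, tolerances, and names are illustrative
# assumptions, not the paper's implementation.
import numpy as np
from scipy.optimize import minimize, NonlinearConstraint

LINK_LENGTHS = np.array([0.35, 0.30, 0.20])   # toy 3-link planar arm [m]

def forward_kinematics(q):
    """Gripper-tip position of the planar 3-link arm for joint angles q."""
    angles = np.cumsum(q)
    return np.array([np.sum(LINK_LENGTHS * np.cos(angles)),
                     np.sum(LINK_LENGTHS * np.sin(angles))])

def sample_contact_consistent_action(q_demo, contact_point, noise_scale=0.3, seed=0):
    """Perturb a demonstrated joint configuration toward a randomly sampled
    novel pose, then project it back onto the gripper-to-object contact
    constraint so the synthesized action label stays physically consistent."""
    rng = np.random.default_rng(seed)
    q_target = q_demo + rng.normal(scale=noise_scale, size=q_demo.shape)

    # Objective: stay close to the sampled (more diverse) pose ...
    objective = lambda q: np.sum((q - q_target) ** 2)
    # ... subject to the gripper tip remaining on the object contact point.
    contact = NonlinearConstraint(
        lambda q: np.linalg.norm(forward_kinematics(q) - contact_point), 0.0, 1e-4)

    result = minimize(objective, x0=q_demo, constraints=[contact])
    return result.x

q_demo = np.array([0.5, 0.6, 0.4])             # joint-space action from a demonstration
contact_point = forward_kinematics(q_demo)     # object point the gripper is touching
q_aug = sample_contact_consistent_action(q_demo, contact_point)
print(q_aug, forward_kinematics(q_aug))        # novel joint angles, same contact point
```

Because the toy arm is redundant (3 joints for a 2-D tip position), the contact constraint leaves a family of feasible poses, so the optimizer can move toward the randomly sampled configuration while the gripper stays on the object.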