ExpertGen: Scalable Sim-to-Real Expert Policy Learning from Imperfect Behavior Priors
arXiv cs.RO / 4/22/2026
Key Points
- ExpertGen is a simulation-first framework for learning robust, generalizable robotics behavior cloning policies without collecting prohibitively expensive real-world expert demonstrations.
- It initializes a diffusion-policy behavior prior from imperfect demonstrations, which can be generated by large language models or collected from humans, before applying reinforcement learning to improve task success.
- The RL stage optimizes the diffusion model’s initial noise while keeping the pretrained diffusion policy frozen, constraining exploration to remain within safe, human-like behavior manifolds.
- Experiments on manipulation benchmarks show ExpertGen reaches high-quality expert policies with sparse rewards and no reward engineering, including strong performance on industrial assembly and long-horizon manipulation.
- For sim-to-real transfer, ExpertGen's state-based policies are distilled into visuomotor policies using DAgger and successfully deployed on real robotic hardware.
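
The idea of steering a frozen diffusion policy through its initial noise can be illustrated with a toy sketch. Everything here is an assumption made for illustration, not the paper's implementation: `frozen_denoise` is a hypothetical stand-in for a pretrained diffusion policy (a deterministic pull toward two demonstrated behaviors), the task is one-dimensional, and plain REINFORCE on the mean of a Gaussian stands in for whatever RL algorithm ExpertGen actually uses.

```python
import random

# Hypothetical behavior prior: two demonstrated behaviors, one of which
# (the +1 mode) solves the toy task. Stands in for imperfect demonstrations.
MODES = (-1.0, 1.0)

def frozen_denoise(noise, steps=20, rate=0.3):
    """Toy stand-in for a frozen diffusion policy: deterministically pulls
    the initial noise toward the nearest mode of the behavior prior.
    Nothing in this function is ever updated during RL."""
    x = noise
    for _ in range(steps):
        target = min(MODES, key=lambda m: abs(m - x))
        x += rate * (target - x)
    return x

def sparse_reward(action):
    """Sparse success signal: 1 only when the action matches the +1 behavior."""
    return 1.0 if abs(action - 1.0) < 0.1 else 0.0

def train_noise_policy(iterations=200, batch=32, lr=0.2, sigma=1.0, seed=0):
    """REINFORCE on the mean of a Gaussian over the initial noise.
    Only `mu` is learned; the denoiser stays frozen."""
    rng = random.Random(seed)
    mu = 0.0
    for _ in range(iterations):
        samples = [rng.gauss(mu, sigma) for _ in range(batch)]
        rewards = [sparse_reward(frozen_denoise(w)) for w in samples]
        baseline = sum(rewards) / batch  # simple variance-reduction baseline
        grad = sum(
            (r - baseline) * (w - mu) / sigma**2
            for r, w in zip(rewards, samples)
        ) / batch
        mu += lr * grad
    return mu

if __name__ == "__main__":
    mu = train_noise_policy()
    print("learned noise mean:", mu)
    print("action from learned mean:", frozen_denoise(mu))
```

Because the denoiser is never modified, every action it produces still lands near one of the demonstrated modes; the RL stage only shifts which region of the noise space is sampled, which is the sense in which exploration stays within the behavior prior.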