AI Navigate

PlayWorld: Learning Robot World Models from Autonomous Play

arXiv cs.AI / 3/11/2026

Tools & Practical Usage | Models & Research

Key Points

  • PlayWorld is a fully autonomous pipeline that trains high-fidelity video world simulators using robot self-play data, avoiding reliance on human demonstrations.
  • The system captures complex, physically consistent robot-object interactions crucial for realistic manipulation, outperforming models trained on human-collected datasets.
  • PlayWorld enables fine-grained failure prediction and policy evaluation, with improvements of up to 40% over world models trained on human-collected data.
  • The framework supports reinforcement learning inside the world model, improving policy success rates by 65% when the resulting policies are deployed in the real world.
  • This approach enables scalable data collection and advances robot simulators towards more generalizable and robust manipulation capabilities.
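
To make "policy evaluation in the world model" concrete, here is a minimal Python sketch of the general idea: roll a policy forward inside an action-conditioned predictor and count successes, with no real robot involved. It is not PlayWorld's implementation. The one-dimensional pushing model stands in for a learned video model, and every name in it (`toy_world_model`, `rollout`, `evaluate_policy`, the two policies, the goal threshold) is a hypothetical chosen for illustration.

```python
import random
from typing import Callable, List, Tuple

# Toy stand-in for a video frame: (gripper position, object position) on a line.
State = Tuple[float, float]
Model = Callable[[State, float], State]
Policy = Callable[[State], float]


def toy_world_model(state: State, action: float) -> State:
    """Hypothetical action-conditioned predictor: next state from state and action.

    A learned video model would predict future frames here instead.
    """
    gripper, obj = state
    gripper += action
    # Contact rule: the object only moves when the gripper pushes into it.
    if gripper > obj:
        obj = gripper
    return (gripper, obj)


def rollout(model: Model, policy: Policy, start: State, horizon: int) -> List[State]:
    """Roll the policy forward inside the model for a fixed number of steps."""
    states = [start]
    for _ in range(horizon):
        action = policy(states[-1])
        states.append(model(states[-1], action))
    return states


def evaluate_policy(model: Model, policy: Policy, starts: List[State],
                    horizon: int, is_success: Callable[[State], bool]) -> float:
    """Estimate a success rate purely from imagined rollouts."""
    successes = sum(
        1 for start in starts
        if is_success(rollout(model, policy, start, horizon)[-1])
    )
    return successes / len(starts)


def push_right(state: State) -> float:
    return 0.04   # always move the gripper toward and past the object


def move_left(state: State) -> float:
    return -0.04  # never makes contact with the object


def reached_goal(state: State) -> bool:
    return state[1] >= 1.0  # assumed goal: object pushed past x = 1.0


rng = random.Random(0)
starts = [(0.0, rng.uniform(0.3, 0.7)) for _ in range(5)]
rate_push = evaluate_policy(toy_world_model, push_right, starts, 30, reached_goal)
rate_left = evaluate_policy(toy_world_model, move_left, starts, 30, reached_goal)
print(rate_push, rate_left)
```

In this toy setting the estimate is only as good as the contact rule inside the model, which mirrors the summary's point that a world model must predict contact-rich interactions and failures correctly for its policy evaluations to be trustworthy.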

arXiv:2603.09030 (cs)
[Submitted on 9 Mar 2026]

Title: PlayWorld: Learning Robot World Models from Autonomous Play

Authors: Tenny Yin and 10 other authors
Abstract: Action-conditioned video models offer a promising path to building general-purpose robot simulators that can improve directly from data. Yet, despite training on large-scale robot datasets, current state-of-the-art video models still struggle to predict physically consistent robot-object interactions that are crucial in robotic manipulation. To close this gap, we present PlayWorld, a simple, scalable, and fully autonomous pipeline for training high-fidelity video world simulators from interaction experience. In contrast to prior approaches that rely on success-biased human demonstrations, PlayWorld is the first system capable of learning entirely from unsupervised robot self-play, enabling naturally scalable data collection while capturing complex, long-tailed physical interactions essential for modeling realistic object dynamics. Experiments across diverse manipulation tasks show that PlayWorld generates high-quality, physically consistent predictions for contact-rich interactions that are not captured by world models trained on human-collected data. We further demonstrate the versatility of PlayWorld in enabling fine-grained failure prediction and policy evaluation, with up to 40% improvements over human-collected data. Finally, we demonstrate how PlayWorld enables reinforcement learning in the world model, improving policy performance by 65% in success rates when deployed in the real world.
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)
Cite as: arXiv:2603.09030 [cs.RO]
  (or arXiv:2603.09030v1 [cs.RO] for this version)
  https://doi.org/10.48550/arXiv.2603.09030
arXiv-issued DOI via DataCite

Submission history

From: Yijun Yin
[v1] Mon, 9 Mar 2026 23:58:07 UTC (43,388 KB)