PlayWorld: Learning Robot World Models from Autonomous Play

arXiv cs.AI / 3/11/2026


Key Points

PlayWorld is a fully autonomous pipeline that trains high-fidelity video world simulators using robot self-play data, avoiding reliance on human demonstrations.
The system captures complex, physically consistent robot-object interactions crucial for realistic manipulation, outperforming models trained on human-collected datasets.
PlayWorld improves fine-grained failure prediction and policy evaluation by up to 40% over world models trained on human-collected data, with higher-quality predictions for contact-rich tasks.
The framework enables reinforcement learning inside the world model, improving policy success rates by 65% when the policy is deployed in the real world.
This approach enables scalable data collection and advances robot simulators towards more generalizable and robust manipulation capabilities.


arXiv:2603.09030 (cs)

[Submitted on 9 Mar 2026]

Title: PlayWorld: Learning Robot World Models from Autonomous Play

Authors: Tenny Yin, Zhiting Mei, Zhonghe Zheng, Miyu Yamane, David Wang, Jade Sceats, Samuel M. Bateman, Lihan Zha, Apurva Badithela, Ola Shorinwa, Anirudha Majumdar


Abstract: Action-conditioned video models offer a promising path to building general-purpose robot simulators that can improve directly from data. Yet, despite training on large-scale robot datasets, current state-of-the-art video models still struggle to predict physically consistent robot-object interactions that are crucial in robotic manipulation. To close this gap, we present PlayWorld, a simple, scalable, and fully autonomous pipeline for training high-fidelity video world simulators from interaction experience. In contrast to prior approaches that rely on success-biased human demonstrations, PlayWorld is the first system capable of learning entirely from unsupervised robot self-play, enabling naturally scalable data collection while capturing complex, long-tailed physical interactions essential for modeling realistic object dynamics. Experiments across diverse manipulation tasks show that PlayWorld generates high-quality, physically consistent predictions for contact-rich interactions that are not captured by world models trained on human-collected data. We further demonstrate the versatility of PlayWorld in enabling fine-grained failure prediction and policy evaluation, with up to 40% improvements over human-collected data. Finally, we demonstrate how PlayWorld enables reinforcement learning in the world model, improving policy performance by 65% in success rates when deployed in the real world.
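The idea of policy evaluation inside a world model can be illustrated with a minimal sketch. This is not PlayWorld's implementation: the abstract describes an action-conditioned video model, whereas the toy below uses a scalar state (a hypothetical "distance to target") and hand-written stand-ins. All names here (`evaluate_in_world_model`, `toy_world_model`, `toy_policy`, `toy_is_success`) are assumptions made for illustration only.

```python
from typing import Callable, Sequence

State = float   # hypothetical stand-in for a video frame or latent state
Action = float  # hypothetical stand-in for a robot action


def evaluate_in_world_model(
    world_model: Callable[[State, Action], State],
    policy: Callable[[State], Action],
    is_success: Callable[[State], bool],
    initial_states: Sequence[State],
    horizon: int,
) -> float:
    """Return the fraction of imagined rollouts that reach a success state."""
    successes = 0
    for state in initial_states:
        for _ in range(horizon):
            # The model, not the real robot, predicts the next state.
            state = world_model(state, policy(state))
            if is_success(state):
                successes += 1
                break
    return successes / len(initial_states)


def toy_world_model(state: State, action: Action) -> State:
    # Assumed toy dynamics: the action is added to the state.
    return state + action


def toy_policy(state: State) -> Action:
    # Move toward zero, with the step size capped at 0.5.
    return -max(-0.5, min(0.5, state))


def toy_is_success(state: State) -> bool:
    return abs(state) < 0.1


starts = [1.0, 2.0, 5.0]
rate = evaluate_in_world_model(
    toy_world_model, toy_policy, toy_is_success, starts, horizon=5
)
print(f"estimated success rate: {rate:.2f}")
```

With these toy settings, the rollouts starting at 1.0 and 2.0 reach the target within five steps and the one starting at 5.0 does not, so the estimate is two out of three. In a real system the estimate is only as trustworthy as the model's predictions, which is why the abstract stresses physically consistent predictions for contact-rich interactions.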

Subjects:	Robotics (cs.RO); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2603.09030 [cs.RO]
	(or arXiv:2603.09030v1 [cs.RO] for this version)
	https://doi.org/10.48550/arXiv.2603.09030 (arXiv-issued DOI via DataCite)

Submission history

From: Yijun Yin
[v1] Mon, 9 Mar 2026 23:58:07 UTC (43,388 KB)

