World Model for Robot Learning: A Comprehensive Survey

arXiv cs.CV / 5/4/2026

Key Points

  • The paper surveys “world models” for robot learning, emphasizing how predictive environment representations support policy learning, planning, simulation, evaluation, and data generation.
  • It analyzes how world models integrate with robot policies and function as learned simulators for reinforcement learning and assessment.
  • The survey tracks progress in robotic video world models, moving from imagination-based generation toward controllable, structured, and foundation-scale formulations.
  • It links world-model ideas to navigation and autonomous driving and compiles representative datasets, benchmarks, and evaluation protocols.
  • To keep pace with newly emerging work, the authors plan to maintain and regularly update an accompanying GitHub repository.

Abstract

World models, which are predictive representations of how environments evolve under actions, have become a central component of robot learning. They support policy learning, planning, simulation, evaluation, and data generation, and have advanced rapidly with the rise of foundation models and large-scale video generation. However, the literature remains fragmented across architectures, functional roles, and embodied application domains. To address this gap, we present a comprehensive review of world models from a robot-learning perspective. We examine how world models are coupled with robot policies, how they serve as learned simulators for reinforcement learning and evaluation, and how robotic video world models have progressed from imagination-based generation to controllable, structured, and foundation-scale formulations. We further connect these ideas to navigation and autonomous driving, and summarize representative datasets, benchmarks, and evaluation protocols. Overall, this survey systematically reviews the rapidly growing literature on world models for robot learning, clarifies key paradigms and applications, and highlights major challenges and future directions for predictive modeling in embodied agents. To facilitate continued access to newly emerging works, benchmarks, and resources, we will maintain and regularly update the accompanying GitHub repository alongside this survey.
