Human Cognition in Machines: A Unified Perspective of World Models

arXiv cs.RO / 4/21/2026

Key Points

The report argues that claims of “human-like” cognition in world models should be assessed using first principles from Cognitive Architecture Theory (CAT).
It proposes a unified world-model framework that integrates CAT-related cognitive functions, including memory, perception, language, reasoning, imagination, motivation, and meta-cognition.
The authors identify major research gaps, especially around motivation (notably intrinsic motivation) and meta-cognition, and outline future directions to address them.
They introduce “Epistemic World Models,” framing a new category of agent frameworks aimed at scientific discovery over structured knowledge.
Applying their taxonomy to video, embodied, and epistemic world models, the work suggests additional research directions not covered by prior taxonomies.

Abstract

This comprehensive report distinguishes prior works by the cognitive functions in which they innovate. Many works claim an almost "human-like" cognitive capability in their world models. Evaluating these claims requires proper grounding in the first principles of Cognitive Architecture Theory (CAT). We present a unified conceptual framework for world models that incorporates all the cognitive functions associated with CAT (i.e., memory, perception, language, reasoning, imagination, motivation, and meta-cognition) and identify gaps in the research to guide future state-of-the-art work. In particular, we find that motivation (especially intrinsic motivation) and meta-cognition remain drastically under-researched, and we propose concrete directions informed by active inference and global workspace theory to address them. We further introduce Epistemic World Models, a new category encompassing agent frameworks for scientific discovery that operate over structured knowledge. Our taxonomy, applied across video, embodied, and epistemic world models, suggests research directions that prior taxonomies have not.