From Gradients to Riccati Geometry: Kalman World Models for Single-Pass Learning
arXiv cs.LG / 3/17/2026
Key Points
- The paper introduces Kalman World Models (KWM), a gradient-free framework for training state-space models via recursive Bayesian filtering instead of backpropagation.
- It replaces gradient-based parameter learning with Kalman-style gain adaptation, turning training into online filtering in which prediction errors act as innovations.
- The approach is extended to transformer-based large language models (LLMs), where internal activations are treated as latent dynamical states corrected via innovation terms for gradient-free training and adaptation.
- The authors derive stability conditions, analyze computational complexity, and report empirical results on sequence modeling that show competitive performance with improved robustness and continual adaptation.
- This work presents a control-theory grounded alternative to traditional gradient-based learning for sequential models, with potential implications for online learning and model robustness.
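To make the filtering idea above concrete, here is a minimal sketch of how a Kalman-style update can replace gradient descent for a linear model. This is not the paper's KWM algorithm; it is the standard recursive-least-squares special case of Kalman filtering, with the weights treated as the latent state and the prediction error acting as the innovation. All names (`kalman_step`, the toy target) are illustrative assumptions.

```python
import numpy as np

def kalman_step(w, P, x, y, r=1.0):
    """One filtering step: treat the weights w as the latent state.

    w : (d,) current weight estimate
    P : (d, d) weight covariance
    x : (d,) input features
    y : scalar target
    r : observation-noise variance
    """
    y_hat = x @ w                 # predicted observation
    innovation = y - y_hat        # the error signal is the innovation
    s = x @ P @ x + r             # innovation variance
    k = P @ x / s                 # Kalman gain
    w = w + k * innovation        # gain-weighted correction, no gradients
    P = P - np.outer(k, x @ P)    # covariance shrinks as evidence accumulates
    return w, P

# Toy usage: recover y = 2*x0 - x1 online, one sample at a time.
rng = np.random.default_rng(0)
w = np.zeros(2)
P = np.eye(2) * 10.0
for _ in range(200):
    x = rng.normal(size=2)
    y = 2.0 * x[0] - 1.0 * x[1] + 0.01 * rng.normal()
    w, P = kalman_step(w, P, x, y, r=0.01**2)
print(np.round(w, 2))  # estimate settles near [2., -1.]
```

The key property the bullets describe is visible here: each sample updates the parameters in a single pass via the gain `k`, with no backpropagation and no stored gradients, which is what makes the approach naturally suited to online and continual adaptation.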