Teacher Forcing as Generalized Bayes: Optimization Geometry Mismatch in Switching Surrogates for Chaotic Dynamics
arXiv cs.LG · April 29, 2026
Key Points
- The paper studies Identity Teacher Forcing (ITF) for training deterministic recurrent surrogates for chaotic dynamical systems, showing it can work well for dynamical system reconstruction (DSR) with RNNs.
- It argues that ITF's intervention-based prediction loss, viewed as a generalized Bayes update, can be mismatched with the geometry of the free-running model's marginal likelihood, so the two objectives have different curvature (see the sketch after this list).
- Using a probabilistic switching augmentation of almost-linear RNNs (AL-RNNs), the authors compare the curvature of the ITF objective with that of the marginal likelihood, and use Louis' identity to estimate an ambiguity-aware observed information (the AL-RNN update and Louis' identity are restated after this list).
- In their switching experiments (including Lorenz-63), conditioning on a single forced regime increases curvature, while marginal likelihood curvature is reduced via a missing-information correction when multiple switching explanations are plausible.
- They find that while windowed evidence fine-tuning can improve held-out evidence, it may worsen dynamical quantities of interest (QoIs) compared with models pretrained under ITF.
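To make the contrast in the second bullet concrete, here is a minimal sketch of the two training objectives for a toy recurrent surrogate. The tanh map, the Euler-integrated Lorenz-63 data generator, and the loss names are illustrative assumptions, not the paper's AL-RNN or training code; the sketch only shows how the forced (intervention-based) one-step loss differs structurally from the free-running rollout loss.

```python
import numpy as np

rng = np.random.default_rng(0)

def lorenz63_trajectory(T, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Integrate Lorenz-63 with simple Euler steps to get a toy data set."""
    x = np.array([1.0, 1.0, 1.0])
    traj = np.empty((T, 3))
    for t in range(T):
        dx = np.array([
            sigma * (x[1] - x[0]),
            x[0] * (rho - x[2]) - x[1],
            x[0] * x[1] - beta * x[2],
        ])
        x = x + dt * dx
        traj[t] = x
    return traj

def step(z, params):
    """One step of a toy tanh surrogate acting directly on the observed state."""
    A, W, b = params
    return A @ z + W @ np.tanh(z) + b

def itf_loss(obs, params):
    """Identity-teacher-forced loss: the state is reset to the observation at
    every step, so the model is only asked to predict one step ahead."""
    err = 0.0
    for t in range(len(obs) - 1):
        pred = step(obs[t], params)          # forced: start each step from data
        err += np.sum((pred - obs[t + 1]) ** 2)
    return err / (len(obs) - 1)

def free_running_loss(obs, params):
    """Free-running loss: the model is rolled out from the first observation
    and its own predictions are fed back in, as at generation time."""
    z = obs[0]
    err = 0.0
    for t in range(len(obs) - 1):
        z = step(z, params)                  # autonomous rollout
        err += np.sum((z - obs[t + 1]) ** 2)
    return err / (len(obs) - 1)

obs = lorenz63_trajectory(200)
params = (0.9 * np.eye(3), 0.05 * rng.standard_normal((3, 3)), np.zeros(3))
print("ITF loss:         ", itf_loss(obs, params))
print("free-running loss:", free_running_loss(obs, params))
```

For chaotic data the two losses respond very differently to parameter changes: the forced loss stays smooth because errors cannot compound beyond one step, while the free-running loss inherits the exponential divergence of trajectories, which is the curvature mismatch the summary refers to.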
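For context on the third bullet, almost-linear RNNs keep most latent units linear and pass only a small subset through a ReLU. A common write-up of the latent update is the following (our notation; the paper's exact parameterization and its probabilistic switching augmentation may differ):

```latex
z_t = A z_{t-1} + W \,\phi(z_{t-1}) + b,
\qquad
\phi(z)_i =
\begin{cases}
  z_i            & i \le M - P \quad \text{(linear units)} \\
  \max(0,\, z_i) & i >   M - P \quad \text{(ReLU units)}
\end{cases}
```

With only $P$ of the $M$ latent units rectified, the map is piecewise linear with at most $2^P$ linear regions, which is what makes the model's regimes enumerable as discrete switching states.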
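The missing-information correction mentioned in the fourth bullet can be read through Louis' identity. For observations $y$, a latent switching path $z$, and parameters $\theta$, the observed information of the marginal likelihood decomposes as (standard form; the paper's estimator of these terms is not reproduced here):

```latex
-\nabla_\theta^2 \log p(y \mid \theta)
  = \mathbb{E}_{z \mid y, \theta}\!\left[ -\nabla_\theta^2 \log p(y, z \mid \theta) \right]
  - \operatorname{Cov}_{z \mid y, \theta}\!\left[ \nabla_\theta \log p(y, z \mid \theta) \right]
```

When several switching paths explain the data about equally well, the covariance (missing-information) term is large, so the marginal curvature drops relative to the complete-data curvature obtained by conditioning on a single forced regime, consistent with the qualitative finding summarized above.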