Beyond Bellman: High-Order Generator Regression for Continuous-Time Policy Evaluation
arXiv stat.ML / 4/22/2026
Key Points
- The paper addresses finite-horizon continuous-time policy evaluation using discrete closed-loop trajectories when the system dynamics are time-inhomogeneous.
- It shows that the standard one-step Bellman recursion baseline is only first-order accurate in the grid width, so its discretization error shrinks only linearly as the grid is refined.
- The authors estimate a time-dependent generator from multi-step transitions using moment-matching coefficients that cancel lower-order discretization (truncation) terms.
- The proposed approach combines a surrogate generator with backward regression and provides an end-to-end error decomposition covering generator misspecification, projection error, pooling bias, finite-sample error, and start-up error.
- Calibration and benchmarking across multiple scales, together with ablations and stress tests, show that a second-order estimator improves on the Bellman baseline and remains stable in the regime where the theory predicts the gains to be observable.
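
The idea of multi-step coefficients that cancel lower-order truncation terms can be illustrated with a small numerical sketch. This is not the authors' estimator: it assumes a time-homogeneous Ornstein-Uhlenbeck process, uses closed-form conditional expectations in place of regression on sampled trajectories, and all function names and parameter values are hypothetical. It only shows why weights a_j satisfying sum_j a_j = 0, sum_j a_j j = 1, and sum_j a_j j^2 = 0 lift a generator estimate from first-order to second-order accuracy in the grid width h.

```python
import math

import numpy as np


def moment_matching_coefficients(order):
    """Weights a_0..a_order with sum_j a_j * j**k = 1 if k == 1 else 0, for k = 0..order."""
    steps = np.arange(order + 1)
    # Row k of the transposed Vandermonde matrix holds j**k for j = 0..order.
    system = np.vander(steps, increasing=True).T.astype(float)
    rhs = np.zeros(order + 1)
    rhs[1] = 1.0  # keep the first-derivative term, cancel the others
    return np.linalg.solve(system, rhs)


def ou_second_moment(x, t, theta, sigma):
    """E[X_t**2 | X_0 = x] for dX = -theta * X dt + sigma dW (closed form)."""
    decay = math.exp(-2.0 * theta * t)
    return x * x * decay + sigma**2 / (2.0 * theta) * (1.0 - decay)


def exact_generator(x, theta, sigma):
    """Generator applied to f(x) = x**2: L f(x) = -2 * theta * x**2 + sigma**2."""
    return -2.0 * theta * x * x + sigma**2


def generator_estimate(x, h, order, theta, sigma):
    """Combine 0..order step conditional expectations into an estimate of L f(x)."""
    coeffs = moment_matching_coefficients(order)
    values = [ou_second_moment(x, j * h, theta, sigma) for j in range(order + 1)]
    return float(np.dot(coeffs, values)) / h


def estimation_error(x, h, order, theta=1.0, sigma=0.5):
    """Absolute error of the multi-step estimate against the exact generator."""
    estimate = generator_estimate(x, h, order, theta, sigma)
    return abs(estimate - exact_generator(x, theta, sigma))


if __name__ == "__main__":
    for order in (1, 2):
        print("order", order, "weights", moment_matching_coefficients(order))
        for h in (0.1, 0.05, 0.025):
            print("  h =", h, "error =", estimation_error(1.0, h, order))
```

With order 1 the weights reduce to the one-step difference (-1, 1), and halving h roughly halves the error. With order 2 the weights are (-3/2, 2, -1/2), and halving h cuts the error by roughly a factor of four. In the setting the paper describes, the conditional expectations would have to be estimated from data, which adds the statistical error terms listed in the decomposition above.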