Forecast collapse of transformer-based models under squared loss in financial time series
arXiv stat.ML / 4/2/2026
💬 Opinion · Ideas & Deep Analysis · Models & Research
Key Points
- The paper analyzes trajectory forecasting for financial time series under squared loss when conditional structure is weak, showing that the Bayes-optimal predictor becomes effectively degenerate (flat prices and zero returns in typical setups).
- In this degenerate regime, increasing model expressivity—such as with highly expressive Transformer-based predictors—does not reduce bias but instead creates spurious trajectory fluctuations.
- The authors attribute the performance degradation to a variance-driven mechanism: the model reuses noise from the conditioning window, which inflates prediction variance without improving the mean prediction.
- They support the theory with numerical experiments on high-frequency EUR/USD exchange-rate data, where Transformer models produce larger trajectory-level forecasting errors than a simple linear benchmark for most windows.
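The mechanism in the points above can be illustrated with a minimal simulation (an assumption-laden sketch, not the paper's experiment): when prices follow a driftless random walk, the Bayes-optimal forecast under squared loss is the conditional mean, i.e., a flat trajectory at the last observed price. A forecast that adds fluctuations uncorrelated with the future path (the amplitude `0.5` below is purely illustrative) pays an extra variance term in its mean squared error without reducing bias.

```python
import numpy as np

rng = np.random.default_rng(0)

# Driftless random walk: future returns are pure noise, so the
# Bayes-optimal squared-loss forecast is flat at the last price.
n_windows, horizon = 500, 20
last_price = 100.0
noise = rng.normal(0.0, 1.0, (n_windows, horizon))
true_paths = last_price + np.cumsum(noise, axis=1)

# Degenerate (Bayes-optimal) forecast: flat trajectory.
flat_forecast = np.full((n_windows, horizon), last_price)

# "Expressive" forecast: same flat mean plus spurious fluctuations,
# mimicking a model that reuses noise from its input window
# (amplitude 0.5 is an illustrative assumption, not from the paper).
spurious = 0.5 * rng.normal(0.0, 1.0, (n_windows, horizon))
expressive_forecast = last_price + spurious

mse_flat = np.mean((true_paths - flat_forecast) ** 2)
mse_expressive = np.mean((true_paths - expressive_forecast) ** 2)

print(f"flat MSE:       {mse_flat:.3f}")
print(f"expressive MSE: {mse_expressive:.3f}")
```

Since the spurious fluctuations are independent of the future noise, the expressive forecast's MSE exceeds the flat forecast's by roughly the variance of the fluctuations (here about 0.25), matching the variance-driven degradation the paper describes.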