Vintix II: Decision Pre-Trained Transformer is a Scalable In-Context Reinforcement Learner
arXiv cs.LG / 4/8/2026
Key Points
- The paper presents Vintix II, a Decision Pre-Trained Transformer (DPT) extended to large-scale, diverse multi-domain in-context reinforcement learning.
- It replaces DPT's original training objective with Flow Matching, which scales to large models while preserving DPT's interpretation as Bayesian posterior sampling.
- Experiments across hundreds of diverse tasks show improved generalization to held-out test tasks compared with prior Algorithm Distillation (AD) scaling approaches.
- The resulting agent delivers stronger performance in both online and offline inference, positioning ICRL as a viable alternative to expert distillation for generalist agents.
- Overall, the work addresses a key open question: whether DPT-style ICRL can be made genuinely scalable beyond simplified environments.
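To make the Flow Matching objective mentioned above concrete, here is a minimal sketch of a conditional flow-matching loss for action prediction given an in-context history. This is an illustrative toy, not the paper's implementation: the linear `velocity_net`, the dimensions, and the rectified-flow-style linear interpolation path are all assumptions for the sake of a runnable example.

```python
import numpy as np

rng = np.random.default_rng(0)

def velocity_net(x_t, t, context, W):
    # Toy linear "network" standing in for a transformer: predicts a
    # velocity from the noisy action, the time value, and an embedding
    # of the in-context history. (Illustrative, not the paper's model.)
    inp = np.concatenate([x_t, [t], context])
    return W @ inp

def flow_matching_loss(actions, contexts, W):
    # Conditional flow matching with a linear interpolation path:
    #   x_t = (1 - t) * noise + t * action
    # so the regression target is the constant velocity (action - noise).
    losses = []
    for a, c in zip(actions, contexts):
        x0 = rng.standard_normal(a.shape)   # noise sample
        t = rng.uniform()                   # time in [0, 1]
        x_t = (1 - t) * x0 + t * a          # point on the path
        v_target = a - x0                   # velocity target
        v_pred = velocity_net(x_t, t, c, W)
        losses.append(np.mean((v_pred - v_target) ** 2))
    return float(np.mean(losses))

# Hypothetical sizes: 2-D actions, 4-D context embeddings, batch of 8.
act_dim, ctx_dim, batch = 2, 4, 8
W = rng.standard_normal((act_dim, act_dim + 1 + ctx_dim)) * 0.1
actions = rng.standard_normal((batch, act_dim))
contexts = rng.standard_normal((batch, ctx_dim))
print(flow_matching_loss(actions, contexts, W))
```

At inference, a model trained this way would generate actions by integrating the learned velocity field from noise, conditioned on the collected in-context experience, which is what gives the posterior-sampling reading of DPT a tractable generative form.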