Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation
arXiv cs.CL / 3/20/2026
Key Points
- Nemotron-Cascade 2 is an open-weight 30B MoE model with 3B activated parameters, delivering strong reasoning and agentic capabilities.
- Despite its compact size, it approaches frontier open models in mathematical and coding reasoning while, according to the authors, using 20x fewer parameters.
- The main technical advances are an expansion of Cascade RL to cover a broader range of reasoning and agentic domains, and multi-domain on-policy distillation from the top intermediate teacher models to sustain those gains.
- The authors are releasing model checkpoints and training data publicly for reproducibility and broader adoption.
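
To make the multi-domain on-policy distillation idea in the key points more concrete, here is a minimal sketch, not the authors' implementation. It assumes a common formulation in which the student samples its own tokens and is then penalized, per token, by the reverse KL divergence to a teacher chosen for that sample's domain. The names `distillation_loss` and `teachers`, the sample fields, and the use of reverse KL are all illustrative assumptions; the summary does not specify the actual objective.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stabilized)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reverse_kl(student_logits, teacher_logits):
    """KL(student || teacher) over the vocabulary at one token position."""
    p = softmax(student_logits)
    q = softmax(teacher_logits)
    return sum(pi * (math.log(pi) - math.log(qi))
               for pi, qi in zip(p, q) if pi > 0.0)


def distillation_loss(batch, teachers):
    """Average per-token reverse KL over student-sampled sequences.

    Assumed (hypothetical) sample layout:
      sample["domain"]         -> key selecting that domain's teacher
      sample["tokens"]         -> tokens the student itself sampled (on-policy)
      sample["student_logits"] -> student logits at each sampled position
    Each teacher is a callable mapping a token prefix to logits.
    """
    total, count = 0.0, 0
    for sample in batch:
        teacher = teachers[sample["domain"]]  # one teacher per domain
        for t, student_logits in enumerate(sample["student_logits"]):
            # Score the student's own trajectory, not a teacher-written one.
            teacher_logits = teacher(sample["tokens"][:t])
            total += reverse_kl(student_logits, teacher_logits)
            count += 1
    return total / max(count, 1)
```

In this toy setup the loss is zero when the student already matches its domain teacher at every position and grows as the two distributions diverge. A real training loop would backpropagate through the student logits with an autodiff framework.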
