AgenticQwen: Training Small Agentic Language Models with Dual Data Flywheels for Industrial-Scale Tool Use
arXiv cs.CL / 4/24/2026
Key Points
- The paper introduces the AgenticQwen model family, designed as small agentic language models for industrial tool use under tight latency and cost constraints.
- Training uses multi-round reinforcement learning on a mix of synthetic data and limited open-source data, combining reasoning-focused RL with agentic RL.
- It proposes dual “data flywheels” that automatically generate progressively harder tasks: a reasoning flywheel that learns from errors and an agentic flywheel that converts linear workflows into branching behavior trees.
- The authors evaluate the models on public agent benchmarks and in an industrial agent system, reporting strong benchmark results and a narrower gap to much larger models on search and data analysis tasks.
- They provide model checkpoints and parts of the synthetic dataset on Hugging Face, along with data synthesis and RL training code and an integration into EasyDistill.
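
To make the agentic flywheel idea concrete, here is a minimal Python sketch of turning a linear tool workflow into a branching tree. It is an illustration only, not the paper's algorithm: the `Node` class, the `linear_to_tree` and `count_paths` helpers, and the tool names are all hypothetical, and the only branching rule assumed here is "on failure, try a fallback tool, then rejoin the main path".

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    """One tool call in a workflow (hypothetical structure, not from the paper)."""
    tool: str
    on_success: Optional["Node"] = None
    on_failure: Optional["Node"] = None


def linear_to_tree(steps: List[str], fallbacks: Dict[str, str]) -> Optional[Node]:
    """Build a tree from a linear list of tool names.

    Assumption: each step may have one fallback tool. If the step fails,
    the fallback runs and then rejoins the remainder of the main path.
    """
    next_node: Optional[Node] = None
    for tool in reversed(steps):
        fallback = fallbacks.get(tool)
        failure = Node(fallback, on_success=next_node) if fallback else None
        next_node = Node(tool, on_success=next_node, on_failure=failure)
    return next_node


def count_paths(node: Optional[Node]) -> int:
    """Count distinct root-to-leaf execution paths."""
    if node is None:
        return 0
    children = [c for c in (node.on_success, node.on_failure) if c is not None]
    if not children:
        return 1
    return sum(count_paths(c) for c in children)


# Hypothetical three-step workflow with fallbacks for the first two steps.
root = linear_to_tree(
    ["search", "fetch", "summarize"],
    {"search": "cached_search", "fetch": "retry_fetch"},
)
print(count_paths(root))  # 4 paths, versus 1 for the purely linear workflow
```

Each added fallback doubles the number of execution paths in this toy setup, which gives a rough sense of how branching can make a task harder than its linear original.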




