LiteResearcher: A Scalable Agentic RL Training Framework for Deep Research Agent
arXiv cs.AI / 4/21/2026
Key Points
- The paper introduces LiteResearcher, a scalable reinforcement-learning (RL) training framework aimed at improving LLM-based “deep research” agents.
- It argues that prior agentic RL scaling is limited by two coupled issues: synthetic training data that fails to elicit authentic real-world search behavior, and reliance on live real-world search during training, which causes instability and high cost.
- LiteResearcher addresses this by creating a “lite” virtual world that imitates real-world search dynamics, enabling a continuously improving training recipe.
- The framework allows a small (4B-parameter) search agent to outperform much larger models, achieving 71.3% on GAIA and 78.0% on Xbench, setting new open-source state-of-the-art results.
- Overall, the work positions scalable RL training as a key enabler for practical and cost-effective deep research agents.
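The core idea of a "lite" virtual world can be illustrated with a minimal sketch: a simulated search tool backed by a small fixed corpus, with injected noise to mimic the variability of real-world search, used to generate RL rollouts. Everything below (the corpus, matching logic, and reward scheme) is a hypothetical stand-in, not the paper's actual design:

```python
import random

# Hypothetical toy corpus standing in for the simulated search index.
CORPUS = {
    "gaia benchmark": "GAIA is a benchmark for general AI assistants.",
    "xbench": "Xbench evaluates deep research agents on realistic tasks.",
    "rl training": "RL optimizes agent policies from rollout rewards.",
}

def simulated_search(query: str, noise: float = 0.1) -> str:
    """Return a document for the query, occasionally failing to
    imitate the flakiness of real-world search."""
    if random.random() < noise:
        return "no relevant results"  # mimic an unreliable live search
    for key, doc in CORPUS.items():
        if key in query.lower():
            return doc
    return "no relevant results"

def rollout(question: str, answer_keyword: str, max_steps: int = 3) -> float:
    """Run one agent episode in the virtual world: search up to
    max_steps times, return a binary reward on success."""
    for _ in range(max_steps):
        doc = simulated_search(question)
        if answer_keyword.lower() in doc.lower():
            return 1.0  # success reward for the RL update
    return 0.0  # failure reward
```

Because the environment is cheap and deterministic up to controlled noise, millions of such rollouts can be generated without the latency, cost, or instability of querying real search engines during training.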
