IQuest-Coder-V1 Technical Report

arXiv cs.AI / 3/18/2026

Key Points

  • The IQuest-Coder-V1 family introduces code LLMs (7B/14B/40B/40B-Loop) and a code-flow multi-stage training paradigm that models the evolving logic of software pipelines.
  • The training pipeline includes initial pre-training on code facts, repository data, and completion data; a mid-training stage that integrates reasoning and agentic trajectories at a 32k context length and repository-scale data at a 128k context length; and a post-training phase for specialized coding capabilities, split into a thinking path (reasoning-driven RL) and an instruct path (general assistance).
  • The IQuest-Coder-V1-Loop variant adds a recurrent mechanism to balance model capacity and deployment footprint, enabling an efficiency-focused deployment path.
  • The authors claim state-of-the-art performance in code intelligence across agentic software engineering, competitive programming, and complex tool use.
  • They release the complete white-box chain of checkpoints from pre-training bases to final thinking and instruction models, aiming to advance autonomous code intelligence and real-world agentic systems.

Abstract

In this report, we introduce the IQuest-Coder-V1 series (7B/14B/40B/40B-Loop), a new family of code large language models (LLMs). Moving beyond static code representations, we propose the code-flow multi-stage training paradigm, which captures the dynamic evolution of software logic through the phases of the pipeline. Our models are developed through an evolutionary pipeline, starting with initial pre-training on code facts, repository data, and completion data. We then apply a specialized mid-training stage that integrates reasoning and agentic trajectories at a 32k context length and repository-scale data at a 128k context length to forge deep logical foundations. The models are finalized with post-training for specialized coding capabilities, bifurcated into two specialized paths: the thinking path (utilizing reasoning-driven RL) and the instruct path (optimized for general assistance). IQuest-Coder-V1 achieves state-of-the-art performance among competitive models across critical dimensions of code intelligence: agentic software engineering, competitive programming, and complex tool use. To address deployment constraints, the IQuest-Coder-V1-Loop variant introduces a recurrent mechanism that optimizes the trade-off between model capacity and deployment footprint, offering an architecturally enhanced path for balancing efficacy and efficiency. We believe the release of the IQuest-Coder-V1 series, including the complete white-box chain of checkpoints from pre-training bases to the final thinking and instruction models, will advance research in autonomous code intelligence and real-world agentic systems.
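The report summarized above does not spell out how the Loop variant's recurrence works, but weight sharing across depth is one common way a recurrent mechanism can shrink deployment footprint: the same block is applied several times, so effective depth grows while the parameter count stays that of a single block. The toy sketch below illustrates that idea only; the block structure, names, and sizes are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # toy hidden size (illustrative, not from the paper)

# One shared block; a residual tanh MLP stands in for a transformer layer.
W1 = rng.normal(scale=0.1, size=(D, D))
W2 = rng.normal(scale=0.1, size=(D, D))

def shared_block(x):
    # Residual update: x + MLP(x). Placeholder for attention + FFN.
    return x + np.tanh(x @ W1) @ W2

def looped_forward(x, loops):
    # Apply the SAME weights `loops` times: deeper compute, same parameters.
    for _ in range(loops):
        x = shared_block(x)
    return x

x = rng.normal(size=(1, D))
y1 = looped_forward(x, loops=1)  # shallow pass
y4 = looped_forward(x, loops=4)  # 4x the depth, identical parameter count

params_shared = W1.size + W2.size            # one shared block: 2*D*D
params_unrolled = 4 * (W1.size + W2.size)    # 4 distinct layers would cost 4x
print(params_shared, params_unrolled)
```

The footprint saving is the point of the comparison at the end: four distinct layers would store four times the weights of the single looped block, while the looped forward pass still performs four layers' worth of compute.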