QuarkMedSearch: A Long-Horizon Deep Search Agent for Exploring Medical Intelligence

arXiv cs.AI / 4/15/2026

📰 NewsSignals & Early TrendsIdeas & Deep AnalysisModels & Research

共有:

Key Points

The paper introduces QuarkMedSearch, a long-horizon, agentic deep search model tailored to Chinese medical intelligence tasks built on Tongyi DeepResearch.
It proposes an end-to-end pipeline covering medical multi-hop data construction, a two-stage training approach (SFT followed by RL), and benchmark-based evaluation.
To mitigate medical deep-search data scarcity, the method combines a large medical knowledge graph with real-time online exploration to generate long-horizon training trajectories.
Training is designed to progressively improve planning, tool use, and reflection while preserving search efficiency.
The QuarkMedSearch Benchmark is created with medical experts and manual verification, and results show state-of-the-art performance among open-source models of similar scale while remaining competitive on general benchmarks.

Abstract

As agentic foundation models continue to evolve, how to further improve their performance in vertical domains has become an important challenge. To this end, building upon Tongyi DeepResearch, a powerful agentic foundation model, we focus on the Chinese medical deep search scenario and propose QuarkMedSearch, systematically exploring a full-pipeline approach spanning medical multi-hop data construction, training strategies, and evaluation benchmarks to further push and assess its performance upper bound in vertical domains. Specifically, for data synthesis, to address the scarcity of deep search training data in the medical domain, we combine a large-scale medical knowledge graph with real-time online exploration to construct long-horizon medical deep search training data; for post-training, we adopt a two-stage SFT and RL training strategy that progressively enhances the model's planning, tool invocation, and reflection capabilities required for deep search, while maintaining search efficiency; for evaluation, we collaborate with medical experts to construct the QuarkMedSearch Benchmark through rigorous manual verification. Experimental results demonstrate that QuarkMedSearch achieves state-of-the-art performance among open-source models of comparable scale on the QuarkMedSearch Benchmark, while also maintaining strong competitiveness on general benchmarks.

Black Hat Asia

AI Business

Are gamers being used as free labeling labor? The rise of "Simulators" that look like AI training grounds [D]

Reddit r/MachineLearning

I built a trading intelligence MCP server in 2 days — here's how

Dev.to

Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.

Dev.to

Qwen3.5-35B running well on RTX4060 Ti 16GB at 60 tok/s

Reddit r/LocalLLaMA

QuarkMedSearch: A Long-Horizon Deep Search Agent for Exploring Medical Intelligence

Key Points

Abstract

Related Articles

Black Hat Asia

Are gamers being used as free labeling labor? The rise of "Simulators" that look like AI training grounds [D]

I built a trading intelligence MCP server in 2 days — here's how

Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.

Qwen3.5-35B running well on RTX4060 Ti 16GB at 60 tok/s

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer