WebExpert: domain-aware web agents with critic-guided expert experience for high-precision search

arXiv cs.AI / 4/10/2026

Key Points

  • WebExpert is proposed as a domain-aware web agent for specialized search tasks in areas like finance and biomedicine, aiming to reduce query drift, noisy evidence, and brittle reasoning through domain priors.
  • The system combines sentence-level experience retrieval with topic merging and rule distillation, plus a “schema-light” facet induction method that bootstraps time, region, policy, and industry facets from weak supervision rather than hand-written lexicons.
  • It uses preference-optimized planning that jointly improves query planning and retrieval via pairwise preference learning with a coverage-aware objective, and an inference-time experience gate that biases decoding toward relevant facets with fallback under low retrieval confidence.
  • Experiments on GAIA, GPQA, HLE, and WebWalkerQA show Answer Exact Match improvements of 1.5–3.6 percentage points over the strongest browsing baseline and fewer page hops, with ablations supporting the contributions of each component.
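
To make the third point concrete, here is a minimal sketch of what a pairwise preference loss with a coverage-aware term could look like. The summary does not give the actual objective, so everything here is an assumption for illustration: the logistic (Bradley-Terry style) pairwise loss, the additive coverage penalty, and the names `facet_coverage`, `preference_loss`, and `coverage_weight` are all hypothetical.

```python
import math
from typing import Set


def facet_coverage(plan_facets: Set[str], required_facets: Set[str]) -> float:
    """Fraction of required facets that a query plan touches (assumed definition)."""
    if not required_facets:
        return 1.0
    return len(plan_facets & required_facets) / len(required_facets)


def preference_loss(
    score_preferred: float,
    score_rejected: float,
    coverage: float,
    coverage_weight: float = 0.5,  # hypothetical trade-off weight
) -> float:
    """Logistic pairwise loss plus a penalty for facets the preferred plan misses."""
    x = score_rejected - score_preferred
    # Numerically stable softplus(x) = -log(sigmoid(score_preferred - score_rejected)).
    pairwise = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return pairwise + coverage_weight * (1.0 - coverage)


# Hypothetical example: the preferred plan covers two of four required facets.
coverage = facet_coverage(
    {"time", "region"}, {"time", "region", "policy", "industry"}
)
loss = preference_loss(score_preferred=1.0, score_rejected=0.0, coverage=coverage)
```

Under these assumptions the loss falls when the preferred plan outscores the rejected one and when it covers more of the required facets, which is one plausible reading of "coverage-aware".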

Abstract

Specialized web tasks in finance, biomedicine, and pharmaceuticals remain challenging due to missing domain priors: queries drift, evidence is noisy, and reasoning is brittle. We present WebExpert, a domain-aware web agent that we implement end-to-end, featuring: (i) sentence-level experience retrieval with topic merging and rule distillation, (ii) schema-light facet induction that bootstraps time, region, policy, and industry facets from weak supervision instead of static hand-written lexicons, and (iii) preference-optimized planning that jointly improves query planning and retrieval via pairwise preference learning alongside a coverage-aware objective. At inference, a lightweight experience gate biases decoding toward active facets, with fallback under low retrieval confidence. On GAIA, GPQA, HLE, and WebWalkerQA, WebExpert improves Answer Exact Match (EM) by 1.5–3.6 pp over the strongest browsing baseline and reduces page hops. Analysis shows consistent gains, with ablations on retrieval, topic merging, facet induction, and preference-aware training.
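
The inference-time experience gate can be pictured with a small sketch. The abstract does not describe the mechanism, so this is only one possible reading: an additive bias on candidate scores for terms tied to active facets, switched off when retrieval confidence drops below a cutoff. The function name `experience_gate`, the `threshold` and `bias` parameters, and the toy scores are all hypothetical.

```python
from typing import Dict, Set


def experience_gate(
    logits: Dict[str, float],
    active_facet_terms: Set[str],
    retrieval_confidence: float,
    threshold: float = 0.5,  # hypothetical confidence cutoff
    bias: float = 1.0,       # hypothetical bias strength
) -> Dict[str, float]:
    """Bias candidate scores toward active-facet terms, or fall back unchanged."""
    if retrieval_confidence < threshold:
        # Fallback: retrieved experience is not trusted, so decoding stays unbiased.
        return dict(logits)
    return {
        token: score + (bias if token in active_facet_terms else 0.0)
        for token, score in logits.items()
    }


# Toy candidate scores; "2023" and "EU" stand in for active time/region facets.
logits = {"2023": 0.2, "EU": 0.1, "banana": 0.3}
facets = {"2023", "EU"}

gated = experience_gate(logits, facets, retrieval_confidence=0.9)
fallback = experience_gate(logits, facets, retrieval_confidence=0.1)
```

With high retrieval confidence the facet-related candidates overtake the off-topic one; with low confidence the scores pass through untouched, which is the fallback behavior the abstract mentions.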