OneSearch-V2: The Latent Reasoning Enhanced Self-distillation Generative Search Framework

arXiv cs.CL / 3/26/2026


Key Points

  • OneSearch-V2 is a proposed upgrade to the OneSearch generative retrieval framework, aiming to improve complex query understanding, make better use of latent user intent, and reduce overfitting to narrow historical preferences.
  • The approach introduces a thought-augmented query understanding module, a reasoning-internalized self-distillation training pipeline for uncovering precise e-commerce intents, and a behavior preference alignment system to reduce reward hacking from single-metric optimization.
  • Offline experiments report strong gains in query recognition and user profiling quality, while online A/B tests show measurable business lift (+3.98% item CTR, +3.05% buyer conversion, +2.11% order volume).
  • Manual evaluation indicates improved user-facing search quality (+1.65% page good rate, +1.37% query-item relevance), and the method reportedly mitigates issues like information bubbles and long-tail sparsity without increasing inference cost or latency.
  • Overall, the paper frames OneSearch-V2 as an efficiency-conscious, training-focused generative search improvement rather than a heavier inference-time model change.

Abstract

Generative Retrieval (GR) has emerged as a promising paradigm for modern search systems. Compared to multi-stage cascaded architectures, it offers advantages such as end-to-end joint optimization and high computational efficiency. OneSearch, a representative generative search framework deployed at industrial scale, has brought significant commercial and operational benefits. However, its inadequate understanding of complex queries, inefficient exploitation of latent user intents, and overfitting to narrow historical preferences have limited further performance gains. To address these challenges, we propose OneSearch-V2, a latent reasoning enhanced self-distillation generative search framework. It contains three key innovations: (1) a thought-augmented complex query understanding module, which enables deep query understanding and overcomes the shallow semantic matching limitations of direct inference; (2) a reasoning-internalized self-distillation training pipeline, which uncovers users' potential yet precise e-commerce intentions beyond log-fitting through implicit in-context learning; (3) a behavior preference alignment optimization system, which mitigates reward hacking arising from the single conversion metric, and addresses personal preference via direct user feedback. Extensive offline evaluations demonstrate OneSearch-V2's strong query recognition and user profiling capabilities. Online A/B tests further validate its business effectiveness, yielding +3.98% item CTR, +3.05% buyer conversion rate, and +2.11% order volume. Manual evaluation further confirms gains in search experience quality, with +1.65% in page good rate and +1.37% in query-item relevance. More importantly, OneSearch-V2 effectively mitigates common search system issues such as information bubbles and long-tail sparsity, without incurring additional inference costs or serving latency.
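
The abstract does not spell out the self-distillation objective, so the following is only a generic sketch of how reasoning could be internalized through distillation, not the paper's actual method. It assumes a teacher pass that scores candidate items with the query plus a generated thought in context, and a student pass that sees the query alone; the student is trained to match the teacher's distribution with a KL term. The function names, the `temperature` parameter, and the example logits are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def self_distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """Hypothetical distillation loss.

    teacher_logits: scores from a pass that sees query + thought (assumed).
    student_logits: scores from a pass that sees the query only (assumed).
    Returns KL(teacher || student) on temperature-scaled distributions.
    """
    teacher = softmax([x / temperature for x in teacher_logits])
    student = softmax([x / temperature for x in student_logits])
    return kl_divergence(teacher, student)


# Made-up logits over three candidate items, for illustration only.
with_thought = [2.0, 0.5, 0.1]   # teacher: query + thought in context
query_only = [0.1, 0.5, 2.0]     # student: query alone
loss = self_distillation_loss(with_thought, query_only)
```

Under these assumptions, only the student pass would run at serving time, which is consistent with the abstract's claim of no additional inference cost or serving latency.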