Quality Over Clicks: Intrinsic Quality-Driven Iterative Reinforcement Learning for Cold-Start E-Commerce Query Suggestion
arXiv cs.CL / 3/25/2026
Key Points
- The paper addresses cold-start e-commerce Query Suggestion (EQS), where existing LLM+CTR approaches underperform because there is too little click data to train the CTR models.
- It proposes Cold-EQS, an iterative reinforcement learning framework that optimizes suggestion quality using rewards based on answerability, factuality, and information gain.
- Cold-EQS selects hard and ambiguous samples by estimating uncertainty over groups of candidate suggested queries, which lets it learn from user queries that have no click signals.
- The authors introduce the EQS-Benchmark with 16,949 online user queries for offline training and evaluation, supporting reproducible experiments.
- Experiments show a positive correlation between offline and online effectiveness and report a significant +6.81% improvement in online chatUV versus prior approaches.
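
To make the reward and sample-selection ideas in the key points concrete, here is a minimal Python sketch. It is not the paper's method: the summary does not say how answerability, factuality, and information gain are combined or how uncertainty is estimated. The weighted sum, the weights, the use of standard deviation as the uncertainty measure, the threshold, and all function names are assumptions made for illustration.

```python
from statistics import pstdev

# Hypothetical weights; the summary does not state how the three signals are combined.
WEIGHTS = {"answerability": 0.4, "factuality": 0.4, "information_gain": 0.2}


def quality_reward(scores):
    """Weighted sum of three per-suggestion quality scores, each assumed to lie in [0, 1]."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def group_uncertainty(group):
    """Assumed uncertainty proxy: spread (population std dev) of rewards within one group."""
    return pstdev([quality_reward(scores) for scores in group])


def select_hard_queries(candidates_by_query, threshold=0.1):
    """Keep queries whose candidate group has a reward spread at or above the threshold."""
    return [
        query
        for query, group in candidates_by_query.items()
        if group_uncertainty(group) >= threshold
    ]


# Toy data: each user query maps to a group of scored candidate suggestions.
candidates = {
    "wireless earbuds": [
        {"answerability": 0.9, "factuality": 0.9, "information_gain": 0.8},
        {"answerability": 0.2, "factuality": 0.3, "information_gain": 0.1},
    ],
    "phone case": [
        {"answerability": 0.8, "factuality": 0.8, "information_gain": 0.8},
        {"answerability": 0.8, "factuality": 0.8, "information_gain": 0.8},
    ],
}

hard = select_hard_queries(candidates)
```

In this toy data the candidates for "wireless earbuds" disagree strongly in quality, so that query is kept as a hard sample, while the two identical candidates for "phone case" have zero spread and the query is skipped. No click data is used at any point, which is the property the key points emphasize.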