Multi-Faceted Self-Consistent Preference Alignment for Query Rewriting in Conversational Search

arXiv cs.CL / 4/9/2026


Key Points

  • The paper addresses Conversational Query Rewriting (CQR) by arguing that optimizing rewrites in isolation is insufficient because rewrites should account for downstream effects on retrieval and response generation.
  • It introduces MSPA-CQR, which builds self-consistent preference-alignment data across three dimensions—rewriting, passage retrieval, and response—to produce more diverse rewritten queries.
  • The method trains with "prefix-guided multi-faceted direct preference optimization," which learns and reconciles preference signals from the three dimensions.
  • Experiments reported in the abstract indicate the approach improves CQR performance in both in-distribution and out-of-distribution settings, suggesting better robustness.
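
The paper's exact objective is not given in this summary, but the idea of combining per-facet preference signals via direct preference optimization (DPO) can be sketched as below. The facet names come from the paper; the loss-combination scheme (a weighted average of standard DPO losses) and all log-probability values are illustrative assumptions, not the authors' formulation.

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for one preference pair:
    -log sigmoid(beta * ((logp_w - ref_w) - (logp_l - ref_l)))."""
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    return math.log(1.0 + math.exp(-margin))  # == -log sigmoid(margin)

def multi_faceted_dpo_loss(pairs_by_facet, weights=None, beta=0.1):
    """Hypothetical multi-faceted combination: a weighted average of
    per-facet DPO losses. pairs_by_facet maps a facet name to
    (logp_chosen, logp_rejected, ref_chosen, ref_rejected)."""
    if weights is None:
        weights = {f: 1.0 / len(pairs_by_facet) for f in pairs_by_facet}
    return sum(weights[f] * dpo_loss(*p, beta=beta)
               for f, p in pairs_by_facet.items())

# Toy policy/reference log-probabilities for the three facets named in the paper.
pairs = {
    "rewriting": (-2.0, -3.0, -2.5, -2.5),
    "retrieval": (-1.5, -2.5, -2.0, -2.0),
    "response":  (-2.2, -2.8, -2.4, -2.4),
}
loss = multi_faceted_dpo_loss(pairs, beta=0.1)
```

With a zero margin the per-pair loss reduces to log 2, and a larger chosen-vs-rejected margin drives it toward zero, so each facet pushes the policy toward rewrites that its signal (rewrite quality, retrieval, or response) prefers.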

Abstract

Conversational Query Rewriting (CQR) aims to rewrite ambiguous queries to enable more efficient conversational search. Early studies have predominantly focused on rewriting in isolation, ignoring feedback from query rewriting, passage retrieval, and response generation during the rewriting process. To address this issue, we propose Multi-Faceted Self-Consistent Preference-Aligned CQR (MSPA-CQR). Specifically, we first construct self-consistent preference-alignment data from three dimensions (rewriting, retrieval, and response) to generate more diverse rewritten queries. We then propose prefix-guided multi-faceted direct preference optimization to learn preference information from the three dimensions. Experimental results show that MSPA-CQR is effective in both in-distribution and out-of-distribution scenarios.
