Interplay: Training Independent Simulators for Reference-Free Conversational Recommendation

arXiv cs.AI / 3/20/2026

Key Points

  • The paper proposes a reference-free simulation framework for training conversational recommender systems using two independent LLMs, one acting as the user and one as the recommender, interacting in real time.
  • These models operate without access to predetermined target items, instead using preference summaries and target attributes to enable the recommender to infer user preferences through dialogue.
  • The approach yields more realistic and diverse conversations that better reflect authentic human-AI interactions and offers a scalable method for generating high-quality CRS data.
  • Quantitative and human evaluations show that the reference-free simulators match or exceed existing methods in quality and effectiveness.

Abstract

Training conversational recommender systems (CRS) requires extensive dialogue data, which is challenging to collect at scale. To address this, researchers have used simulated user-recommender conversations. Traditional simulation approaches often use a single large language model (LLM) that generates entire conversations with prior knowledge of the target items, leading to scripted and artificial dialogues. We propose a reference-free simulation framework that trains two independent LLMs, one as the user and one as the conversational recommender. These models interact in real time without access to predetermined target items, relying instead on preference summaries and target attributes, which lets the recommender genuinely infer user preferences through dialogue. This approach produces more realistic and diverse conversations that closely mirror authentic human-AI interactions. Our reference-free simulators match or exceed existing methods in quality, while offering a scalable solution for generating high-quality conversational recommendation data without constraining conversations to pre-defined target items. We conduct both quantitative and human evaluations to confirm the effectiveness of our reference-free approach.
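
The two-agent interaction described in the abstract can be pictured with a small sketch. This is not the paper's implementation: the trained LLMs are replaced by rule-based stand-ins, and every name here (`UserProfile`, `stub_user`, `stub_recommender`, `simulate`) is hypothetical. The abstract does not spell out which model receives the preference summary and target attributes, so the sketch assumes the user simulator holds them while the recommender sees only the dialogue history.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (speaker, utterance)


@dataclass
class UserProfile:
    """Hypothetical container: no target item, only a summary and attributes."""
    preference_summary: str
    target_attributes: List[str]


def stub_user(profile: UserProfile, history: List[Turn]) -> str:
    """Stand-in for the user LLM: reveals one not-yet-mentioned attribute per turn."""
    said = " ".join(text for speaker, text in history if speaker == "user")
    for attr in profile.target_attributes:
        if attr not in said:
            return f"I'd like something {attr}."
    return "That sounds right, thanks!"


def stub_recommender(history: List[Turn]) -> str:
    """Stand-in for the recommender LLM: it receives the dialogue only, never the profile."""
    user_turns = [text for speaker, text in history if speaker == "user"]
    return f"Noted ({len(user_turns)} preference(s) so far). Anything else?"


def simulate(
    profile: UserProfile,
    user_fn: Callable[[UserProfile, List[Turn]], str],
    rec_fn: Callable[[List[Turn]], str],
    max_turns: int = 4,
) -> List[Turn]:
    """Alternate the two independent agents turn by turn and return the transcript."""
    history: List[Turn] = []
    for _ in range(max_turns):
        history.append(("user", user_fn(profile, history)))
        history.append(("recommender", rec_fn(history)))
    return history


if __name__ == "__main__":
    profile = UserProfile("likes thoughtful films", ["sci-fi", "slow-paced"])
    for speaker, text in simulate(profile, stub_user, stub_recommender, max_turns=3):
        print(f"{speaker}: {text}")
```

The point of the separation is that `rec_fn` is called with the history alone, so anything the recommender knows about the user has to come from what was said in the conversation. In a real setup, each stub would be swapped for a call to a separately trained model.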