User Preference Modeling for Conversational LLM Agents: Weak Rewards from Retrieval-Augmented Interaction
arXiv cs.CL / 3/24/2026
Key Points
- The paper introduces VARS (Vector-Adapted Retrieval Scoring), a pipeline-agnostic framework that builds persistent user preference representations using long-term and short-term vectors to bias retrieval in conversational LLM agents.
- VARS updates user preference vectors online using only weak scalar feedback signals, avoiding per-user fine-tuning while still enabling personalization across sessions.
- Experiments on the MultiSessionCollab benchmark for math and code tasks show that user-aware retrieval primarily improves interaction efficiency—such as reduced timeouts and lower user effort—rather than delivering major raw accuracy gains under frozen LLM backbones.
- The proposed dual-vector design is evaluated as interpretable, with long-term vectors reflecting cross-user preference overlap and short-term vectors adapting to session-specific behavior.
- The authors release code, models, and data via a linked GitHub repository, supporting reproducibility and further development of preference-aware retrieval methods.
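The dual-vector update described above can be sketched in a few lines. The class below is a hypothetical illustration, not the paper's actual implementation: names like `UserPreferenceProfile`, the learning rates, and the additive scoring bias are all assumptions. It shows the core idea of maintaining a slow long-term vector and a fast decaying short-term vector, both nudged online by a weak scalar reward, and using them to bias a base retrieval similarity without any per-user fine-tuning.

```python
import numpy as np

class UserPreferenceProfile:
    """Hypothetical sketch of a dual-vector user profile in the spirit
    of VARS: a slowly updated long-term vector that persists across
    sessions, and a quickly adapting short-term vector reset per
    session. Both are updated from a weak scalar feedback signal."""

    def __init__(self, dim, lt_lr=0.01, st_lr=0.2, st_decay=0.9):
        self.long_term = np.zeros(dim)   # persists across sessions
        self.short_term = np.zeros(dim)  # session-specific behavior
        self.lt_lr = lt_lr               # small step: stable preferences
        self.st_lr = st_lr               # large step: fast adaptation
        self.st_decay = st_decay         # forgetting factor within a session

    def update(self, doc_embedding, reward):
        # reward is a weak scalar in [-1, 1] (e.g., implicit feedback);
        # no gradient through the frozen LLM backbone is required
        self.long_term += self.lt_lr * reward * doc_embedding
        self.short_term = (self.st_decay * self.short_term
                           + self.st_lr * reward * doc_embedding)

    def score(self, base_similarity, doc_embedding, alpha=0.5, beta=0.5):
        # bias the retriever's base query-document similarity with
        # both preference vectors (weights alpha/beta are assumptions)
        bias = (alpha * self.long_term + beta * self.short_term) @ doc_embedding
        return base_similarity + bias

    def new_session(self):
        # short-term preferences do not carry across sessions
        self.short_term[:] = 0.0
```

Under this sketch, positive feedback on a retrieved document pulls both vectors toward its embedding, so similar documents score higher in later turns; the long-term vector accumulates this signal across sessions while the short-term vector tracks the current one.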