From Recency Bias to Stable Convergence: Block Kaczmarz Methods for Online Preference Learning in Matchmaking Applications

arXiv cs.LG / 4/14/2026


Key Points

  • The paper introduces Kaczmarz-based online preference learning algorithms designed for real-time personalized matchmaking in reciprocal recommender systems.
  • It shows that post-step L2 normalization used in Kaczmarz-inspired online learners creates an exponential recency bias, making older interactions effectively vanish after only a small number of swipes.
  • To address this, the authors replace the normalization step with a Tikhonov-regularized projection denominator that bounds step size analytically while preserving interaction history.
  • The work further proposes an adaptive variant for cases where candidate tag vectors are not pre-normalized, producing per-candidate step sizes via the ||a||^2 + alpha denominator.
  • In a large-scale simulation (6,400 swipes), BlockNK is reported to achieve the best preference alignment and direction stability under label noise; candidate filtering improves asymptotic alignment but may introduce a feedback-loop risk.
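The core contrast in the key points above can be sketched in a few lines. This is an illustrative reconstruction, not the paper's reference implementation: the function names are ours, and the updates follow the standard Kaczmarz form with the modifications the summary describes (post-step L2 normalization versus a Tikhonov-regularized denominator ||a||^2 + alpha).

```python
import numpy as np

def kaczmarz_step_normalized(w, a, y, eta=0.5):
    """Kaczmarz-style update followed by L2 normalization.

    The renormalization shrinks the contribution of past steps, so the
    influence of the t-th interaction decays like eta**(n - t); at
    eta = 0.5 that is 0.5**20 (about 1e-6) after just 20 swipes.
    """
    w = w + eta * (y - a @ w) * a / (a @ a)
    return w / np.linalg.norm(w)

def kaczmarz_step_tikhonov(w, a, y, alpha=1.0):
    """Tikhonov-regularized projection (no renormalization).

    The denominator ||a||^2 + alpha bounds the step size analytically
    and, when tag vectors are not pre-normalized, yields a genuinely
    per-candidate step size -- unlike a fixed learning rate.
    """
    return w + (y - a @ w) * a / (a @ a + alpha)
```

Because the Tikhonov variant never renormalizes, each interaction's contribution persists in `w` rather than being exponentially discounted by later steps.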

Abstract

We present a family of Kaczmarz-based preference learning algorithms for real-time personalized matchmaking in reciprocal recommender systems. Post-step L2 normalization, common in Kaczmarz-inspired online learners, induces exponential recency bias: the influence of the t-th interaction decays as eta^(n - t), reaching approximately 1e-6 after just 20 swipes at eta = 0.5. We resolve this by replacing the normalization step with a Tikhonov-regularized projection denominator that bounds step size analytically without erasing interaction history. When candidate tag vectors are not pre-normalized, as in realistic deployments where candidates vary in tag density, the Tikhonov denominator ||a||^2 + alpha produces genuinely per-candidate adaptive step sizes, making it structurally distinct from online gradient descent with any fixed learning rate. We further derive a block variant that processes full swipe sessions as a single Gram matrix solve. Population-scale simulation over 6,400 swipes reveals that Block Normalized Kaczmarz (BlockNK), which combines the batch Gram solve with post-session L2 normalization, achieves the highest preference alignment (Align@20 = 0.698), the strongest inter-session direction stability (delta = 0.994), and the flattest degradation profile under label noise across flip ratios p_flip in [0.10, 0.35]. Experiments under cosine similarity subsampling further show that adaptively filtering the candidate pool toward the current preference direction substantially improves asymptotic alignment, at the cost of introducing a feedback loop that may slow recovery from miscalibration. The sequential Tikhonov-Kaczmarz method performs comparably to K-NoNorm under our simulation conditions, suggesting the dominant practical gain over normalized Kaczmarz is the removal of per-step normalization rather than the Tikhonov constant alpha itself.
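The block variant described in the abstract processes a full swipe session as a single regularized Gram-matrix solve, with one L2 normalization applied after the session rather than after every step. The sketch below is our reading of that description, assuming the standard block Kaczmarz form; the function name, the session shapes, and the small ridge constant are illustrative choices, not the paper's.

```python
import numpy as np

def block_nk_session(w, A, y, alpha=1e-6):
    """One BlockNK-style session update (illustrative sketch).

    A : (m, d) array of tag vectors for the m swipes in the session
    y : (m,)  array of swipe labels
    The session residual is resolved through a regularized Gram solve,
    then the preference vector is normalized once, post-session.
    """
    G = A @ A.T + alpha * np.eye(len(y))      # regularized Gram matrix
    w = w + A.T @ np.linalg.solve(G, y - A @ w)
    return w / np.linalg.norm(w)              # post-session L2 normalization
```

Solving the m-by-m Gram system once per session fits all rows of the session jointly, instead of projecting onto them one at a time, which is consistent with the inter-session direction stability the abstract reports for BlockNK.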