Working Notes on Late Interaction Dynamics: Analyzing Targeted Behaviors of Late Interaction Models

arXiv cs.AI / 3/30/2026

Key Points

  • The paper examines little-studied dynamics of Late Interaction retrieval models, focusing on the length bias that arises from multi-vector scoring and on the similarity distribution beyond the best scores pooled by the MaxSim operator.
  • Experiments on state-of-the-art models using the NanoBEIR benchmark show that the theoretical length bias of causal Late Interaction models holds in practice.
  • It also finds that bi-directional models can experience length bias in extreme cases, indicating the issue is broader than causal variants alone.
  • The authors report no significant similarity trend beyond the top-1 document token, which they take as validation that the MaxSim operator efficiently exploits token-level similarity scores.

Abstract

While Late Interaction models exhibit strong retrieval performance, many of their underlying dynamics remain understudied, potentially hiding performance bottlenecks. In this work, we focus on two topics in Late Interaction retrieval: a length bias that arises when using multi-vector scoring, and the similarity distribution beyond the best scores pooled by the MaxSim operator. We analyze these behaviors for state-of-the-art models on the NanoBEIR benchmark. Results show that while the theoretical length bias of causal Late Interaction models holds in practice, bi-directional models can also suffer from it in extreme cases. We also note that no significant similarity trend lies beyond the top-1 document token, validating that the MaxSim operator efficiently exploits the token-level similarity scores.
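
For readers unfamiliar with the operator, the following is a minimal sketch of MaxSim scoring as it is commonly defined for Late Interaction retrieval: each query token keeps only its best cosine similarity over the document tokens, and those maxima are summed. The function names, the toy 2-dimensional vectors, and the short/long document pair are all hypothetical and are not taken from the paper. The sketch also shows one general property of the max: appending tokens to a document can never lower its MaxSim score. Whether this is the mechanism behind the length bias the authors analyze is an assumption, not something the abstract states.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs, doc_vecs):
    """Sum, over query tokens, of the best similarity to any document token."""
    q = [normalize(v) for v in query_vecs]
    d = [normalize(v) for v in doc_vecs]
    return sum(max(dot(qv, dv) for dv in d) for qv in q)


def ranked_similarities(query_vec, doc_vecs):
    """Similarities of one query token to every document token, best first.

    Index 0 is the value MaxSim keeps; the rest is the 'beyond top-1' tail.
    """
    qv = normalize(query_vec)
    return sorted((dot(qv, normalize(dv)) for dv in doc_vecs), reverse=True)


# Toy data (hypothetical): two query tokens, a one-token document, and a
# longer document that contains the short one plus two extra tokens.
query = [[1.0, 0.0], [0.0, 1.0]]
short_doc = [[1.0, 0.0]]
long_doc = short_doc + [[0.0, 1.0], [1.0, 1.0]]

short_score = maxsim_score(query, short_doc)
long_score = maxsim_score(query, long_doc)
tail = ranked_similarities(query[0], long_doc)
```

With these toy vectors the short document matches only the first query token, while the longer document offers a perfect match for both, so `long_score` is higher than `short_score`. The `tail` list is the kind of per-token ranking one would inspect to look for a similarity trend beyond the top-1 document token.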