AI Navigate

Language Model Maps for Prompt-Response Distributions via Log-Likelihood Vectors

arXiv cs.CL · March 20, 2026


Key Points

  • The paper proposes representing language models by log-likelihood vectors over prompt-response pairs to compare their conditional distributions (a minimal sketch follows this list).
  • It shows that distances between models in this space approximate the KL divergence between the corresponding conditional distributions.
  • Experiments on a large collection of publicly available language models demonstrate that the maps reveal meaningful global structure that relates to model attributes and task performance.
  • The approach captures systematic shifts induced by prompt modifications and shows that these shifts compose approximately additively, enabling prediction of the effects of composite prompts.
  • It introduces PMI vectors to reduce the influence of unconditional distributions; the resulting maps sometimes better reflect training-data differences and aid analysis of input-dependent model behavior.
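To make the representation concrete, here is a minimal sketch of how such log-likelihood vectors might be computed with Hugging Face transformers. The helper names (`response_log_likelihood`, `log_likelihood_vector`), the toy `pairs`, and the choice of gpt2/distilgpt2 are illustrative assumptions, not the paper's setup; the recipe simply scores log p(response | prompt) once per pair per model and stacks the scalars. The KL connection plausibly rests on the standard identity KL(p‖q) = E_{y∼p}[log p(y|x) − log q(y|x)], which differences of log-likelihood vectors over shared pairs estimate, though the paper's exact construction may differ.

```python
import numpy as np
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def response_log_likelihood(model, tokenizer, prompt: str, response: str) -> float:
    """Sum log p(token | prompt, previous response tokens) over the
    response tokens. Simplified: assumes the prompt's tokenization is a
    prefix of the full text's tokenization, which usually holds on clean
    token boundaries but is not guaranteed for every tokenizer."""
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(prompt + response, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits          # (1, T, vocab)
    log_probs = torch.log_softmax(logits, dim=-1)
    total = 0.0
    for t in range(prompt_len, full_ids.shape[1]):
        # the token at position t is predicted from position t - 1
        total += log_probs[0, t - 1, full_ids[0, t]].item()
    return total

def log_likelihood_vector(model_name: str, pairs) -> np.ndarray:
    """One scalar log-likelihood per prompt-response pair, stacked."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).eval()
    return np.array([response_log_likelihood(model, tokenizer, p, r)
                     for p, r in pairs])

# Toy pairs for illustration; the paper uses a fixed shared set.
pairs = [("Q: What is 2 + 2? A:", " 4"),
         ("Translate 'cat' to French:", " chat")]

vectors = {name: log_likelihood_vector(name, pairs)
           for name in ("gpt2", "distilgpt2")}   # any causal LMs work

# Distance on the model map; per the paper, such distances approximate
# the KL divergence between the models' conditional distributions.
distance = np.linalg.norm(vectors["gpt2"] - vectors["distilgpt2"])
print(f"map distance: {distance:.3f}")
```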

Abstract

We propose a method that represents language models by log-likelihood vectors over prompt-response pairs and constructs model maps for comparing their conditional distributions. In this space, distances between models approximate the KL divergence between the corresponding conditional distributions. Experiments on a large collection of publicly available language models show that the maps capture meaningful global structure, including relationships to model attributes and task performance. The method also captures systematic shifts induced by prompt modifications and their approximate additive compositionality, suggesting a way to analyze and predict the effects of composite prompt operations. We further introduce pointwise mutual information (PMI) vectors to reduce the influence of unconditional distributions; in some cases, PMI-based model maps better reflect training-data-related differences. Overall, the framework supports the analysis of input-dependent model behavior.
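Both the PMI construction and the additivity claim reduce to simple vector arithmetic on these representations. The sketch below uses synthetic stand-in vectors (random numpy arrays) rather than real model scores, and the modification labels are invented; in particular, how the paper scores the unconditional distribution log p(y) is an assumption here.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 512  # number of shared prompt-response pairs (illustrative)

# Synthetic stand-ins for log-likelihood vectors; in practice these
# come from scoring real models as in the earlier sketch.
v_base   = rng.normal(size=n)              # base prompt
v_uncond = rng.normal(size=n)              # log p(response), no prompt
shift_a  = rng.normal(0.10, 0.05, size=n)  # effect of modification A
shift_b  = rng.normal(-0.20, 0.05, size=n) # effect of modification B
v_mod_a  = v_base + shift_a
v_mod_b  = v_base + shift_b

def pmi_vector(conditional: np.ndarray, unconditional: np.ndarray) -> np.ndarray:
    """PMI = log p(y|x) - log p(y): subtracting the unconditional
    log-likelihoods damps the component every prompt shares."""
    return conditional - unconditional

# Approximate additive compositionality: each modification induces a
# roughly constant shift of the vector, so the effect of a composite
# prompt is predicted by summing the individual shifts.
delta_a = v_mod_a - v_base
delta_b = v_mod_b - v_base
v_composite_pred = v_base + delta_a + delta_b

# A synthetic "measured" composite with small interaction noise, to
# show how such a prediction would be checked against observation:
v_composite_true = v_base + shift_a + shift_b + rng.normal(0, 0.02, size=n)
print("per-pair prediction error:",
      np.linalg.norm(v_composite_pred - v_composite_true) / np.sqrt(n))
```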