Investigating More Explainable and Partition-Free Compositionality Estimation for LLMs: A Rule-Generation Perspective

arXiv cs.AI / 5/1/2026


Key Points

  • The paper argues that existing compositional generalization tests mainly measure output correctness and thus offer limited explainability about what the model actually learned.
  • It highlights that current test-set construction often depends on dataset partitioning, which can introduce combination leakage when “unseen” combinations are still indirectly revealed.
  • The authors propose a rule-generation perspective where the LLM generates a program-like set of rules for dataset mapping, enabling complexity-theory-based compositionality estimates.
  • Experiments on a string-to-grid task using this framework reveal that different advanced LLMs exhibit distinct compositionality profiles, including multiple compositionality deficiencies.
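The framework above can be sketched concretely. The snippet below is a minimal illustration, not the authors' actual pipeline: the task format, the wrap-into-rows rule, and the use of program length as a complexity proxy are all assumptions made for exposition. An LLM-generated "rule program" is checked against a toy string-to-grid dataset, and its source length stands in for a complexity-theoretic compositionality estimate (shorter rules that still fit the data suggest more rule-like, compositional understanding).

```python
# Hypothetical illustration of the rule-generation perspective:
# the rule program below plays the role of LLM-generated output.
RULE_SRC = """
def rule_program(s, width=3):
    # Assumed toy rule: wrap the string row-major into a grid of rows.
    return [list(s[i:i + width]) for i in range(0, len(s), width)]
"""

namespace = {}
exec(RULE_SRC, namespace)          # compile the generated rules into a callable
rule_program = namespace["rule_program"]

# Toy string-to-grid dataset (assumed format, for illustration only).
dataset = {
    "abcdef": [["a", "b", "c"], ["d", "e", "f"]],
    "xyz":    [["x", "y", "z"]],
}

# Does the generated rule reproduce the dataset mapping?
fits = all(rule_program(x) == y for x, y in dataset.items())

# Crude complexity proxy: length of the rule program's source text.
complexity = len(RULE_SRC)

print(f"rules fit dataset: {fits}, complexity proxy: {complexity}")
```

Because the rules are an explicit program rather than opaque predictions, they can be inspected directly, and no train/test partition of "unseen combinations" is needed, which is how the approach sidesteps combination leakage.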

Abstract

Compositional generalization tests are often used to estimate the compositionality of LLMs. However, such tests have two limitations: (1) they focus only on output results, without considering the LLM's understanding of sample compositionality, resulting in explainability defects; (2) they rely on dataset partitioning to build a test set of combinations unseen during training, and so suffer from combination-leakage issues. In this work, we propose a novel rule-generation perspective for estimating the compositionality of LLMs. It requires an LLM to generate a program as rules for dataset mapping, and estimates the LLM's compositionality using complexity-based theory. This perspective addresses the limitations of compositional generalization tests and provides a new way to analyze how LLMs' compositionality is characterized. Based on this perspective, we conduct experiments and analysis of existing advanced LLMs on a string-to-grid task, and find that they exhibit various compositionality characterizations and compositionality deficiencies.