Repetition Without Exclusivity: Scale Sensitivity of Referential Mechanisms in Child-Scale Language Models
arXiv cs.CL / 3/17/2026
Key Points
- The study provides the first systematic evaluation of mutual exclusivity (ME) in text-only language models trained on child-directed speech, using referential suppression in two-referent contexts as the operational metric (a minimal probe sketch follows this list).
- In pilot experiments, a masked language model (BabyBERTa) is insensitive to multi-sentence referential context, while autoregressive models show anti-ME repetition priming when familiar nouns are relabelled.
- A context-dependence diagnostic reveals that apparent ME-like patterns with nonce tokens are fully explained by embedding similarity rather than genuine referential disambiguation (see the embedding-similarity sketch below).
- In a confirmatory experiment with 45 GPT-2-architecture models, anti-ME repetition priming is significant in every condition; priming attenuates as language-modelling quality improves but never disappears; and the diagnostic replicates across all cells, indicating robust repetition-based reference tracking driven by distributional learning.
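
To make the two-referent suppression metric concrete, here is a minimal sketch of how such a probe might be scored, assuming a HuggingFace GPT-2 checkpoint. The stimuli, the `label_logprob` helper, and the nonce word "dax" are illustrative assumptions, not the paper's actual materials or code.

```python
# Sketch of a two-referent ME probe; stimuli and helper names are
# illustrative assumptions, not the paper's materials.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def label_logprob(context: str, label: str) -> float:
    """Summed log-probability of `label` as a continuation of `context`."""
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    lab_ids = tokenizer(" " + label, return_tensors="pt").input_ids
    ids = torch.cat([ctx_ids, lab_ids], dim=1)
    with torch.no_grad():
        logits = model(ids).logits
    # Row i of logits predicts token i+1; take log-probs over the vocab.
    logp = torch.log_softmax(logits[0, :-1], dim=-1)
    n = lab_ids.shape[1]
    targets = ids[0, -n:]          # the label's subword tokens
    rows = logp[-n:]               # the positions that predict them
    return rows.gather(1, targets.unsqueeze(1)).sum().item()

# Two-referent context: a familiar object and a novel one.
context = "There is a ball and a dax on the table. Look at the"
familiar = label_logprob(context, "ball")
novel = label_logprob(context, "dax")
# An ME-consistent learner suppresses the repeated familiar label
# relative to the novel one; anti-ME repetition priming is the reverse.
print(f"log P(ball) = {familiar:.2f}, log P(dax) = {novel:.2f}")
```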
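The context-dependence diagnostic can likewise be approximated by asking whether a nonce token's behaviour tracks raw similarity in the model's input embedding space rather than the referential context. Again a hedged sketch under the same assumptions; the `embedding_similarity` helper and the word list are hypothetical.

```python
# Sketch of an embedding-similarity control: if apparent ME-like
# suppression tracks this score, the effect is lexical, not contextual.
# Checkpoint and word choices are illustrative assumptions.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
emb = model.get_input_embeddings().weight  # [vocab_size, hidden_dim]

def embedding_similarity(word_a: str, word_b: str) -> float:
    """Cosine similarity between mean subword embeddings of two words."""
    def vec(w: str) -> torch.Tensor:
        ids = tokenizer(" " + w, return_tensors="pt").input_ids[0]
        return emb[ids].mean(dim=0)
    return torch.cosine_similarity(vec(word_a), vec(word_b), dim=0).item()

for nonce in ["dax", "blicket", "toma"]:
    print(nonce, round(embedding_similarity(nonce, "ball"), 3))
```

Regressing suppression scores on a similarity measure of this kind is one way to test whether the context-dependent component survives once lexical similarity is controlled.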