What can LLMs tell us about the mechanisms behind polarity illusions in humans? Experiments across model scales and training steps
arXiv cs.CL / 3/31/2026
💬 Opinion / Ideas & Deep Analysis / Models & Research
Key Points
- The paper uses the Pythia scaling suite to test whether two polarity illusions observed in humans also emerge in LLMs across model sizes and training steps: the NPI illusion (an unlicensed negative polarity item such as "ever" is fleetingly accepted when a nearby negation does not actually license it, as in "The bills that no senators voted for will ever become law") and the depth charge illusion (as in "No head injury is too trivial to be ignored", which readers interpret as a sensible warning despite its incoherent literal meaning). A minimal surprisal-probe sketch follows this list.
- It finds that the NPI illusion weakens and eventually disappears as model scale increases, while the depth charge illusion strengthens in larger models.
- The authors argue that these patterns weaken the case for positing human "rational inference" mechanisms that repair ill-formed sentences into well-formed ones, since LLMs exhibit the illusions yet cannot plausibly carry out such implicit reasoning during next-token prediction.
- Instead, the results suggest that shallow, “good enough” processing and/or partial grammaticalization of prescriptively ungrammatical structures could explain the illusions in both models and humans.
- The study proposes a unifying theoretical synthesis grounded in construction grammar to relate these mechanisms across the different illusion types.
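To make the experimental setup concrete, here is a minimal sketch (not the paper's released code) of how such a probe can be run with Hugging Face transformers: compare a Pythia checkpoint's surprisal at the NPI "ever" across licensing conditions. The Pythia suite does expose intermediate training steps as branch revisions on the Hub, but the particular model size, revision, and stimuli below are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch of an NPI-illusion surprisal probe on a Pythia checkpoint.
# Model size, step revision, and stimuli are illustrative, not the paper's exact setup.
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "EleutherAI/pythia-160m"  # one size from the Pythia scaling suite
REVISION = "step143000"           # Pythia exposes training checkpoints as branch revisions

tokenizer = AutoTokenizer.from_pretrained(MODEL, revision=REVISION)
model = AutoModelForCausalLM.from_pretrained(MODEL, revision=REVISION)
model.eval()

def surprisal_bits(prefix: str, target: str) -> float:
    """Surprisal (in bits) of the first token of ` target` given `prefix`."""
    input_ids = tokenizer(prefix, return_tensors="pt").input_ids
    target_id = tokenizer(" " + target, add_special_tokens=False).input_ids[0]
    with torch.no_grad():
        logits = model(input_ids).logits[0, -1]  # next-token distribution
    log_probs = torch.log_softmax(logits, dim=-1)
    return -log_probs[target_id].item() / math.log(2)

# Classic NPI items: "ever" is licensed only in the first condition; the
# "illusion" condition contains a negation in a position that does not
# license it, yet humans fleetingly accept it anyway.
conditions = {
    "licensed":   "No bills that the senators voted for will",
    "unlicensed": "The bills that the senators voted for will",
    "illusion":   "The bills that no senators voted for will",
}
for name, prefix in conditions.items():
    print(f"{name:10s}  surprisal('ever') = {surprisal_bits(prefix, 'ever'):.2f} bits")
```

An illusion shows up as the "illusion" condition drawing lower surprisal than the "unlicensed" one; sweeping MODEL across Pythia sizes and REVISION across training steps reproduces the two axes the study varies.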