The production of meaning in the processing of natural language

arXiv cs.CL / 3/24/2026


Key Points

  • The paper investigates whether human-like contextual meaning in natural-language processing aligns more with quantum-logic mechanisms than with classical Boolean semantics, and relates this to contextuality observed in large language models via Bell/CHSH-style tests.
  • Across models spanning four orders of magnitude in scale, the authors measure the CHSH |S| parameter over the inference parameter space and find that the interquartile range of |S| is the statistic that best distinguishes models, yet it is orthogonal to standard external performance and safety-related benchmarks.
  • They report that the frequency of Bell-inequality violations has only weak, non-significant anticorrelations with benchmarks such as MMLU, hallucination rate, and nonsense detection.
  • The study examines how |S| changes with sampling parameters and word order, and argues that genuine contextuality introduces information-theoretic constraints relevant to designing defenses against prompt injection, including parallels to social-context manipulation.
  • The authors propose that adversarial tactics may work by shaping the space of possible interpretations at a contextual level, not merely by forcing a specific output, framing this as a fundamental form of manipulation.
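To make the CHSH quantity concrete: |S| combines four correlators E(x, y), each the average product of paired ±1-valued outcomes, and classical (Boolean) semantics bounds |S| at 2 while quantum correlations can reach 2√2. The paper's specific mapping from model interpretations of ambiguous expressions to ±1 outcomes is not reproduced here; the sketch below only shows the generic CHSH arithmetic, with illustrative correlator values.

```python
def correlator(pairs):
    """E(x, y): average product of paired outcomes, each outcome in {-1, +1}."""
    return sum(a * b for a, b in pairs) / len(pairs)

def chsh_s(e_ab, e_ab2, e_a2b, e_a2b2):
    """CHSH combination of four correlators.

    Classically (local/Boolean models) |S| <= 2; quantum correlations
    can reach the Tsirelson bound 2 * sqrt(2) ~= 2.828.
    """
    return e_ab - e_ab2 + e_a2b + e_a2b2

# Illustrative values only: correlators near +/-0.7 (roughly 1/sqrt(2))
# give |S| = 2.8 > 2, i.e. a Bell-inequality violation.
s = chsh_s(0.7, -0.7, 0.7, 0.7)
```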

Abstract

Understanding the fundamental mechanisms governing the production of meaning in the processing of natural language is critical for designing safe, thoughtful, engaging, and empowering human-agent interactions. Experiments in cognitive science and social psychology have demonstrated that human semantic processing exhibits contextuality more consistent with quantum logical mechanisms than classical Boolean theories, and recent works have found similar results in large language models -- in particular, clear violations of the Bell inequality in experiments of contextuality during interpretation of ambiguous expressions. We explore the CHSH |S| parameter -- the metric associated with the inequality -- across the inference parameter space of models spanning four orders of magnitude in scale, cross-referencing it with MMLU, hallucination rate, and nonsense detection benchmarks. We find that the interquartile range of the |S| distribution -- the statistic that most sharply differentiates models from one another -- is completely orthogonal to all external benchmarks, while violation rate shows weak anticorrelation with all three benchmarks that does not reach significance. We investigate how |S| varies with sampling parameters and word order, and discuss the information-theoretic constraints that genuine contextuality imposes on prompt injection defenses and its human analogue, whereby careful construction and maintenance of social contextuality can be carried out at scale -- manufacturing not consent but contextuality itself, a subtler and more fundamental form of manipulation that shapes the space of possible interpretations before any particular one is reached.
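The two per-model summary statistics the abstract compares against benchmarks, the interquartile range of the |S| distribution and the Bell-violation rate, are straightforward to compute. A minimal stdlib-only sketch (function names are illustrative, not from the paper):

```python
import statistics

def iqr(values):
    """Interquartile range: Q3 - Q1 of a sample."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

def violation_rate(s_values, classical_bound=2.0):
    """Fraction of |S| measurements exceeding the classical CHSH bound."""
    return sum(abs(s) > classical_bound for s in s_values) / len(s_values)
```

Per the abstract, the IQR of |S| differentiates models most sharply but is orthogonal to MMLU, hallucination-rate, and nonsense-detection benchmarks, while the violation rate shows only weak, non-significant anticorrelation with them.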
