Revisiting Real-Time Digging-In Effects: No Evidence from NP/Z Garden-Paths

arXiv cs.CL · March 26, 2026


Key Points

  • The paper addresses whether “digging-in” effects in NP/Z garden-path sentences are genuine real-time evidence for self-organized sentence processing or instead arise from wrap-up processes and methodological confounds.
  • Two experiments (Maze task and self-paced reading) compare human results against predictions from an ensemble of large language models.
  • The study finds no clear evidence of real-time digging-in effects in human sentence processing.
  • Sentence-final vs. nonfinal disambiguation items produce qualitatively different patterns, with seemingly positive digging-in trends occurring only sentence-finally due to wrap-up confounds.
  • Nonfinal items show reverse trends that align with neural language model predictions, supporting the surprisal-theory expectation of no digging-in under non-shifting statistical expectations.
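Surprisal theory quantifies the processing difficulty of a word as the negative log probability of that word given its preceding context. The sketch below illustrates the prediction referenced above: if lengthening the ambiguous region does not actually shift the conditional probability of the disambiguating word, its surprisal (and hence predicted difficulty) is unchanged. The probability values are hypothetical placeholders, not figures from the paper.

```python
import math

def surprisal(prob: float) -> float:
    """Surprisal in bits: -log2 of a word's conditional probability."""
    return -math.log2(prob)

# Hypothetical conditional probabilities of the disambiguating verb
# in an NP/Z garden-path sentence (illustrative values only).
p_short_ambiguous_region = 0.02
p_long_ambiguous_region = 0.02  # lengthening alone need not shift
                                # the statistical expectation

# Equal probabilities imply equal surprisal: no digging-in predicted.
print(surprisal(p_short_ambiguous_region))  # ~5.64 bits
print(surprisal(p_long_ambiguous_region))   # ~5.64 bits
```

Under this view, a digging-in effect would require the longer ambiguous region to genuinely lower the conditional probability of the disambiguating word, which is an empirical question about the language statistics rather than a built-in property of the parser.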

Abstract

Digging-in effects, where disambiguation difficulty increases with longer ambiguous regions, have been cited as evidence for self-organized sentence processing, in which structural commitments strengthen over time. In contrast, surprisal theory predicts no such effect unless lengthening genuinely shifts statistical expectations, and neural language models appear to show the opposite pattern. Whether digging-in is a robust real-time phenomenon in human sentence processing -- or an artifact of wrap-up processes or methodological confounds -- remains unclear. We report two experiments on English NP/Z garden-path sentences using Maze and self-paced reading, comparing human behavior with predictions from an ensemble of large language models. We find no evidence for real-time digging-in effects. Critically, items with sentence-final versus nonfinal disambiguation show qualitatively different patterns: positive digging-in trends appear only sentence-finally, where wrap-up effects confound interpretation. Nonfinal items -- the cleaner test of real-time processing -- show reverse trends consistent with neural model predictions.