Predicting States of Understanding in Explanatory Interactions Using Cognitive Load-Related Linguistic Cues

arXiv cs.CL / 23 March 2026


Key Points

  • The paper investigates how cognitive-load-related linguistic cues—surprisal, syntactic complexity, and listener gaze variation—relate to a listener's moment-by-moment understanding in explanatory dialogue.
  • It analyzes the MUNDEX corpus with self-annotated listener states (Understanding, Partial Understanding, Non-Understanding, Misunderstanding) via retrospective video recall.
  • A classification study using two off-the-shelf classifiers and a fine-tuned German BERT-based multimodal classifier demonstrates that four-state understanding can be predicted, with improvements when combining linguistic cues with textual features.
  • The results indicate that each cue contributes differently to the listener's state and that integrating multiple cues yields better predictive performance, suggesting potential for real-time adaptation in educational or conversational systems.

Abstract

We investigate how verbal and nonverbal linguistic features, exhibited by speakers and listeners in dialogue, can contribute to predicting the listener's state of understanding in explanatory interactions on a moment-by-moment basis. Specifically, we examine three linguistic cues related to cognitive load and hypothesised to correlate with listener understanding: the information value (operationalised as surprisal) and syntactic complexity of the speaker's utterances, and the variation in the listener's interactive gaze behaviour. We draw on the MUNDEX corpus of face-to-face dialogic board game explanations, in which listener states ('Understanding', 'Partial Understanding', 'Non-Understanding', and 'Misunderstanding') were self-annotated by the listeners using a retrospective video-recall method. Statistical analyses show that the individual cues vary with the listener's level of understanding. A subsequent classification experiment, involving two off-the-shelf classifiers and a fine-tuned German BERT-based multimodal classifier, demonstrates that predicting these four states of understanding is generally possible and that performance improves when the three linguistic cues are considered alongside textual features.
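To illustrate the surprisal cue mentioned in the abstract: surprisal operationalises a word's information value as the negative log-probability of that word given its preceding context, so less predictable words carry higher values. The sketch below computes per-word surprisal from a toy bigram model; the probabilities and sentence are invented for illustration and are not the paper's model, corpus, or pipeline (the authors would typically obtain such probabilities from a language model).

```python
import math

def surprisal(prob: float) -> float:
    """Surprisal in bits: -log2 P(word | context)."""
    return -math.log2(prob)

# Toy bigram probabilities (hypothetical values, for illustration only).
bigram_p = {
    ("the", "game"): 0.20,
    ("game", "board"): 0.05,
    ("board", "shows"): 0.10,
}

tokens = ["the", "game", "board", "shows"]
# Surprisal of each word given the immediately preceding word.
per_word = [surprisal(bigram_p[(w1, w2)]) for w1, w2 in zip(tokens, tokens[1:])]
mean_surprisal = sum(per_word) / len(per_word)

print([round(s, 2) for s in per_word])  # → [2.32, 4.32, 3.32]
print(round(mean_surprisal, 2))         # → 3.32
```

A low-probability continuation ("board" at P=0.05) yields the highest surprisal, which is the intuition behind using the cue as a proxy for the listener's processing load.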