Rhetorical Questions in LLM Representations: A Linear Probing Study
arXiv cs.CL / 4/16/2026
Key Points
- The study investigates how large language models encode rhetorical questions versus information-seeking questions by applying linear probing to two social-media datasets with different discourse contexts.
- It finds that rhetorical signals emerge early in the model's representations and are captured most consistently by last-token features; within each dataset, rhetorical questions become linearly separable from information-seeking ones.
- Cross-dataset transfer remains feasible, with probes achieving AUROC of roughly 0.7–0.8, indicating that some of the rhetorical signal generalizes across discourse contexts.
- Despite moderate transfer performance, the paper shows that “transferability” does not mean a single shared representation: probes trained on different datasets yield very different rankings on the same target corpus.
- Qualitative analysis attributes these probe divergences to multiple underlying rhetorical phenomena, including discourse-level stance across extended argumentation and more localized, syntax-driven interrogative cues.
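The linear-probing setup summarized above can be sketched in a few lines: train a logistic-regression probe on frozen last-token features from one dataset, then score both that dataset and a held-out target corpus with AUROC. This is a minimal illustration with synthetic features standing in for LLM hidden states; the dimensions, shift magnitude, and variable names are hypothetical, not taken from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
d = 64  # hypothetical hidden-state dimension

# Stand-ins for last-token features from a source and a target dataset.
# Label 1 = rhetorical question, 0 = information-seeking question.
X_src = rng.normal(size=(500, d))
y_src = rng.integers(0, 2, size=500)
X_src[y_src == 1] += 0.5  # inject a linearly separable rhetorical signal

X_tgt = rng.normal(size=(300, d))
y_tgt = rng.integers(0, 2, size=300)
X_tgt[y_tgt == 1] += 0.5  # partially shared signal in the target corpus

# Linear probe: logistic regression on frozen features.
probe = LogisticRegression(max_iter=1000).fit(X_src, y_src)

# In-domain performance vs. cross-dataset transfer, measured with AUROC.
auc_in = roc_auc_score(y_src, probe.decision_function(X_src))
auc_xfer = roc_auc_score(y_tgt, probe.decision_function(X_tgt))
print(f"in-domain AUROC: {auc_in:.2f}, transfer AUROC: {auc_xfer:.2f}")
```

In the paper's actual setup the feature matrices would come from an LLM's last-token activations at a given layer; the probe's `decision_function` scores are also what would be ranked per example when comparing how differently trained probes order the same target corpus.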