Impact of automatic speech recognition quality on Alzheimer's disease detection from spontaneous speech: a reproducible benchmark study with lexical modeling and statistical validation
arXiv cs.CL / 3/20/2026
Key Points
- The paper analyzes Alzheimer's disease detection using lexical features from Whisper ASR transcripts on the ADReSSo 2021 dataset to understand how ASR quality affects downstream language models.
- It finds that Whisper-small transcripts outperform Whisper-base transcripts, reaching a balanced accuracy above 0.7850 with a Linear SVM, which suggests that transcription quality matters more than classifier complexity.
- The results show linguistic differences: cognitively normal speakers use more semantically precise object- and scene-descriptive language, while Alzheimer's speech is characterized by vagueness, discourse markers, and hesitations.
- The authors provide a reproducible benchmark pipeline and argue that ASR selection is a critical modeling decision for clinical speech-based AI systems.
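The headline metric in the key points is balanced accuracy, which is the mean of per-class recall and so is not inflated when one class is larger than the other. The following is a minimal sketch of that calculation in plain Python; the function name and the toy labels ("AD", "CN") are illustrative assumptions and are not taken from the paper's pipeline or data.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # number of correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical labels for illustration only (not ADReSSo results).
y_true = ["AD", "AD", "AD", "AD", "CN", "CN"]
y_pred = ["AD", "AD", "AD", "CN", "CN", "CN"]

score = balanced_accuracy(y_true, y_pred)
print(f"balanced accuracy: {score:.4f}")
```

In this toy case the recall is 3/4 for "AD" and 2/2 for "CN", so the balanced accuracy is 0.875, while plain accuracy would be 5/6. Comparing two ASR front ends on this metric would mean running the same classifier on each set of transcripts and scoring both against the same reference labels.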