From Dispersion to Attraction: Spectral Dynamics of Hallucination Across Whisper Model Scales

arXiv cs.AI / 4/13/2026

Key Points

  • The paper proposes a Spectral Sensitivity Theorem, which predicts that deep networks undergo a phase transition from a dispersive regime (signal decay) to an attractor regime (rank-1 collapse), governed by layer-wise gain and alignment.
  • It frames ASR hallucinations as a safety risk and links hallucination behavior to spectral dynamics, particularly how activation/attention graph eigenspectra evolve across model scales.
  • Experiments on Whisper variants (Tiny through Large-v3-Turbo) under adversarial stress support the theory: intermediate models exhibit “Structural Disintegration” (Regime I), characterized by a reported 13.4% collapse in Cross-Attention rank.
  • Large models instead enter a “Compression-Seeking Attractor” state (Regime II), in which Self-Attention actively compresses rank (reported -2.34%) and hardens the spectral slope, decoupling the model from acoustic evidence.
  • Overall, the work provides a mechanistic, spectral explanation for how hallucination-related failure modes can change with scale in ASR transformers.
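The key points refer to rank changes and spectral slope without defining them. The following is a minimal sketch of how such quantities could be measured on a single activation or attention matrix. It assumes "rank" means the entropy-based effective rank of the singular values and "spectral slope" means the log-log slope of singular value against rank index; the paper may define both differently. The function names are hypothetical.

```python
import numpy as np

def effective_rank(matrix):
    """Entropy-based effective rank (an assumed definition, not the paper's)."""
    s = np.linalg.svd(matrix, compute_uv=False)
    p = s / s.sum()          # normalize singular values to a distribution
    p = p[p > 0]             # drop exact zeros so log is defined
    return float(np.exp(-(p * np.log(p)).sum()))

def spectral_slope(matrix):
    """Least-squares slope of log(singular value) vs. log(rank index)."""
    s = np.linalg.svd(matrix, compute_uv=False)
    s = s[s > 1e-12]         # ignore numerically zero singular values
    idx = np.arange(1, len(s) + 1)
    slope, _ = np.polyfit(np.log(idx), np.log(s), 1)
    return float(slope)

def relative_change(clean, stressed):
    """Percent change of a metric from clean input to stressed input."""
    return 100.0 * (stressed - clean) / clean

# Synthetic matrices only; no Whisper activations are used here.
full = np.eye(8)                                # all directions equally used
rank_one = np.outer(np.ones(8), np.ones(8))     # a single dominant direction
power_law = np.diag(np.arange(1, 9) ** -2.0)    # singular values k^-2

print(effective_rank(full), effective_rank(rank_one))
print(spectral_slope(power_law))
```

Under these assumed definitions, a more negative slope would correspond to a "harder" spectrum, and a negative `relative_change` in effective rank would correspond to rank compression under stress.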

Abstract

Hallucinations in large ASR models present a critical safety risk. In this work, we propose the Spectral Sensitivity Theorem, which predicts a phase transition in deep networks from a dispersive regime (signal decay) to an attractor regime (rank-1 collapse) governed by layer-wise gain and alignment. We validate this theory by analyzing the eigenspectra of activation graphs in Whisper models (Tiny to Large-v3-Turbo) under adversarial stress. Our results confirm the theoretical prediction: intermediate models exhibit Structural Disintegration (Regime I), characterized by a 13.4% collapse in Cross-Attention rank. Conversely, large models enter a Compression-Seeking Attractor state (Regime II), where Self-Attention actively compresses rank (-2.34%) and hardens the spectral slope, decoupling the model from acoustic evidence.
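The two regimes named in the abstract can be illustrated with a toy linear stack. This is a sketch under strong assumptions, not the paper's theorem or experimental setup: every layer is taken to be the same matrix, gain * (I + alignment * u uᵀ), where `gain`, `alignment`, the shared direction `u`, and the function `stack_response` are all hypothetical names introduced for illustration.

```python
import numpy as np

def stack_response(gain, alignment, depth=24, dim=16):
    """Return (largest singular value, effective rank) of a toy layer stack.

    Each layer is gain * (I + alignment * u u^T): an assumed form chosen
    only to make the two regimes visible, not taken from the paper.
    """
    u = np.ones(dim) / np.sqrt(dim)              # shared unit direction
    layer = gain * (np.eye(dim) + alignment * np.outer(u, u))
    product = np.linalg.matrix_power(layer, depth)
    s = np.linalg.svd(product, compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]
    eff_rank = float(np.exp(-(p * np.log(p)).sum()))
    return float(s[0]), eff_rank

# No alignment and gain below 1: every direction shrinks equally (signal decay).
top_disp, rank_disp = stack_response(gain=0.8, alignment=0.0)

# Same gain with strong alignment: the direction u grows while all others
# shrink, so the end-to-end map approaches rank 1.
top_attr, rank_attr = stack_response(gain=0.8, alignment=1.0)

print(top_disp, rank_disp)
print(top_attr, rank_attr)
```

In this toy, the switch happens where gain * (1 + alignment) crosses 1. Below that, the signal decays in every direction. Above it, with gain still under 1, a single direction dominates and the effective rank falls toward 1.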