Dropout Robustness and Cognitive Profiling of Transformer Models via Stochastic Inference
arXiv cs.LG / 3/19/2026
📰 News · Ideas & Deep Analysis · Models & Research
Key Points
- The paper analyzes dropout-induced variability across 19 transformer models using Monte Carlo Dropout with 100 stochastic forward passes per sample to evaluate inference-time robustness.
- It defines dropout robustness as maintaining both high accuracy and stable predictions under stochastic inference, quantifying stability as the standard deviation of per-run accuracies and further decomposing performance into memory and reasoning components (a minimal sketch of this protocol follows the list).
- In experiments spanning five dropout configurations, the study performs 95 unique evaluations (19 models × 5 configurations) on 1,000 samples, revealing substantial architectural variation in robustness that is not simply tied to model size.
- Findings show that smaller models produce the most stable predictions, mid-sized models achieve the best overall accuracy, and larger models excel at memory tasks; notably, 53% of models suffer severe accuracy degradation under baseline MC Dropout, with task-specific models losing up to 24 percentage points.
- Memory tasks are disproportionately affected by dropout (memory accuracy drops by 27 percentage points versus only 1 for reasoning), and 84% of models display memory-biased performance; the authors present this as the first comprehensive MC Dropout benchmark for transformers and offer guidance for uncertainty-aware applications.
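
The bullets above describe a standard MC Dropout evaluation loop. Below is a minimal PyTorch sketch of that protocol, assuming the usual trick of re-enabling dropout layers at inference time; the toy classifier, dropout rate, and synthetic data are hypothetical placeholders, not the paper's models or benchmarks.

```python
import torch
import torch.nn as nn

def enable_mc_dropout(model: nn.Module) -> None:
    """Put the model in eval mode, then flip dropout layers back to
    train mode so every forward pass samples a fresh dropout mask."""
    model.eval()
    for module in model.modules():
        if isinstance(module, nn.Dropout):
            module.train()

# Hypothetical stand-in for a transformer classifier (not from the paper).
model = nn.Sequential(
    nn.Linear(32, 64), nn.ReLU(), nn.Dropout(p=0.1), nn.Linear(64, 4)
)
x = torch.randn(1000, 32)         # 1,000 evaluation samples, as in the study
y = torch.randint(0, 4, (1000,))  # placeholder labels

enable_mc_dropout(model)
num_passes = 100                  # 100 stochastic forward passes per sample
per_run_acc = []
with torch.no_grad():
    for _ in range(num_passes):
        preds = model(x).argmax(dim=-1)
        per_run_acc.append((preds == y).float().mean().item())

acc = torch.tensor(per_run_acc)
# Robustness ~ high mean accuracy; stability ~ low std across runs.
print(f"mean accuracy: {acc.mean():.3f}, std across runs: {acc.std():.4f}")
```

The sketch makes the paper's two quantities concrete: robustness corresponds to a high mean of the per-run accuracies, while stability corresponds to a low standard deviation across the 100 stochastic passes.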
Related Articles
I Was Wrong About AI Coding Assistants. Here's What Changed My Mind (and What I Built About It).
Dev.to
Interesting loop
Reddit r/LocalLLaMA
Qwen3.5-122B-A10B Uncensored (Aggressive) — GGUF Release + new K_P Quants
Reddit r/LocalLLaMA
A supervisor or "manager" AI agent is the wrong way to control AI
Reddit r/artificial
FeatherOps: Fast fp8 matmul on RDNA3 without native fp8
Reddit r/LocalLLaMA