Do LLMs Know What They Know? Measuring Metacognitive Efficiency with Signal Detection Theory
arXiv cs.CL / 3/27/2026
Key Points
- The paper argues that common LLM confidence calibration metrics (e.g., ECE, Brier score) conflate two abilities: Type-1 sensitivity (how much the model knows) and Type-2 metacognitive sensitivity (how well it knows what it knows).
- It proposes an evaluation framework based on Type-2 Signal Detection Theory, using meta-d' to measure metacognitive capacity and the M-ratio (meta-d' divided by d') to measure metacognitive efficiency.
- Experiments on four LLMs across 224,000 factual QA trials show large differences in metacognitive efficiency even when Type-1 sensitivity is similar, including cases where a model ranks highest by d' but lowest by M-ratio.
- The study finds metacognitive efficiency is domain-specific and can be shifted by temperature changes, indicating that confidence policy (Type-2 criterion) can move independently of underlying metacognitive capacity for some models.
- It reports that AUROC_2 and M-ratio can produce fully inverted model rankings, suggesting these metrics answer fundamentally different evaluation questions, with implications for model selection and deployment.
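
To make the quantities in these points concrete, here is a minimal sketch of the textbook definitions, not the paper's actual pipeline. The function names (`d_prime`, `auroc2`, `m_ratio`) are hypothetical, and how the paper derives hit and false-alarm rates from factual QA trials is not stated in this summary, so those rates are taken as given. Fitting meta-d' itself normally requires a maximum-likelihood procedure that is not shown here; `m_ratio` assumes meta-d' has already been estimated elsewhere.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Type-1 sensitivity: z(hit rate) - z(false-alarm rate).

    Rates must lie strictly between 0 and 1; how they are obtained
    from QA trials is an assumption left to the caller.
    """
    z = NormalDist().inv_cdf  # inverse standard-normal CDF
    return z(hit_rate) - z(false_alarm_rate)


def auroc2(correct, confidence):
    """Type-2 AUROC: probability that a correct answer received higher
    confidence than an incorrect one (ties count as one half)."""
    pos = [c for ok, c in zip(correct, confidence) if ok]
    neg = [c for ok, c in zip(correct, confidence) if not ok]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect trial")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def m_ratio(meta_d, d):
    """Metacognitive efficiency: meta-d' expressed as a fraction of d'."""
    return meta_d / d


# Toy data (invented for illustration, not from the paper).
correct = [1, 1, 0, 0]
confidence = [0.9, 0.6, 0.6, 0.2]
print(auroc2(correct, confidence))
print(m_ratio(meta_d=1.0, d=2.0))
```

Because AUROC_2 depends on Type-1 performance while the M-ratio normalizes meta-d' by d', two models can plausibly be ordered differently by the two metrics, which is consistent with the ranking inversions reported above.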