Anchored Confabulation: Partial Evidence Non-Monotonically Amplifies Confident Hallucination in LLMs
arXiv cs.CL / 4/30/2026
Key Points
- The paper identifies a new calibration failure in LLMs: supplying a single confirmed intermediate fact can temporarily increase the model's confidently wrong answers before later evidence corrects them, a phenomenon the authors call "anchored confabulation."
- They formalize this effect as Parametric Hallucination Confidence (PHC) and validate it across multiple evidence types, including a causal injection experiment and cross-family scaling results (a measurement sketch in this spirit follows the list).
- A proposed “Anchoring Threshold Law” predicts how PHC amplification grows with reasoning hop depth, showing measurable effects when multiple intermediate predictions are confirmed.
- The authors demonstrate an application to RAG routing: a LearnedRouter that exploits PHC substantially reduces the gap to oracle routing without model fine-tuning, using far fewer labels than earlier RL-based approaches (see the routing sketch after this list).
- Mitigation experiments suggest that an epistemic-humility prompt and explicit self-rating can reduce PHC spikes, with self-rating outperforming lexical confidence for routing signals.
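The paper's exact PHC definition is not reproduced in this digest. The sketch below only illustrates the kind of measurement the first two key points describe: comparing a confident-wrong rate with and without a single confirmed intermediate fact injected into the prompt. The `query_model` callable, the dataset fields, and the confidence threshold are all hypothetical placeholders, not the authors' protocol.

```python
# Hedged sketch of a PHC-style measurement: how often does the model answer
# wrongly with high self-rated confidence, before vs. after injecting one
# confirmed intermediate fact? query_model() and the item fields are
# hypothetical placeholders, not the paper's actual setup.

from typing import Callable, Iterable, Tuple


def confident_wrong_rate(
    items: Iterable[dict],
    query_model: Callable[[str], Tuple[str, float]],  # -> (answer, self-rated confidence in [0, 1])
    inject_fact: bool,
    confidence_threshold: float = 0.8,
) -> float:
    """Fraction of questions answered incorrectly with high self-rated confidence."""
    confident_wrong = 0
    total = 0
    for item in items:  # each item: {"question", "gold_answer", "confirmed_fact"}
        prompt = item["question"]
        if inject_fact:
            # Anchor the model on exactly one verified intermediate fact.
            prompt = f"Known fact: {item['confirmed_fact']}\n\n{prompt}"
        answer, confidence = query_model(prompt)
        if confidence >= confidence_threshold and answer.strip() != item["gold_answer"]:
            confident_wrong += 1
        total += 1
    return confident_wrong / max(total, 1)


# Anchored confabulation would show up as
#   confident_wrong_rate(items, model, inject_fact=True)
# exceeding
#   confident_wrong_rate(items, model, inject_fact=False),
# i.e. partial evidence amplifying confident hallucination.
```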
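The routing sketch below is likewise only an illustration of the idea in the last two key points: a small learned router that maps cheap per-query features, including an explicit self-rating and an anchoring indicator, to a retrieve-or-answer-directly decision, trained on a handful of labels and requiring no model fine-tuning. The features, labels, and classifier here are assumptions, not the paper's LearnedRouter.

```python
# Hedged sketch of a small learned router in the spirit of the LearnedRouter
# described above. The feature set, toy labels, and logistic-regression choice
# are illustrative assumptions only.

import numpy as np
from sklearn.linear_model import LogisticRegression


def featurize(self_rating: float, has_partial_evidence: bool, hop_depth: int) -> list:
    # Hypothetical features: explicit self-rated confidence, whether a confirmed
    # intermediate fact is present (the anchoring condition), and reasoning hop depth.
    return [self_rating, float(has_partial_evidence), float(hop_depth)]


# Toy training labels (1 = route to retrieval, 0 = answer from parametric memory).
X_train = np.array([
    featurize(0.90, False, 1),
    featurize(0.85, True, 3),   # high confidence but anchored and deep: risky
    featurize(0.40, False, 2),
    featurize(0.95, True, 4),
])
y_train = np.array([0, 1, 1, 1])

router = LogisticRegression().fit(X_train, y_train)


def should_retrieve(self_rating: float, has_partial_evidence: bool, hop_depth: int) -> bool:
    """Route to retrieval when the learned probability of a parametric failure is high."""
    features = np.array([featurize(self_rating, has_partial_evidence, hop_depth)])
    return router.predict_proba(features)[0, 1] >= 0.5
```

Using an explicit self-rating as a feature, rather than lexical confidence cues in the answer text, mirrors the key point that self-rating was the stronger routing signal.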