Compression Favors Consistency, Not Truth: When and Why Language Models Prefer Correct Information
arXiv cs.CL / 3/13/2026
💬 Opinion · Ideas & Deep Analysis · Models & Research
Key Points
- The paper introduces the Compression-Consistency Principle: next-token prediction favors hypotheses that yield shorter, internally consistent descriptions of the training data (illustrated in the sketch after this list).
- It argues that truth bias in language models is not an intrinsic drive toward truth but arises when false alternatives are harder to compress structurally.
- In experiments with GPT-2–style models on synthetic data, correct completions reach 83.1% accuracy with balanced training data and 67.0% when correct rules make up only 10% of the corpus.
- Replacing random errors with a coherent but incorrect rule system largely eliminates the preference for correctness, driving accuracy toward chance; the effect is weaker but still present in more natural-language-like synthetic settings (57.7% accuracy).
- The authors show that embedding verification steps can restore the preference for correctness at small scales and that more consistent rules yield graded accuracy improvements, suggesting that the observed truth bias is a byproduct of compression pressure rather than an intrinsic truth-seeking drive.
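The principle lends itself to a rough illustration using an off-the-shelf compressor as a stand-in for a model's implicit description length. The following is a minimal sketch, not the paper's setup: the "subject is attribute" fact format, the 50-subject rule systems, the corpus sizes, and zlib as a description-length proxy are all invented here for illustration.

```python
import random
import zlib

random.seed(0)

# Hypothetical toy setup: 50 subjects, each mapped to an attribute by a rule.
SUBJECTS = [f"s{i}" for i in range(50)]
TRUE_RULE = {s: f"x{i % 5}" for i, s in enumerate(SUBJECTS)}  # the "correct" rule system
ALT_RULE = {s: f"y{i % 5}" for i, s in enumerate(SUBJECTS)}   # coherent but incorrect system


def corpus(error_rate: float, coherent_errors: bool) -> bytes:
    """Generate 2000 'subject is attribute' statements.

    error_rate is the fraction of statements that contradict TRUE_RULE.
    If coherent_errors is True, every error follows ALT_RULE (a single
    consistent falsehood); otherwise each error picks a random attribute.
    """
    lines = []
    for _ in range(2000):
        s = random.choice(SUBJECTS)
        if random.random() < error_rate:
            attr = ALT_RULE[s] if coherent_errors else f"z{random.randrange(1000)}"
        else:
            attr = TRUE_RULE[s]
        lines.append(f"{s} is {attr}")
    return "\n".join(lines).encode()


# Compare compressed sizes: a crude proxy for description length.
for label, err, coherent in [("all correct", 0.0, False),
                             ("30% random errors", 0.3, False),
                             ("30% coherent wrong rule", 0.3, True)]:
    size = len(zlib.compress(corpus(err, coherent), 9))
    print(f"{label:>25}: {size} compressed bytes")
```

On this toy data, the expected pattern is that the all-correct corpus compresses smallest, random errors inflate the compressed size sharply, and the coherent-but-wrong corpus sits close to the correct one, mirroring the paper's point that compression pressure alone cannot distinguish a consistent falsehood from the truth.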