Label Effects: Shared Heuristic Reliance in Trust Assessment by Humans and LLM-as-a-Judge
arXiv cs.AI / 4/8/2026
Key Points
- The paper shows that in LLM-as-a-Judge trust evaluation, disclosed source labels can bias outcomes even when the underlying content is identical.
- Using a counterfactual setup, both human participants and LLM judges rated higher trust for content labeled “human-authored” versus the same content labeled “AI-generated.”
- Eye-tracking evidence indicates that humans use the source label as a heuristic shortcut, and the study finds that LLM judges similarly over-attend to the label region relative to the content region.
- The label-driven effects differ by condition: label dominance is stronger under “Human” labels, and the LLM’s measured decision uncertainty is higher under “AI” labels.
- The authors raise validity concerns for label-sensitive LLM-as-a-Judge evaluation and suggest debiasing/mitigation strategies, cautioning that aligning models with human preferences may transfer human heuristic reliance to the models.
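The counterfactual setup described above can be sketched as a label-swap probe: the same content is scored twice, once under each disclosed-source label, and the rating gap measures label-driven bias. This is a minimal illustration, not the paper's code; `judge_score` is a hypothetical stub that a real harness would replace with an actual LLM judge call, and the simulated scores merely mirror the reported direction of the effect.

```python
# Minimal sketch of a counterfactual label-swap probe for LLM-as-a-Judge bias.
# `judge_score` is a hypothetical stand-in for a real LLM judge API call.

def make_prompt(content: str, label: str) -> str:
    """Build a judge prompt: identical content, only the source label differs."""
    return f"Source: {label}\n\nRate the trustworthiness (1-10) of:\n{content}"

def judge_score(prompt: str) -> float:
    # Stub: a real implementation would query an LLM judge here.
    # For illustration only, it simulates the paper's finding that
    # the "human-authored" label receives a higher trust rating.
    return 8.0 if "human-authored" in prompt else 6.5

def label_bias(content: str) -> float:
    """Rating gap between 'human-authored' and 'AI-generated' labels
    for the exact same content; a nonzero gap indicates label bias."""
    human = judge_score(make_prompt(content, "human-authored"))
    ai = judge_score(make_prompt(content, "AI-generated"))
    return human - ai

if __name__ == "__main__":
    gap = label_bias("The mitochondria is the powerhouse of the cell.")
    print(f"label bias (human - AI): {gap:+.1f}")
```

Averaging this gap over a corpus of paired prompts yields a simple, content-controlled bias estimate for any judge model.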