An Agentic Evaluation Architecture for Historical Bias Detection in Educational Textbooks
arXiv cs.CL / 4/10/2026
Key Points
- The paper proposes an agentic evaluation architecture to detect historical bias in educational textbooks at scale using a multimodal screening agent, a five-agent heterogeneous jury, and a meta-agent that synthesizes verdicts and escalates to humans when needed.
- A key contribution is a Source Attribution Protocol that separates the textbook narrative from quoted historical sources to reduce systematic false positives common in single-model evaluators.
- In experiments on 270 excerpts from Romanian upper-secondary history textbooks, the agentic approach classified 83.3% as pedagogically acceptable and assigned markedly lower severity scores than a zero-shot baseline (2.9/7 vs. 5.4/7).
- In blind human comparisons (18 evaluators, 54 comparisons), the Independent Deliberation setup was preferred 64.8% of the time over both heuristic and zero-shot baselines.
- The authors argue the method is cost-effective (about $2 per textbook), positioning agentic evaluation as viable decision-support for educational governance.
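
The flow described in the points above (screening, source attribution, a five-member jury, then a meta-agent that synthesizes verdicts or escalates) can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the keyword-based jurors stand in for LLM agents, the regex-based `split_sources` stands in for the Source Attribution Protocol, and the median aggregation and `disagreement_threshold` escalation rule are hypothetical choices that the summary does not specify.

```python
import re
from dataclasses import dataclass
from statistics import median
from typing import Callable

# Matches spans in straight or typographic quotation marks (assumed marker of a quoted source).
QUOTE_RE = re.compile(r'"[^"]*"|“[^”]*”|„[^”]*”')


@dataclass
class Verdict:
    juror: str
    severity: int  # 1 (no concern) to 7 (severe), mirroring the 7-point scale in the summary


def screen(excerpt: str) -> bool:
    """Hypothetical screening step: only non-empty excerpts go to the jury."""
    return bool(excerpt.strip())


def split_sources(excerpt: str) -> tuple[str, list[str]]:
    """Toy source attribution: separate quoted material from the textbook narrative."""
    quotes = QUOTE_RE.findall(excerpt)
    narrative = QUOTE_RE.sub("[QUOTED SOURCE]", excerpt)
    return narrative, quotes


def keyword_juror(name: str, terms: tuple[str, ...], weight: int) -> Callable[[str], Verdict]:
    """Stand-in for an LLM juror: severity grows with the number of flagged terms."""
    def judge(text: str) -> Verdict:
        hits = sum(text.lower().count(t) for t in terms)
        return Verdict(name, min(7, 1 + weight * hits))
    return judge


def evaluate(excerpt: str, jurors, use_attribution: bool = True,
             disagreement_threshold: int = 3) -> dict:
    """Meta-agent sketch: median severity, escalate to a human on wide disagreement."""
    if not screen(excerpt):
        return {"severity": None, "escalate": False}
    text = split_sources(excerpt)[0] if use_attribution else excerpt
    scores = [juror(text).severity for juror in jurors]
    spread = max(scores) - min(scores)
    return {"severity": median(scores), "escalate": spread >= disagreement_threshold}


# Five heterogeneous (differently weighted) toy jurors.
JURORS = [keyword_juror(f"juror_{i}", ("barbarian",), w) for i, w in enumerate((1, 1, 2, 2, 3))]

EXCERPT = ('The chronicler wrote that "the barbarian hordes ruined the land". '
           "The textbook then compares this account with archaeological evidence.")

with_protocol = evaluate(EXCERPT, JURORS, use_attribution=True)
without_protocol = evaluate(EXCERPT, JURORS, use_attribution=False)
print(with_protocol, without_protocol)
```

In this toy run the loaded term appears only inside the quoted source, so masking quotations before judging lowers the aggregated severity. That is the kind of false positive the summary says the Source Attribution Protocol is meant to reduce.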



