SectEval: Evaluating the Latent Sectarian Preferences of Large Language Models

arXiv cs.CL / 3/16/2026

Key Points

  • The paper introduces SectEval, a benchmark with 88 questions in English and Hindi to assess how LLMs handle Sunni and Shia biases.
  • It evaluates 15 top LLMs, including proprietary and open-weight models, and finds language-dependent inconsistencies in their bias.
  • In English, models like DeepSeek-v3 and GPT-4o favored Shia answers, while in Hindi they shifted to Sunni, showing language-driven bias reversals.
  • The study also shows location effects, with Claude-3.5 tailoring answers to Iran or Saudi Arabia, whereas smaller Hindi models tended to stick to Sunni regardless of location; the dataset is available on GitHub.

Abstract

As Large Language Models (LLMs) become a popular source of religious knowledge, it is important to know whether they treat different groups fairly. This study is the first to measure how LLMs handle the differences between the two main sects of Islam: Sunni and Shia. We present SectEval, a benchmark of 88 questions available in both English and Hindi, used to check the bias of 15 top LLMs, both proprietary and open-weight. Our results show a major inconsistency based on language. In English, powerful models such as DeepSeek-v3 and GPT-4o often favored Shia answers. However, when asked the exact same questions in Hindi, these models switched to favoring Sunni answers. This means a user could get completely different religious advice just by changing languages. We also looked at how models react to location. Advanced models such as Claude-3.5 changed their answers to match the user's country, giving Shia answers to a user from Iran and Sunni answers to a user from Saudi Arabia. In contrast, smaller models (especially in Hindi) ignored the user's location and stuck to a Sunni viewpoint. These findings show that AI is not neutral; its religious "truth" changes depending on the language you speak and the country you claim to be from. The dataset is available at https://github.com/secteval/SectEval/
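The core measurement the abstract describes, tallying a model's preferred sect per language and checking whether the majority preference flips between English and Hindi, can be sketched in a few lines. This is a minimal illustration, not the paper's actual evaluation code; the response labels and function names here are hypothetical, and the real benchmark format is on the project's GitHub.

```python
from collections import Counter

def sect_preference(responses):
    """Given a model's per-question answer labels ('sunni', 'shia',
    or 'neutral'), return the majority label and its share."""
    label, n = Counter(responses).most_common(1)[0]
    return label, n / len(responses)

def bias_reversal(en_responses, hi_responses):
    """True when the majority preference flips between English and Hindi,
    the language-dependent inconsistency the study reports."""
    en_label, _ = sect_preference(en_responses)
    hi_label, _ = sect_preference(hi_responses)
    return en_label != hi_label and "neutral" not in (en_label, hi_label)

# Toy labels mirroring the reported pattern for models like GPT-4o:
english = ["shia", "shia", "neutral", "shia"]
hindi = ["sunni", "sunni", "sunni", "neutral"]
print(bias_reversal(english, hindi))  # True: Shia in English, Sunni in Hindi
```

The same tally, run on responses conditioned on a stated user location (e.g. Iran vs. Saudi Arabia), would expose the location effects the study describes for Claude-3.5.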