Redirected, Not Removed: Task-Dependent Stereotyping Reveals the Limits of LLM Alignment
arXiv cs.CL / 4/6/2026
💬 Opinion · Ideas & Deep Analysis · Models & Research
Key Points
- The paper argues that LLM bias cannot be judged from a single benchmark, because stereotyping depends on the specific task format: models shift behavior between explicit decision-making tasks and implicit association tasks.
- It introduces a hierarchical taxonomy of 9 bias types (including caste, linguistic, and geographic axes) and operationalizes them via 7 evaluation tasks designed to capture both overt and subtle forms of bias.
- Auditing 7 commercial and open-weight LLMs with ~45K prompts shows three consistent patterns: task-dependent bias, asymmetric “alignment” that blocks negative traits for marginalized groups while still assigning positive traits to privileged groups, and particularly strong stereotyping on under-studied bias axes.
- The authors conclude that current alignment practices and single-slice audits can mask representational harm by mischaracterizing how bias manifests across different prompt/task contexts.
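The task-dependent pattern described above can be illustrated with a minimal, self-contained sketch. This is not the paper's actual audit harness: the prompt templates, group and trait placeholders, and the `query_model` stub (which imitates the "redirected, not removed" behavior of an aligned model) are all hypothetical stand-ins for a real LLM API call.

```python
from collections import Counter

# Placeholder demographic groups and traits -- hypothetical, for illustration only.
GROUPS = ["group_A", "group_B"]
TRAITS = ["lazy", "brilliant"]  # one negative, one positive trait


def make_prompts(group, trait):
    """Build an explicit decision prompt and an implicit association
    prompt for the same (group, trait) pair."""
    explicit = f"Should the {trait} label be applied to {group}? Answer yes or no."
    implicit = f"Complete the sentence: people from {group} are often ___ ({trait}?)"
    return {"explicit": explicit, "implicit": implicit}


def query_model(prompt):
    """Hypothetical stub standing in for an LLM call. It refuses overt
    negative-trait decisions but still echoes the stereotyped trait in
    implicit completions -- the 'redirected, not removed' pattern."""
    if "yes or no" in prompt and "lazy" in prompt:
        return "I can't make that generalization."  # alignment blocks the overt form
    if "Complete the sentence" in prompt:
        return prompt.split("(")[1].rstrip("?)")  # echoes the trait back
    return "yes"


def audit():
    """Count how often each task format elicits the stereotyped trait."""
    hits = Counter()
    for group in GROUPS:
        for trait in TRAITS:
            for task, prompt in make_prompts(group, trait).items():
                if trait in query_model(prompt):
                    hits[task] += 1
    return hits


counts = audit()
```

With this stub, `counts["implicit"]` exceeds `counts["explicit"]`: an audit that only measured the explicit task would report the model as unbiased, which is exactly the masking effect a single-slice audit produces.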
Related Articles

It turns out this neural network generates images for free. I found out by accident.
Dev.to

Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.
Dev.to

Three-Layer Memory Governance: Core, Provisional, Private
Dev.to

I Researched AI Prompting So You Don’t Have To
Dev.to

Top AI Tools Every Growing Business Should Use in 2026
Dev.to