JUBAKU: An Adversarial Benchmark for Exposing Culturally Grounded Stereotypes in Japanese LLMs
arXiv cs.CL / 3/24/2026
Key Points
- The paper introduces JUBAKU, a Japanese-culture-specific adversarial benchmark designed to detect culturally grounded stereotypes that are missed by translation-based adaptations of English bias tests.
- JUBAKU covers ten cultural categories and uses dialogue scenarios hand-crafted by native Japanese annotators to deliberately surface latent social biases in Japanese LLM behavior.
- In evaluations of nine Japanese LLMs (plus three models adapted from English), most systems showed clear bias on JUBAKU, averaging 23% accuracy against a 50% random-choice baseline, indicating systematic rather than random errors, even though the same models performed better on other benchmarks; a sketch of the implied scoring setup follows these points.
- Human annotators achieved 91% accuracy at identifying unbiased responses, supporting the benchmark’s reliability and adversarial effectiveness.
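The 50% random baseline implies a binary scoring setup: for each dialogue scenario, the model under test chooses between an unbiased and a stereotyped response. The sketch below illustrates that setup under stated assumptions only; the `Scenario` class and `judge_model` stub are hypothetical stand-ins, since the paper's actual data format and evaluation harness are not described in this summary.

```python
# Minimal sketch of the pairwise scoring implied by the summary: a model picks
# between a stereotyped and a non-stereotyped continuation for each dialogue
# scenario, and accuracy is compared against the 50% random-choice baseline.
# All identifiers (Scenario, judge_model) are hypothetical; JUBAKU's real
# harness and data format are not specified here.
import random
from dataclasses import dataclass


@dataclass
class Scenario:
    dialogue: str  # culturally grounded dialogue context
    unbiased: str  # response free of the targeted stereotype
    biased: str    # response expressing the stereotype


def judge_model(dialogue: str, options: list[str]) -> int:
    """Stand-in for an LLM call: returns the index of the chosen response.
    A real harness would prompt the model under test with the dialogue and
    both candidate responses here."""
    return random.randrange(len(options))  # random guessing ~= 50% accuracy


def accuracy(scenarios: list[Scenario]) -> float:
    correct = 0
    for s in scenarios:
        options = [s.unbiased, s.biased]
        random.shuffle(options)  # avoid positional bias in the choice
        pick = judge_model(s.dialogue, options)
        correct += options[pick] == s.unbiased
    return correct / len(scenarios)


if __name__ == "__main__":
    demo = [Scenario("...", "unbiased reply", "biased reply") for _ in range(1000)]
    print(f"accuracy vs 50% random baseline: {accuracy(demo):.1%}")
```

Under this reading, the reported 23% average is well below chance, which is what makes the benchmark adversarial: models are not merely guessing but are systematically drawn to the stereotyped option.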