BANGLASOCIALBENCH: A Benchmark for Evaluating Sociopragmatic and Cultural Alignment of LLMs in Bangladeshi Social Interaction

arXiv cs.CL / 3/18/2026



Key Points

BANGLASOCIALBENCH is the first benchmark designed to evaluate sociopragmatic competence in Bangla by focusing on context-dependent language use rather than factual recall.
It spans three domains—Bangla Address Terms, Kinship Reasoning, and Social Customs—and comprises 1,719 culturally grounded instances written and verified by native Bangla speakers.
Twelve contemporary LLMs were evaluated in a zero-shot setting, revealing systematic patterns of cultural misalignment such as overly formal address forms and failure to recognize multiple socially acceptable pronouns.
The results show that sociopragmatic failures are often structured rather than random, underscoring persistent limitations in how current LLMs infer and apply culturally appropriate language in realistic Bangladeshi social interactions.

Abstract

Large Language Models have demonstrated strong multilingual fluency, yet fluency alone does not guarantee socially appropriate language use. In high-context languages, communicative competence requires sensitivity to social hierarchy, relational roles, and interactional norms that are encoded directly in everyday language. Bangla exemplifies this challenge through its three-tiered pronominal system, kinship-based addressing, and culturally embedded social customs. We introduce BANGLASOCIALBENCH, the first benchmark designed to evaluate sociopragmatic competence in Bangla through context-dependent language use rather than factual recall. The benchmark spans three domains: Bangla Address Terms, Kinship Reasoning, and Social Customs, and consists of 1,719 culturally grounded instances written and verified by native Bangla speakers. We evaluate twelve contemporary LLMs in a zero-shot setting and observe systematic patterns of cultural misalignment. Models frequently default to overly formal address forms, fail to recognize multiple socially acceptable address pronouns, and conflate kinship terminology across religious contexts. Our findings show that sociopragmatic failures are often structured and non-random, revealing persistent limitations in how current LLMs infer and apply culturally appropriate language use in realistic Bangladeshi social interactions.
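To make the idea of "multiple socially acceptable address pronouns" concrete, here is a minimal scoring sketch. It is not the benchmark's actual evaluation code: the `Instance` type, the `judge` and `summarize` helpers, the romanized pronoun labels, and the toy contexts are all assumptions made for illustration. It assumes each instance carries a set of acceptable pronouns from Bangla's three tiers (তুই/tui, তুমি/tumi, আপনি/apni) and that a model's answer has already been reduced to one of those labels.

```python
from collections import Counter
from dataclasses import dataclass

# Assumed formality ranking of the three second-person pronouns
# (higher = more formal). Romanized labels are used for readability.
FORMALITY = {"tui": 0, "tumi": 1, "apni": 2}


@dataclass(frozen=True)
class Instance:
    """Hypothetical benchmark item: a social context plus every
    pronoun a native speaker would accept in that context."""
    context: str
    acceptable: frozenset


def judge(instance: Instance, predicted: str) -> str:
    """Classify one prediction against the acceptable set."""
    if predicted not in FORMALITY:
        return "invalid"
    if predicted in instance.acceptable:
        return "correct"
    rank = FORMALITY[predicted]
    ranks = [FORMALITY[p] for p in instance.acceptable]
    if rank > max(ranks):
        return "too_formal"
    if rank < min(ranks):
        return "too_informal"
    return "mismatch"  # lies between two acceptable tiers


def summarize(instances, predictions) -> dict:
    """Return the fraction of predictions falling under each verdict."""
    verdicts = Counter(judge(i, p) for i, p in zip(instances, predictions))
    total = sum(verdicts.values())
    return {verdict: count / total for verdict, count in verdicts.items()}


# Toy data invented for this sketch; not drawn from the benchmark.
toy_instances = [
    Instance("speaking to a close childhood friend", frozenset({"tui", "tumi"})),
    Instance("speaking to an elderly stranger", frozenset({"apni"})),
    Instance("speaking to a younger sibling", frozenset({"tui", "tumi"})),
    Instance("speaking to a teacher", frozenset({"apni"})),
]
toy_predictions = ["apni", "apni", "tumi", "tui"]

report = summarize(toy_instances, toy_predictions)
```

Scoring against a set rather than a single gold label keeps a model from being penalized when more than one form is acceptable, and splitting errors by direction makes a systematic drift toward formal forms visible rather than folding it into a single accuracy number.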
