RoleConflictBench: A Benchmark of Role Conflict Scenarios for Evaluating LLMs' Contextual Sensitivity

arXiv cs.CL / 4/20/2026

💬 OpinionIdeas & Deep AnalysisModels & Research

Key Points

  • RoleConflictBench is a new benchmark that evaluates how well LLMs respond to role-conflict social dilemmas by testing contextual sensitivity.
  • The benchmark uses “situational urgency” as a decision-making constraint to create objectively comparable scenarios.
  • It includes over 13,000 realistic cases covering 65 roles across five social domains, generated via a three-stage pipeline with systematically varied urgency.
  • Analysis of 10 LLMs shows significant deviations from the objective contextual baseline: models largely follow learned preferences for particular roles rather than dynamic situational cues.

Abstract

People often encounter role conflicts -- social dilemmas where the expectations of multiple roles clash and cannot be simultaneously fulfilled. As large language models (LLMs) increasingly navigate these social dynamics, a critical research question emerges. When faced with such dilemmas, do LLMs prioritize dynamic contextual cues or the learned preferences? To address this, we introduce RoleConflictBench, a novel benchmark designed to measure the contextual sensitivity of LLMs in role conflict scenarios. To enable objective evaluation within this subjective domain, we employ situational urgency as a constraint for decision-making. We construct the dataset through a three-stage pipeline that generates over 13,000 realistic scenarios across 65 roles in five social domains by systematically varying the urgency of competing situations. This controlled setup enables us to quantitatively measure contextual sensitivity, determining whether model decisions align with the situational contexts or are overridden by the learned role preferences. Our analysis of 10 LLMs reveals that models substantially deviate from this objective baseline. Instead of responding to dynamic contextual cues, their decisions are predominantly governed by the preferences toward specific social roles.