Between Rules and Reality: On the Context Sensitivity of LLM Moral Judgment

arXiv cs.AI / 3/25/2026


Key Points

  • The paper argues that LLM studies of moral judgment have overlooked the key role of context in human moral decisions, motivating a more context-sensitive evaluation setup.
  • It introduces the Contextual MoralChoice dataset, which applies systematic contextual variations (consequentialist, emotional, and relational) to moral dilemmas known to shift human judgments.
  • Across 22 evaluated LLMs, the study finds nearly all are context-sensitive and often shift toward rule-violating behavior under certain contexts.
  • The authors compare model behavior with human survey results and find that humans and models are most strongly affected by different contextual variations, meaning base-case alignment does not guarantee contextual alignment.
  • To address this, the paper proposes activation steering to reliably increase or decrease a model’s contextual sensitivity, aiming to better control how models respond across contexts.

Abstract

A human's moral decision depends heavily on the context. Yet research on LLM morality has largely studied fixed scenarios. We address this gap by introducing Contextual MoralChoice, a dataset of moral dilemmas with systematic contextual variations known from moral psychology to shift human judgment: consequentialist, emotional, and relational. Evaluating 22 LLMs, we find that nearly all models are context-sensitive, shifting their judgments toward rule-violating behavior. Comparing with a human survey, we find that models and humans are most strongly affected by different contextual variations, and that a model aligned with human judgments in the base case is not necessarily aligned in its contextual sensitivity. This raises the question of controlling contextual sensitivity, which we address with an activation steering approach that can reliably increase or decrease a model's contextual sensitivity.
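The activation steering approach mentioned above can be illustrated with a minimal sketch. The code below is a hypothetical toy example, not the paper's actual implementation: it uses a single linear layer as a stand-in for a transformer block, and a random unit vector as a stand-in for a "contextual sensitivity" direction, which in practice would be estimated from the model's activations (e.g. as a mean difference between contextual and base-case prompts). A PyTorch forward hook then adds a scaled copy of that direction to the block's output; positive scaling strengthens the steered behavior, negative scaling suppresses it.

```python
import torch
import torch.nn as nn

# Toy stand-in for one transformer block; names and shapes are
# illustrative assumptions, not the paper's models or code.
torch.manual_seed(0)
block = nn.Linear(8, 8)

# Hypothetical steering direction in activation space, normalized to
# unit length. In a real setup this would be estimated from paired
# activations on contextual vs. base-case prompts.
steering_vector = torch.randn(8)
steering_vector = steering_vector / steering_vector.norm()

def make_hook(alpha):
    # Forward hook: returning a tensor replaces the module's output.
    # alpha > 0 pushes activations along the direction, alpha < 0 away.
    def hook(module, inputs, output):
        return output + alpha * steering_vector
    return hook

x = torch.randn(1, 8)
baseline = block(x)

handle = block.register_forward_hook(make_hook(alpha=4.0))
steered = block(x)
handle.remove()

# The intervention moves the output exactly along the chosen direction.
delta = (steered - baseline).squeeze(0)
print(torch.allclose(delta, 4.0 * steering_vector, atol=1e-6))  # True
```

In the paper's setting, the same additive intervention would be applied at a chosen layer of each evaluated LLM, with the sign and magnitude of the coefficient controlling whether the model's contextual sensitivity is increased or decreased.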