GAIN: A Benchmark for Goal-Aligned Decision-Making of Large Language Models under Imperfect Norms

arXiv cs.CL / 3/20/2026

Key Points

  • GAIN is introduced as a benchmark to evaluate how large language models balance adherence to social norms against business goals across real-world domains.
  • The benchmark defines five pressure types: Goal Alignment, Risk Aversion, Emotional/Ethical Appeal, Social/Authoritative Influence, and Personal Incentive.
  • It includes 1,200 scenarios across four domains (hiring, customer support, advertising, and finance) to systematically probe norm-goal conflicts and the factors that drive decisions.
  • Findings indicate that advanced LLMs often mirror human decision-making patterns, but under Personal Incentive pressure they diverge from humans, showing a strong tendency to adhere to norms rather than deviate.
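
One way to make the last point concrete is to compare norm-adherence rates per pressure type between a model and a human baseline. The sketch below is only an illustration under stated assumptions: the function name `adherence_rate_by_pressure`, the boolean "adhered" encoding, and the toy records are all hypothetical and are not taken from the paper, whose actual metric may differ.

```python
from collections import defaultdict

# The five pressure types named in the benchmark description.
PRESSURE_TYPES = (
    "Goal Alignment",
    "Risk Aversion",
    "Emotional/Ethical Appeal",
    "Social/Authoritative Influence",
    "Personal Incentive",
)


def adherence_rate_by_pressure(decisions):
    """Return the fraction of norm-adhering decisions per pressure type.

    `decisions` is an iterable of (pressure_type, adhered) pairs, where
    `adhered` is True if the norm was followed. This encoding is an
    assumption made for illustration.
    """
    totals = defaultdict(int)
    adhered_counts = defaultdict(int)
    for pressure, adhered in decisions:
        if pressure not in PRESSURE_TYPES:
            raise ValueError(f"unknown pressure type: {pressure!r}")
        totals[pressure] += 1
        adhered_counts[pressure] += int(adhered)
    return {p: adhered_counts[p] / totals[p] for p in totals}


# Toy records invented for this example; they are NOT results from the paper.
toy_model = [
    ("Personal Incentive", True),
    ("Personal Incentive", True),
    ("Goal Alignment", False),
    ("Goal Alignment", True),
]
toy_human = [
    ("Personal Incentive", False),
    ("Personal Incentive", True),
    ("Goal Alignment", False),
    ("Goal Alignment", True),
]

model_rates = adherence_rate_by_pressure(toy_model)
human_rates = adherence_rate_by_pressure(toy_human)

# A positive gap means the model adheres to norms more often than humans do.
gaps = {p: model_rates[p] - human_rates[p] for p in model_rates}
```

With real annotations in place of the toy records, a large positive gap under Personal Incentive alongside small gaps elsewhere would correspond to the pattern the summary describes.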

Abstract

We introduce GAIN (Goal-Aligned Decision-Making under Imperfect Norms), a benchmark designed to evaluate how large language models (LLMs) balance adherence to norms against business goals. Existing benchmarks typically focus on abstract scenarios rather than real-world business applications. Furthermore, they provide limited insight into the factors influencing LLM decision-making. These limitations restrict their ability to measure models' adaptability to complex, real-world norm-goal conflicts. In GAIN, models receive a goal, a specific situation, a norm, and additional contextual pressures. These pressures, explicitly designed to encourage potential norm deviations, are a unique feature that differentiates GAIN from other benchmarks, enabling a systematic evaluation of the factors influencing decision-making. We define five types of pressures: Goal Alignment, Risk Aversion, Emotional/Ethical Appeal, Social/Authoritative Influence, and Personal Incentive. The benchmark comprises 1,200 scenarios across four domains: hiring, customer support, advertising, and finance. Our experiments show that advanced LLMs frequently mirror human decision-making patterns. However, when Personal Incentive pressure is present, they diverge significantly, showing a strong tendency to adhere to norms rather than deviate from them.
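
The abstract describes each benchmark item as a goal, a situation, a norm, and contextual pressure. A minimal sketch of how such an item might be represented and rendered into a prompt is shown below. The `Scenario` class, its field names, the `build_prompt` helper, the prompt wording, and the sample hiring scenario are all hypothetical; the sketch also assumes one pressure per scenario, which the abstract does not confirm.

```python
from dataclasses import dataclass

# Domains and pressure types as listed in the abstract.
DOMAINS = ("hiring", "customer support", "advertising", "finance")
PRESSURE_TYPES = (
    "Goal Alignment",
    "Risk Aversion",
    "Emotional/Ethical Appeal",
    "Social/Authoritative Influence",
    "Personal Incentive",
)


@dataclass(frozen=True)
class Scenario:
    """Hypothetical container for one GAIN-style item (field names assumed)."""

    domain: str
    goal: str
    situation: str
    norm: str
    pressure_type: str
    pressure: str

    def __post_init__(self):
        # Reject values outside the categories named in the abstract.
        if self.domain not in DOMAINS:
            raise ValueError(f"unknown domain: {self.domain!r}")
        if self.pressure_type not in PRESSURE_TYPES:
            raise ValueError(f"unknown pressure type: {self.pressure_type!r}")


def build_prompt(scenario: Scenario) -> str:
    """Render a scenario as plain text; the layout is an illustrative guess."""
    return "\n".join(
        [
            f"Goal: {scenario.goal}",
            f"Situation: {scenario.situation}",
            f"Norm: {scenario.norm}",
            f"Pressure ({scenario.pressure_type}): {scenario.pressure}",
            "Decide whether to follow the norm or deviate from it, and explain why.",
        ]
    )


# Invented example; it does not come from the benchmark.
example = Scenario(
    domain="hiring",
    goal="Fill the open engineering role within two weeks.",
    situation="A strong candidate missed the application deadline by one day.",
    norm="Late applications are not reviewed.",
    pressure_type="Goal Alignment",
    pressure="Reviewing this application would help meet the hiring deadline.",
)
prompt = build_prompt(example)
```

Keeping the pressure type as an explicit field makes it straightforward to group model decisions by pressure when analyzing which factors drive norm deviations.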