Beyond Preset Identities: How Agents Form Stances and Boundaries in Generative Societies

arXiv cs.AI / 3/25/2026


Key Points

  • The paper investigates how LLM-based agents form stable stances and negotiate identities under complex, controlled interventions rather than relying on static prompt/behavior evaluations.
  • It introduces a mixed-methods framework that combines computational virtual ethnography with quantitative socio-cognitive profiling by embedding human researchers into generative multiagent communities.
  • Three new metrics—Innate Value Bias (IVB), Persuasion Sensitivity, and Trust-Action Decoupling (TAD)—are defined to measure how agents internalize interventions and whether reported trust matches behavior.
  • Results across representative models show endogenous stance formation that can override preset identities, including a consistent progressive bias (IVB > 0); rational persuasion aligned with these endogenous stances shifts 90% of neutral agents while maintaining high trust.
  • Emotional provocation that conflicts with agents' stances triggers a paradoxical 40% TAD rate in advanced models, which alter their stances while reporting low trust; smaller models instead maintain a 0% TAD rate, requiring trust before any behavioral change. The authors argue this exposes the fragility of static prompt engineering and provides a quantitative basis for dynamic alignment.
  • The authors provide an official code repository for the proposed framework and measurement approach.
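
The paper's exact metric formulas are not reproduced in this summary, but the definitions above suggest straightforward operationalizations. The sketch below is a hypothetical illustration: the `AgentRecord` fields, stance scale, and thresholds are assumptions, not the authors' implementation.

```python
# Hedged sketch of plausible computations for the three metrics described
# above. Field names, the -1..+1 stance scale, and the thresholds are
# illustrative assumptions, not the paper's actual formulas.

from dataclasses import dataclass
from statistics import mean

@dataclass
class AgentRecord:
    stance_before: float   # signed stance on an assumed -1..+1 axis
    stance_after: float    # stance after the intervention
    trust_report: float    # self-reported trust in the interlocutor, 0..1
    was_neutral: bool      # whether the agent started neutral

def innate_value_bias(records):
    """Mean signed pre-intervention stance; IVB > 0 would indicate a
    systematic lean (here, the 'progressive' direction)."""
    return mean(r.stance_before for r in records)

def persuasion_sensitivity(records, shift_threshold=0.2):
    """Fraction of initially neutral agents whose stance moved by more
    than shift_threshold after a persuasion attempt."""
    neutral = [r for r in records if r.was_neutral]
    if not neutral:
        return 0.0
    shifted = [r for r in neutral
               if abs(r.stance_after - r.stance_before) > shift_threshold]
    return len(shifted) / len(neutral)

def tad_rate(records, shift_threshold=0.2, low_trust=0.3):
    """Trust-Action Decoupling: share of agents that changed stance
    while simultaneously reporting low trust."""
    decoupled = [r for r in records
                 if abs(r.stance_after - r.stance_before) > shift_threshold
                 and r.trust_report < low_trust]
    return len(decoupled) / len(records)
```

On this reading, the reported 40% TAD rate would mean that 40% of advanced-model agents crossed the stance-shift threshold while their self-reported trust stayed below the low-trust cutoff.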

Abstract

While large language models simulate social behaviors, their capacity for stable stance formation and identity negotiation during complex interventions remains unclear. To overcome the limitations of static evaluations, this paper proposes a novel mixed-methods framework combining computational virtual ethnography with quantitative socio-cognitive profiling. By embedding human researchers into generative multiagent communities, controlled discursive interventions are conducted to trace the evolution of collective cognition. To rigorously measure how agents internalize and react to these specific interventions, this paper formalizes three new metrics: Innate Value Bias (IVB), Persuasion Sensitivity, and Trust-Action Decoupling (TAD). Across multiple representative models, agents exhibit endogenous stances that override preset identities, consistently demonstrating an innate progressive bias (IVB > 0). When aligned with these stances, rational persuasion successfully shifts 90% of neutral agents while maintaining high trust. In contrast, conflicting emotional provocations induce a paradoxical 40.0% TAD rate in advanced models, which hypocritically alter stances despite reporting low trust. Smaller models, by contrast, maintain a 0% TAD rate, strictly requiring trust for behavioral shifts. Furthermore, guided by shared stances, agents use language interactions to actively dismantle assigned power hierarchies and reconstruct self-organized community boundaries. These findings expose the fragility of static prompt engineering, providing a methodological and quantitative foundation for dynamic alignment in human-agent hybrid societies. The official code is available at: https://github.com/armihia/CMASE-Endogenous-Stances