Improving Role Consistency in Multi-Agent Collaboration via Quantitative Role Clarity

arXiv cs.AI / 4/6/2026

Key Points

  • The paper targets a key failure mode in LLM-driven multi-agent systems where agents disobey role specifications and overstep their assigned responsibilities.
  • It introduces a quantitative “role clarity” formulation: a role assignment matrix records the semantic similarity between each agent's behavior trajectory and each role description, and alignment is measured by the Frobenius norm of a role clarity matrix derived from it.
  • The authors use this role clarity matrix as a regularizer during lightweight fine-tuning to keep agents behaviorally consistent with their roles.
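  • Experiments on the ChatDev multi-agent system show role overstepping rates falling from 46.4% to 8.4% with Qwen and from 43.4% to 0.2% with Llama, role clarity scores rising from 0.5328 to 0.9097 and from 0.5007 to 0.8530, and smaller gains in task success rate (0.6769 to 0.6909 and 0.6174 to 0.6763).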

Abstract

In large language model (LLM)-driven multi-agent systems, “disobey role specification” (failure to adhere to the defined responsibilities and constraints of an assigned role, potentially leading an agent to behave like another) is a major failure mode \cite{DBLP:journals/corr/abs-2503-13657}. To address this issue, we propose a quantitative role clarity measure to improve role consistency. First, we construct a role assignment matrix S(\phi)=[s_{ij}(\phi)], where s_{ij}(\phi) is the semantic similarity between the i-th agent's behavior trajectory and the j-th agent's role description. We then define the role clarity matrix M(\phi) as \text{softmax}(S(\phi))-I, where \text{softmax}(S(\phi)) is the row-wise softmax of S(\phi) and I is the identity matrix. The Frobenius norm of M(\phi) quantifies the alignment between agents' role descriptions and their behavior trajectories. Moreover, we employ the role clarity matrix as a regularizer during lightweight fine-tuning to improve role consistency, thereby improving end-to-end task performance. Experiments on the ChatDev multi-agent system show that our method substantially improves role consistency and task performance: with Qwen and Llama, the role overstepping rate decreases from 46.4% to 8.4% and from 43.4% to 0.2%, respectively; the role clarity score increases from 0.5328 to 0.9097 and from 0.5007 to 0.8530, respectively; and the task success rate increases from 0.6769 to 0.6909 and from 0.6174 to 0.6763, respectively.
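
The construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Several pieces are assumptions introduced here for illustration: trajectories and role descriptions are taken to be already embedded as fixed-length vectors, cosine similarity stands in for the unspecified semantic similarity, and the `temperature` parameter is hypothetical (its default of 1.0 reproduces the plain row-wise softmax from the abstract). The sketch computes S, the row-wise softmax, M = softmax(S) - I, and the Frobenius norm of M.

```python
import math

def cosine(u, v):
    """Cosine similarity; an assumed stand-in for the paper's semantic similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def row_softmax(matrix):
    """Apply a numerically stable softmax to each row independently."""
    out = []
    for row in matrix:
        m = max(row)
        exps = [math.exp(x - m) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def role_clarity_matrix(traj_emb, role_emb, temperature=1.0):
    """Return M = softmax(S) - I, where S[i][j] compares agent i's
    trajectory embedding with role j's description embedding.
    `temperature` is a hypothetical knob, not taken from the abstract."""
    n = len(traj_emb)
    S = [[cosine(traj_emb[i], role_emb[j]) / temperature for j in range(n)]
         for i in range(n)]
    P = row_softmax(S)
    return [[P[i][j] - (1.0 if i == j else 0.0) for j in range(n)]
            for i in range(n)]

def frobenius_norm(matrix):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in matrix for x in row))

# Toy embeddings (made up for illustration): three orthogonal roles.
roles = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
aligned = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Agent 0 oversteps and behaves like role 1.
confused = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

aligned_norm = frobenius_norm(role_clarity_matrix(aligned, roles, temperature=0.1))
confused_norm = frobenius_norm(role_clarity_matrix(confused, roles, temperature=0.1))
print(aligned_norm < confused_norm)
```

In this toy setting the norm is near zero when every agent's trajectory matches its own role and grows when an agent drifts toward another role, because softmax(S) moves away from the identity matrix. How the paper maps this norm to its reported role clarity score, and the exact form of the fine-tuning regularizer, are not given in the abstract; adding a weighted squared norm of M to the task loss would be one plausible form, stated here only as an assumption.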