CoE: Collaborative Entropy for Uncertainty Quantification in Agentic Multi-LLM Systems

arXiv cs.AI, March 31, 2026


Key Points

  • The paper identifies a limitation in current uncertainty estimation for multi-LLM systems: most methods measure uncertainty within individual models but fail to capture semantic disagreement across models in the collaboration.
  • It introduces Collaborative Entropy (CoE), a unified information-theoretic metric defined over a shared semantic cluster space that combines intra-model semantic entropy with inter-model divergence to the ensemble mean.
  • CoE is positioned as a system-level uncertainty measure (not a weighted ensemble predictor) designed to quantify collaborative confidence and disagreement among multiple LLMs.
  • The authors analyze key theoretical properties of CoE, including non-negativity and zero uncertainty under perfect semantic consensus, and study behavior in edge cases like per-model collapse to delta distributions.
  • Experiments on TriviaQA and SQuAD using LLaMA-3.1-8B-Instruct, Qwen-2.5-7B-Instruct, and Mistral-7B-Instruct show that CoE improves uncertainty estimation over standard entropy and divergence baselines, with larger gains as more heterogeneous models are added; the authors also demonstrate a training-free, CoE-guided coordination heuristic.

Abstract

Uncertainty estimation in multi-LLM systems remains largely single-model-centric: existing methods quantify uncertainty within each model but do not adequately capture semantic disagreement across models. To address this gap, we propose Collaborative Entropy (CoE), a unified information-theoretic metric for semantic uncertainty in multi-LLM collaboration. CoE is defined on a shared semantic cluster space and combines two components: intra-model semantic entropy and inter-model divergence to the ensemble mean. CoE is not a weighted ensemble predictor; it is a system-level uncertainty measure that characterizes collaborative confidence and disagreement. We analyze several core properties of CoE, including non-negativity, zero uncertainty under perfect semantic consensus, and the behavior of CoE when individual models collapse to delta distributions. These results clarify when reducing per-model uncertainty is sufficient and when residual inter-model disagreement remains. We also present a simple CoE-guided, training-free post-hoc coordination heuristic as a practical application of the metric. Experiments on TriviaQA and SQuAD with LLaMA-3.1-8B-Instruct, Qwen-2.5-7B-Instruct, and Mistral-7B-Instruct show that CoE provides stronger uncertainty estimation than standard entropy- and divergence-based baselines, with gains becoming larger as additional heterogeneous models are introduced. Overall, CoE offers a useful uncertainty-aware perspective on multi-LLM collaboration.
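To make the two-component structure concrete, here is a minimal sketch of how a CoE-style metric could be computed from per-model distributions over a shared semantic cluster space. This is an illustrative reconstruction, not the paper's implementation: the function name `collaborative_entropy`, the averaging scheme, and the mixing weight `alpha` are all assumptions; the paper only specifies that CoE combines intra-model semantic entropy with inter-model divergence to the ensemble mean.

```python
import math

def entropy(p):
    """Shannon entropy of a distribution over semantic clusters."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def kl(p, q):
    """KL divergence D(p || q); assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def collaborative_entropy(dists, alpha=0.5):
    """Hypothetical CoE sketch (assumed form, not the paper's exact definition).

    dists: list of per-model probability distributions over the same
           shared semantic cluster space (one distribution per model).
    alpha: assumed weight balancing the intra- and inter-model terms.
    """
    m = len(dists)
    # Ensemble mean distribution over the shared cluster space.
    mean = [sum(col) / m for col in zip(*dists)]
    # Intra-model term: average semantic entropy across models.
    intra = sum(entropy(p) for p in dists) / m
    # Inter-model term: average divergence of each model to the mean.
    inter = sum(kl(p, mean) for p in dists) / m
    return alpha * intra + (1 - alpha) * inter
```

The sketch exhibits the properties highlighted in the abstract: it is non-negative, it is zero when all models place their full mass on the same cluster (perfect semantic consensus), and when each model collapses to a delta distribution the intra-model term vanishes while the inter-model term still registers residual disagreement.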