Group Cognition Learning: Making Everything Better Through Governed Two-Stage Agents Collaboration
arXiv cs.LG / 5/4/2026
Key Points
- The paper identifies two key issues in centralized multimodal learning: modality dominance (the model focuses on easier signals) and spurious modality coupling (overfitting to incidental cross-modal correlations).
- It proposes Group Cognition Learning (GCL), a governed two-stage collaboration framework applied after modality-specific encoders.
- In Stage 1 (Selective Interaction), a Routing Agent proposes interaction paths and an Auditing Agent applies sample-wise gates to boost exchanges with positive marginal predictive gain while suppressing redundant coupling.
- In Stage 2 (Consensus Formation), a Public-Factor Agent maintains a shared explicit factor and an Aggregation Agent forms final predictions using contribution-aware weighting while keeping modality representations as specialized channels.
- Experiments on CMU-MOSI, CMU-MOSEI, and MIntRec show that GCL reduces both modality dominance and spurious coupling and achieves state-of-the-art results on regression and classification benchmarks; additional analysis experiments support these findings.
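
The two stages described above can be illustrated with a toy sketch. The summary does not give the paper's formulas, so everything here is an assumption made for illustration: the sigmoid gate over a loss difference, the softmax over contribution scores, and all function names (`audit_gate`, `selective_interaction`, `contribution_weights`, `aggregate`) are hypothetical and are not taken from GCL itself.

```python
import math

def audit_gate(loss_without, loss_with, temperature=1.0):
    """Hypothetical sample-wise gate in (0, 1).

    Assumption: marginal predictive gain is measured as the drop in a
    per-sample loss when the cross-modal exchange is enabled.
    """
    gain = loss_without - loss_with
    return 1.0 / (1.0 + math.exp(-gain / temperature))

def selective_interaction(target, source, gate):
    """Add source-modality features to the target, scaled by the gate."""
    return [t + gate * s for t, s in zip(target, source)]

def contribution_weights(scores):
    """Numerically stable softmax over per-modality contribution scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate(predictions, scores):
    """Contribution-aware weighted average of per-modality predictions."""
    weights = contribution_weights(scores)
    return sum(w * p for w, p in zip(weights, predictions))

# Stage 1: an exchange that lowers the loss gets a gate above 0.5,
# one that raises it gets a gate below 0.5.
helpful = audit_gate(loss_without=0.9, loss_with=0.4)
redundant = audit_gate(loss_without=0.4, loss_with=0.9)
text_features = selective_interaction([1.0, 2.0], [4.0, 4.0], helpful)

# Stage 2: modalities with higher contribution scores get more weight.
final = aggregate(predictions=[0.8, 0.2, 0.5], scores=[2.0, 0.0, 1.0])
```

In this sketch the gate only rescales an exchange and never removes a modality's own features, which loosely mirrors the summary's point that modality representations remain specialized channels.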