Trace-Level Analysis of Information Contamination in Multi-Agent Systems

arXiv cs.AI / 5/1/2026

Key Points

  • The paper studies how uncertainty in multi-agent workflows that reason over heterogeneous artifacts (e.g., PDFs, spreadsheets, slide decks) can “contaminate” agent behavior by altering decomposition and routing decisions.
  • It treats uncertainty as a controllable variable by injecting structured perturbations into artifact-derived representations and then running fixed workflows with comprehensive logging to measure contamination via trace divergence.
  • Across 614 paired runs on 32 GAIA tasks using three language models, the authors find a decoupling where agent traces can diverge substantially while still recovering correct answers, or remain similar while yielding incorrect outputs.
  • The work identifies three contamination manifestation types—silent semantic corruption, behavioral detours with recovery, and combined structural disruption—and links each to distinct control-flow signatures.
  • The study also evaluates operational costs and explains why common verification guardrails often fail, while contributing a taxonomy and a trace-based framework for detecting and localizing contamination across agent interactions.

Abstract

Reasoning over heterogeneous artifacts (PDFs, spreadsheets, slide decks, etc.) increasingly occurs within structured agent workflows that iteratively extract, transform, and reference external information. In these workflows, uncertainty is not merely an input-quality issue: it can redirect decomposition and routing decisions, reshape intermediate state, and produce qualitatively different execution trajectories. We study this phenomenon by treating uncertainty as a controlled variable: we inject structured perturbations into artifact-derived representations, execute fixed workflows under comprehensive logging, and quantify contamination via trace divergence in plans, tool invocations, and intermediate state. Across 614 paired runs on 32 GAIA tasks with three different language models, we find a decoupling: workflows may diverge substantially yet recover correct answers, or remain structurally similar while producing incorrect outputs. We characterize three manifestation types (silent semantic corruption, behavioral detours with recovery, and combined structural disruption) and their control-flow signatures (rerouting, extended execution, early termination). We measure operational costs and characterize why commonly used verification guardrails fail to intercept contamination. We contribute (i) a formal taxonomy of contamination manifestations in structured workflows, (ii) a trace-based measurement framework for detecting and localizing contamination across agent interactions, and (iii) empirical evidence with implications for targeted verification, defensive design, and cost control.
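The abstract does not specify how trace divergence is computed; one plausible minimal sketch, with all function and tool names hypothetical, is to represent each run as a sequence of tool-invocation steps and compare a baseline trace against a perturbed one via normalized edit distance:

```python
def edit_distance(a, b):
    # Classic Levenshtein dynamic program over two step sequences.
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,      # deletion
                           dp[i][j - 1] + 1,      # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[m][n]

def trace_divergence(baseline, perturbed):
    """Normalized edit distance (0.0 = identical, 1.0 = fully divergent)
    between two traces of (tool, argument-summary) steps."""
    denom = max(len(baseline), len(perturbed)) or 1
    return edit_distance(baseline, perturbed) / denom

# Hypothetical example: a perturbed run reroutes through an extra search step.
base = [("parse_pdf", "report.pdf"), ("extract_table", "p3"), ("answer", "")]
pert = [("parse_pdf", "report.pdf"), ("web_search", "revenue 2023"),
        ("extract_table", "p3"), ("answer", "")]
print(trace_divergence(base, pert))  # 0.25: one inserted step out of four
```

Under this sketch, the paper's "decoupling" corresponds to cases where this score is high yet final answers match, or near zero while answers differ; the real framework presumably also compares plans and intermediate state, not only tool calls.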