AI Navigate

OrgForge: A Multi-Agent Simulation Framework for Verifiable Synthetic Corporate Corpora

arXiv cs.CL / 3/17/2026

Key Points

  • OrgForge is an open-source multi-agent simulation framework that enables the creation of verifiable synthetic corpora to evaluate retrieval-augmented generation (RAG) pipelines.
  • It enforces a strict physics-cognition boundary: a deterministic Python engine maintains a SimEvent ground truth bus, while LLMs generate only surface prose. An actor-local clock keeps timestamps causally consistent across Slack threads, Jira tickets, Confluence pages, Git pull requests, and emails.
  • Every generated artifact is traceable to a shared, immutable event log. A causal chain tracking subsystem accumulates cross-artifact evidence graphs per incident, and a recurrence detector identifies repeated failure classes.
  • OrgForge is MIT-licensed and runs configurable N-day simulations. Graph-based subsystems govern organizational behavior independently of any LLM, and an email engine routes messages through gated causal chains with probabilistic drop simulation.
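
The actor-local clock idea can be illustrated with a minimal sketch. This is not OrgForge's actual code: the class names `SimEvent` and `EventBus`, every field name, and the `emit` signature are hypothetical, chosen only to show how deriving each timestamp from the actor's own clock and from the causing event keeps the timeline causally ordered, instead of sampling a time independently per document.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class SimEvent:
    """One immutable ground-truth record (all field names are hypothetical)."""
    event_id: int
    actor: str
    artifact_type: str  # e.g. "slack", "jira", "email"
    timestamp: datetime
    caused_by: Optional[int] = None


class EventBus:
    """Append-only event log with one clock per actor (illustrative only)."""

    def __init__(self, start: datetime) -> None:
        self._start = start
        self._log: List[SimEvent] = []
        self._actor_clock: Dict[str, datetime] = {}

    def emit(self, actor: str, artifact_type: str, duration: timedelta,
             caused_by: Optional[int] = None) -> SimEvent:
        # An actor cannot act before finishing their previous action...
        earliest = self._actor_clock.get(actor, self._start)
        # ...nor before the event that caused this one has happened.
        if caused_by is not None:
            earliest = max(earliest, self._log[caused_by].timestamp)
        timestamp = earliest + duration
        event = SimEvent(len(self._log), actor, artifact_type, timestamp, caused_by)
        self._log.append(event)
        self._actor_clock[actor] = timestamp
        return event

    @property
    def log(self) -> Tuple[SimEvent, ...]:
        # Expose a read-only snapshot so callers cannot rewrite history.
        return tuple(self._log)


bus = EventBus(datetime(2026, 1, 5, 9, 0))
alert = bus.emit("alice", "slack", timedelta(minutes=5))
ticket = bus.emit("bob", "jira", timedelta(minutes=10), caused_by=alert.event_id)
reply = bus.emit("alice", "email", timedelta(minutes=1), caused_by=ticket.event_id)
```

In this sketch, Bob's ticket cannot be stamped earlier than Alice's Slack message, and Alice's reply cannot precede the ticket, because each timestamp is computed from prior events rather than drawn at random.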

Abstract

Evaluating retrieval-augmented generation (RAG) pipelines requires corpora where ground truth is knowable, temporally structured, and cross-artifact: properties that real-world datasets rarely provide cleanly. Existing resources such as the Enron corpus carry legal ambiguity, demographic skew, and no structured ground truth. Purely LLM-generated synthetic data solves the legal problem but introduces a subtler one: the generating model cannot be prevented from hallucinating facts that contradict one another across documents. We present OrgForge, an open-source multi-agent simulation framework that enforces a strict physics-cognition boundary: a deterministic Python engine maintains a SimEvent ground truth bus; large language models generate only surface prose, constrained by validated proposals. An actor-local clock enforces causal timestamp correctness across all artifact types, eliminating the class of timeline inconsistencies that arise when timestamps are sampled independently per document. We formalize three graph-dynamic subsystems (stress propagation via betweenness centrality, temporal edge-weight decay, and Dijkstra escalation routing) that govern organizational behavior independently of any LLM. Running a configurable N-day simulation, OrgForge produces interleaved Slack threads, JIRA tickets, Confluence pages, Git pull requests, and emails, all traceable to a shared, immutable event log. We additionally describe a causal chain tracking subsystem that accumulates cross-artifact evidence graphs per incident, a hybrid reciprocal-rank-fusion recurrence detector for identifying repeated failure classes, and an inbound/outbound email engine that routes vendor alerts, customer complaints, and HR correspondence through gated causal chains with probabilistic drop simulation. OrgForge is available under the MIT license.
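
Two of the graph-dynamic subsystems named in the abstract, temporal edge-weight decay and Dijkstra escalation routing, can be sketched together. The abstract does not give the decay function or the cost model, so the exponential half-life decay, the `1 / strength` edge cost, and the names `decayed_strength` and `escalation_path` are all assumptions made for illustration. They are not taken from OrgForge.

```python
import heapq
import math
from typing import Dict, List, Optional, Tuple

Graph = Dict[str, List[Tuple[str, float]]]  # node -> [(neighbor, tie strength)]


def decayed_strength(base: float, days_since_contact: float,
                     half_life_days: float = 14.0) -> float:
    """Assumed decay model: tie strength halves every `half_life_days`."""
    return base * 0.5 ** (days_since_contact / half_life_days)


def escalation_path(graph: Graph, source: str, target: str) -> Optional[List[str]]:
    """Dijkstra over costs of 1/strength, so strong ties are cheap to traverse."""
    dist: Dict[str, float] = {source: 0.0}
    prev: Dict[str, str] = {}
    heap: List[Tuple[float, str]] = [(0.0, source)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry
        done.add(node)
        if node == target:
            break
        for neighbor, strength in graph.get(node, []):
            if strength <= 0:
                continue  # a fully decayed tie cannot carry an escalation
            candidate = d + 1.0 / strength
            if candidate < dist.get(neighbor, math.inf):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# The engineer spoke to the lead today but last contacted the director 42 days ago.
org: Graph = {
    "engineer": [("lead", decayed_strength(1.0, 0)),
                 ("director", decayed_strength(1.0, 42))],
    "lead": [("director", decayed_strength(1.0, 0))],
}
route = escalation_path(org, "engineer", "director")
```

Under these assumptions the stale direct tie has strength 0.125 (cost 8), so the escalation goes through the lead (total cost 2) rather than straight to the director. This shows how decay can reshape routing without any LLM involvement.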