[P] Cold Validation: Open-source system where one AI agent audits another with zero shared context

Reddit r/MachineLearning / 3/25/2026


Key Points

The article announces an open-source “cold validation” architecture for independently verifying AI agent outputs using strict separation between a Builder agent and a Reviewer agent.
It uses phase-gated orchestration where the Builder (Claude Code) generates plans and code, while the Reviewer (Codex CLI) audits only produced artifacts without access to the Builder’s reasoning.
The Reviewer runs in a filesystem-isolated environment (e.g., a temporary directory) to prevent access to the broader repository context, reducing leakage and bias.
It tracks findings across validation rounds using durable fingerprints, and a controller independently reconciles the Reviewer’s verdicts against blocking findings. The project ships with 35 mechanical tests.
The system is released under Apache 2.0, with links provided to the GitHub repository and a deeper technical write-up.
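
The filesystem isolation in the key points can be illustrated with a minimal Python sketch. This is not the project's code: the function name `stage_artifacts`, the `cold-review-` prefix, and the idea of passing an explicit artifact list are assumptions made for illustration. It copies only the named artifacts into a fresh temporary directory, so a reviewer process started there has nothing else from the repository to read.

```python
import shutil
import tempfile
from pathlib import Path


def stage_artifacts(repo_root: Path, artifact_paths: list[str]) -> Path:
    """Copy only the named artifacts into a fresh temp dir and return its path.

    Hypothetical helper: the name and signature are illustrative assumptions.
    """
    workspace = Path(tempfile.mkdtemp(prefix="cold-review-"))
    root = repo_root.resolve()
    for rel in artifact_paths:
        src = (root / rel).resolve()
        # Refuse paths that escape the repository, such as "../secrets.txt".
        if not src.is_relative_to(root):
            raise ValueError(f"artifact escapes repo root: {rel}")
        # Rebuild the same relative layout inside the workspace.
        dest = workspace / src.relative_to(root)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
    return workspace
```

A temporary directory on its own only limits what is conveniently reachable. Actually preventing the reviewer from reading outside it would need OS-level restrictions (a container or sandbox), which this sketch does not attempt.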

We released an open-source architecture for independent AI agent verification. The core idea: the agent that built something should never review it.

Cold validation uses two agents with strict separation:

- Builder (Claude Code) produces plans and code
- Reviewer (Codex CLI) audits only artifacts and never sees reasoning
- An orchestrator enforces phase gates and convergence

The reviewer runs filesystem-isolated (temp dir, no repo access). Findings are tracked with durable fingerprints across rounds. The controller independently reconciles verdicts against blocking findings.

Apache 2.0. 35 mechanical tests.
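
To make the fingerprint tracking and verdict reconciliation more concrete, here is a rough Python sketch. It is an illustration under stated assumptions, not the repository's implementation: the `Finding` fields, the choice to hash artifact, rule, and anchor (and to leave out the free-text summary), and the `"pass"`/`"fail"` verdict strings are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    artifact: str   # path of the audited artifact
    rule: str       # hypothetical category label, e.g. "missing-input-validation"
    anchor: str     # hypothetical stable location, e.g. a function or step name
    summary: str    # reviewer's free-text description (may be reworded each round)
    blocking: bool

    def fingerprint(self) -> str:
        # Hash only the stable fields, so a reworded summary keeps its identity.
        key = "\x1f".join((self.artifact, self.rule, self.anchor))
        return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def diff_rounds(previous: list[Finding], current: list[Finding]) -> dict[str, set[str]]:
    """Classify fingerprints as new, persisting, or resolved between two rounds."""
    prev = {f.fingerprint() for f in previous}
    curr = {f.fingerprint() for f in current}
    return {"new": curr - prev, "persisting": curr & prev, "resolved": prev - curr}


def reconcile(reviewer_verdict: str, current: list[Finding]) -> str:
    """Controller check: a 'pass' is overridden while any blocking finding is open."""
    if any(f.blocking for f in current):
        return "fail"
    return reviewer_verdict
```

The point of the second function is that the reviewer's stated verdict is treated as a claim to be checked against the findings list, not as the final word. In this sketch a "pass" accompanied by an open blocking finding is recorded as a failure.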

GitHub: https://github.com/raxe-ai/cold-validation-architecture

Deep dive: https://raxe.ai/labs/cold-validation

submitted by /u/cyberamyntas
