CORE Closed Its Audit Trail. Then Found 18 Engine Gaps It Couldn't See Before.

Dev.to / 5/2/2026

Key Points

CORE previously lacked its “second log” (the auditable causal trail explaining why changes happened), which made the system difficult to defend in regulated settings.
Closing Band B meant resolving eight issues, backed by four ADRs and seven coordinated write-path decisions, so that every link in the causality chain (finding → proposal → execution → new findings) was complete.
After Band B closed, Band D opened with 18 new engine integrity gaps because attribution fidelity can’t be measured until attribution exists.
The piece emphasizes that in GxP-regulated environments, proof matters more than capability: systems must demonstrate authorized intent, boundaries, and a complete audit trail to be defensible.

Six weeks ago I published a post here titled "Your Agent Has Two Logs. One of Them Doesn't Exist Yet."

This week, Band B closed. CORE's second log exists.

Here's what that actually means, and why closing Band B made things harder right away.

The two-log problem, briefly

Every autonomous system that touches production code has two logs whether it admits it or not.

Log one: what happened. Files changed, tests ran, commits landed.

Log two: why it happened. What finding triggered what proposal. What approval authorized what execution. What execution caused what file change. What file change produced what new finding.

Log two is the audit trail. In a regulated environment, log two isn't optional — it's the difference between a system you can defend and one you can't.
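To make the idea concrete, here is a minimal sketch of what a log-two entry could look like and how it might be queried. This is an illustration, not CORE's actual schema: the `CausalRecord` type, its field names, and the `explain` helper are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CausalRecord:
    """One hypothetical log-two entry: an event plus the event that caused it."""
    record_id: str
    kind: str                 # e.g. "finding", "proposal", "approval", "execution", "file_change"
    caused_by: Optional[str]  # record_id of the cause; None for a root finding


def explain(record_id: str, log_two: dict[str, CausalRecord]) -> list[str]:
    """Walk backwards from a record to its root cause, returning the kinds seen."""
    trail: list[str] = []
    seen: set[str] = set()
    current: Optional[str] = record_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in causal trail at {current}")
        seen.add(current)
        record = log_two.get(current)
        if record is None:
            # A dangling reference means the "why" cannot be reconstructed.
            raise KeyError(f"broken trail: no record for {current}")
        trail.append(record.kind)
        current = record.caused_by
    return trail


# Illustrative data only.
records = [
    CausalRecord("f1", "finding", None),
    CausalRecord("p1", "proposal", "f1"),
    CausalRecord("a1", "approval", "p1"),
    CausalRecord("e1", "execution", "a1"),
    CausalRecord("c1", "file_change", "e1"),
]
log_two = {r.record_id: r for r in records}

print(explain("c1", log_two))
# ['file_change', 'execution', 'approval', 'proposal', 'finding']
```

Log one would only tell you that `c1` happened. The backward walk is what answers "why", and it fails loudly the moment a referenced record is absent.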

CORE had log one. Log two was missing.

What Band B actually required

Eight issues. Four ADRs (architecture decision records). Seven coordinated write-path decisions: where in the code attribution gets written, in what shape, and guaranteed by what gate.

The hard part wasn't the code. It was making the causality chain complete. Every link had to be present:

Finding → which proposal claimed it (and when)
Proposal → which execution consumed it (and what commit resulted)
Execution → which new findings it produced

Miss one link and the chain is decoration, not evidence.
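The three links above can be sketched as a completeness check. This is a hedged illustration under assumed names: `Finding`, `Proposal`, `Execution`, their fields, and `first_missing_link` are hypothetical and do not come from CORE's codebase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    finding_id: str
    claimed_by: Optional[str] = None   # proposal id that claimed this finding
    claimed_at: Optional[str] = None   # when the claim happened (ISO 8601 string)


@dataclass
class Proposal:
    proposal_id: str
    consumed_by: Optional[str] = None  # execution id that consumed this proposal


@dataclass
class Execution:
    execution_id: str
    commit: Optional[str] = None       # commit that resulted from the execution
    # None means "never recorded"; an empty list means "recorded, none produced".
    produced_findings: Optional[list[str]] = None


def first_missing_link(
    finding: Finding,
    proposals: dict[str, Proposal],
    executions: dict[str, Execution],
) -> Optional[str]:
    """Return the first broken link in the chain, or None if the chain is complete."""
    if finding.claimed_by is None or finding.claimed_at is None:
        return "finding -> proposal"
    proposal = proposals.get(finding.claimed_by)
    if proposal is None:
        return "finding -> proposal"
    if proposal.consumed_by is None:
        return "proposal -> execution"
    execution = executions.get(proposal.consumed_by)
    if execution is None or execution.commit is None:
        return "proposal -> execution"
    if execution.produced_findings is None:
        return "execution -> new findings"
    return None


# Illustrative data only.
proposals = {"p1": Proposal("p1", consumed_by="e1")}
executions = {"e1": Execution("e1", commit="abc123", produced_findings=[])}
finding = Finding("f1", claimed_by="p1", claimed_at="2026-04-30T12:00:00Z")

print(first_missing_link(finding, proposals, executions))  # None

executions["e1"].produced_findings = None
print(first_missing_link(finding, proposals, executions))  # execution -> new findings
```

The distinction between `None` and an empty list is a deliberate choice in this sketch: an execution that produced no new findings is still a complete record, while one whose outcome was never written is a gap.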

196 commits in April. 25 issues closed. Band B: 8 closed, 0 open.

What happened immediately after

Band D opened with 18 issues.

Not because we introduced regressions. Because closing Band B made the engine's integrity gaps visible in a way they weren't before. You can't measure attribution fidelity until attribution exists. Once it does, you can see exactly where the engine fails to populate it correctly.
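One way to picture "attribution fidelity" is as the share of recorded changes whose attribution fields are all populated. The sketch below assumes that definition, along with hypothetical field names (`finding_id`, `proposal_id`, `execution_id`) and made-up sample data; CORE may define and measure it differently.

```python
from collections import Counter

# Hypothetical attribution fields every recorded change is expected to carry.
REQUIRED_FIELDS = ("finding_id", "proposal_id", "execution_id")


def attribution_fidelity(changes: list[dict]) -> float:
    """Fraction of changes with every required attribution field populated."""
    if not changes:
        # With no attribution records at all, there is nothing to measure.
        raise ValueError("no recorded changes to measure")
    complete = sum(
        1 for change in changes
        if all(change.get(name) for name in REQUIRED_FIELDS)
    )
    return complete / len(changes)


def gaps_by_field(changes: list[dict]) -> Counter:
    """Count how often each required field is missing or empty."""
    return Counter(
        name
        for change in changes
        for name in REQUIRED_FIELDS
        if not change.get(name)
    )


# Illustrative data only.
changes = [
    {"path": "a.py", "finding_id": "f1", "proposal_id": "p1", "execution_id": "e1"},
    {"path": "b.py", "finding_id": "f2", "proposal_id": "p2", "execution_id": "e2"},
    {"path": "c.py", "finding_id": "f3", "proposal_id": None, "execution_id": "e3"},
    {"path": "d.py", "finding_id": "f4", "proposal_id": "p4", "execution_id": "e4"},
]

print(attribution_fidelity(changes))  # 0.75
print(dict(gaps_by_field(changes)))   # {'proposal_id': 1}
```

Before the attribution fields exist, neither function has anything to count. Once they do, each unpopulated field becomes a visible, countable gap.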

This is the convergence principle working as designed. The system gets more capable. It immediately finds more problems with itself. The audit PASS holds: 19 active workers, and the open findings are warnings about modularity, not governance failures. But the work queue doesn't shrink when a band closes. It shifts.

What "GxP-load-bearing" means in practice

I've been building CORE in part for environments like pharmaceutical manufacturing — where an AI system that modifies code or configuration needs to prove it acted within authorized boundaries, on authorized intent, with a complete audit trail.

GxP (Good Practice regulations) doesn't care what your system can do. It cares what your system can prove it did.

Band B is the difference between CORE being a capable tool and CORE being a defensible tool. The second log is what makes it defensible.

What's next

Band D: engine integrity. 18 open issues. The system that now has a complete audit trail needs its engine tightened before those traces are fully trustworthy.

Then Band E: external validation. CORE governing a repository it didn't build.

The second log exists. Now we make sure everything it records is true.

CORE is open source: github.com/DariuszNewecki/CORE

Previous in this series: "Your Agent Has Two Logs. One of Them Doesn't Exist Yet."