Last week, researchers at Ox published findings showing that the MCP STDIO transport lets arbitrary command execution slip through unchecked, and that 9 of the 11 MCP marketplaces they tested were poisonable. Anthropic's response: STDIO is out of scope for protocol-level fixes; the ecosystem is responsible for operational trust.
Fair — Anthropic donated MCP to the Linux Foundation's Agentic AI Foundation in December 2025 specifically so independent infrastructure could grow around it. But that leaves a real gap for anyone running Claude Code today: how do you know whether an MCP server you're about to invoke is trustworthy?
Anthropic's official registry is pure metadata (license, commit count, popularity). mcp-scorecard.ai scores repos, not behavior. BlueRock runs OWASP-style static scans. None of these ask the one question that actually matters:
**Does this MCP server actually work when called?**
So I built a small thing to answer it.
## The hook
A zero-config Claude Code hook that does two things on every MCP tool call:
- Before the call — queries a public trust API for that server. If the score is low, Claude shows an inline warning:

  ```
  ⚠ XAIP: "some-server" trust=0.32 (caution, 87 receipts) Risk: high_error_rate
  ```

- After the call — emits an Ed25519-signed receipt (success, latency, hashed input/output) to a public aggregator that updates the score.
Install:

```
npm install -g xaip-claude-hook
xaip-claude-hook install
```
The next MCP call fires the hook. That's the whole UX.
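Under the hood, the installer presumably registers itself in Claude Code's hooks configuration. The sketch below shows roughly that shape; the matcher pattern and the `pre`/`post` subcommands are my guesses, not the package's actual output:

```jsonc
// Illustrative only: guessed subcommands and matcher, not the real installer output.
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "mcp__.*",   // MCP tools in Claude Code are named mcp__<server>__<tool>
        "hooks": [{ "type": "command", "command": "xaip-claude-hook pre" }]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "mcp__.*",
        "hooks": [{ "type": "command", "command": "xaip-claude-hook post" }]
      }
    ]
  }
}
```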
## What a receipt looks like
No raw content leaves your machine — only hashes.
```json
{
  "agentDid": "did:web:context7",
  "callerDid": "did:key:a1c6cd34…",
  "toolName": "resolve-library-id",
  "taskHash": "9f3e…",        // sha256(input).slice(0,16)
  "resultHash": "1b78…",      // sha256(response).slice(0,16)
  "success": true,
  "latencyMs": 668,
  "failureType": "",
  "timestamp": "2026-04-17T04:24:59.925Z",
  "signature": "...",         // Ed25519 over canonical JSON (agent key)
  "callerSignature": "..."    // Ed25519 over canonical JSON (caller key)
}
```
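As a minimal sketch of how a receipt like this could be hashed and signed, using Node's built-in Ed25519 support (the field subset and canonicalization here are simplified; the package's real internals may differ):

```typescript
import { createHash, generateKeyPairSync, sign, verify } from "node:crypto";

// Truncated content hash, as in taskHash / resultHash above:
// sha256(content) as hex, first 16 chars. No raw content is kept.
function shortHash(content: string): string {
  return createHash("sha256").update(content).digest("hex").slice(0, 16);
}

// Canonical JSON: serialize with sorted keys so signer and verifier
// hash identical bytes regardless of property insertion order.
// (Replacer-array trick works for flat objects like this one.)
function canonicalize(obj: Record<string, unknown>): string {
  return JSON.stringify(obj, Object.keys(obj).sort());
}

const { publicKey, privateKey } = generateKeyPairSync("ed25519");

const receipt = {
  toolName: "resolve-library-id",
  taskHash: shortHash('{"libraryName":"react"}'),
  success: true,
  latencyMs: 668,
};

// Ed25519 in Node takes a null digest algorithm.
const signature = sign(null, Buffer.from(canonicalize(receipt)), privateKey);
const valid = verify(null, Buffer.from(canonicalize(receipt)), publicKey, signature);
```

Tampering with any field after signing makes `verify` return false, which is what lets an aggregator reject forged receipts without ever seeing the raw input or output.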
The aggregator rejects anything that fails signature verification. The trust API computes a Bayesian score across all verified receipts per server, weighted by caller diversity — so one enthusiastic installer can't fake a reputation.
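Here's roughly what that kind of scoring could look like. This is my reconstruction of the idea, not XAIP's actual formula; the prior, the thresholds, and the diversity weighting are all assumptions:

```typescript
type Receipt = { callerDid: string; success: boolean };

// Beta(1,1) prior: posterior mean (s + 1) / (n + 2) pulls thin evidence
// toward 0.5 instead of letting three lucky calls look like a 1.0 server.
// All constants below are illustrative guesses, not XAIP's real values.
function trustScore(receipts: Receipt[]): { score: number; verdict: string } {
  const n = receipts.length;
  if (n < 10) return { score: 0.5, verdict: "insufficient_data" };

  const successes = receipts.filter((r) => r.success).length;
  const base = (successes + 1) / (n + 2);

  // Caller-diversity weight: reputation built by one caller is capped.
  const callers = new Set(receipts.map((r) => r.callerDid)).size;
  const diversity = Math.min(1, callers / 5); // saturates at 5 distinct callers
  const score = base * (0.6 + 0.4 * diversity);

  const verdict = score >= 0.7 ? "trusted" : score >= 0.5 ? "caution" : "low_trust";
  return { score, verdict };
}
```

With these made-up constants, 20 all-success receipts from five distinct callers score about 0.95 (trusted), while the same 20 receipts from a single caller score about 0.65 (caution): volume alone can't buy a high score.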
## What the scores actually look like right now
To be transparent: the dataset is small. Here's what a curl against the live trust API returns today:
| Server | Trust | Verdict | Receipts | Flag |
|---|---|---|---|---|
| memory | 0.800 | trusted | 112 | — |
| git | 0.775 | trusted | 35 | — |
| sqlite | 0.753 | trusted | 42 | — |
| puppeteer | 0.671 | caution | 32 | high_error_rate |
| context7 | 0.618 | caution | 560 | low_caller_diversity |
| filesystem | 0.579 | caution | 610 | low_caller_diversity |
| playwright | 0.394 | low_trust | 37 | high_error_rate |
| fetch | 0.365 | low_trust | 36 | high_error_rate |
Verify any of these yourself:

```
curl https://xaip-trust-api.kuma-github.workers.dev/v1/trust/context7
```
The `low_caller_diversity` flag on high-volume servers is the single most honest number in that table. It means: I'm the biggest caller right now, and that's exactly the problem this tool is supposed to solve. The flag only clears when independent installers start generating receipts — which is what the npm package is for.
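A flag like that can be computed from caller share alone. A sketch, where the 50% threshold is my assumption rather than XAIP's published cutoff:

```typescript
// Hypothetical rule: flag a server when one caller dominates its receipts.
// The 0.5 threshold is an illustrative guess.
function lowCallerDiversity(callerDids: string[]): boolean {
  const counts = new Map<string, number>();
  for (const did of callerDids) counts.set(did, (counts.get(did) ?? 0) + 1);
  const topShare = Math.max(...counts.values()) / callerDids.length;
  return topShare > 0.5;
}
```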
## Why this is architecturally different from existing approaches
Every other "MCP trust" project I've seen scores the repository:
- Commit frequency, license, stars, contributor count (mcp-scorecard.ai)
- Static source-code vulnerability scans (BlueRock)
- Registry inclusion as implicit trust (official MCP registry)
These are useful proxies, but none of them tell you whether a server works in practice. A well-maintained repo can have a buggy release; a single-author repo can be rock solid; a newly-forked malicious repo looks identical to the original under static scan.
XAIP scores observed behavior. Every call is a signed attestation. The scoring is Bayesian, so:
- Servers with few receipts get `insufficient_data` — no verdict, no warning
- High-variance patterns (mixed success/failure) get lower confidence
- The `high_error_rate` flag is computed from real response content, classifying `quota exceeded`, `rate limit`, `unauthorized`, and `"isError": true` as failures
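The classifier behind that flag can be sketched as a pattern check over the response text. The patterns below are taken from the failure classes just listed; the real heuristic is likely more involved:

```typescript
// Failure classes from the post: quota exceeded, rate limit, unauthorized,
// plus MCP's structured "isError": true. The pattern list is illustrative.
const FAILURE_PATTERNS: RegExp[] = [
  /quota exceeded/i,
  /rate limit/i,
  /unauthorized/i,
  /"isError":\s*true/,
];

function inferSuccess(responseText: string): boolean {
  return !FAILURE_PATTERNS.some((p) => p.test(responseText));
}
```

Note the failure mode this implies: a legitimate "unauthorized" string inside fetched page content would count as a failure, which is exactly the over-counting risk discussed under limitations.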
This is the same philosophy as OpenSSF Scorecard vs. runtime attestation in supply-chain security: you want both, but only one of them catches regressions in production.
## What's missing / where this could go wrong
I want to be specific about limitations, because "AI trust protocol" posts tend to overpromise:
- ~10 servers, ~1500 receipts total. Small. This post is partly an ask for installers to fix that.
- One aggregator node. Byzantine fault tolerance requires quorum; right now there's one Cloudflare Worker. Quorum needs multiple operators, which is the next milestone.
- Client-side `inferSuccess` is heuristic. We look at response text for error patterns. False positives and negatives are possible — `fetch`'s 36% error rate might be over-counted (legit 404s shouldn't hurt the server's score) or real.
- Privacy model relies on hashes, not ZK. Inputs and outputs are hashed before transmission, but statistical correlation across `taskHash`es is possible in principle. Migration to ZK receipt aggregation is a future idea, not a current feature.
- I personally generated most of the high-volume receipts. The `low_caller_diversity` flag you see on context7 and filesystem is me.
## Running it yourself
```
npm install -g xaip-claude-hook
xaip-claude-hook install
xaip-claude-hook status
```
Open a new Claude Code session. Call any MCP tool. Check:
```
cat ~/.xaip/hook.log
```
You'll see lines like:
```
2026-04-17T04:24:59Z POST context7/resolve-library-id ok=true lat=668ms → 200
```
And the next time you (or Claude) invoke a low-trust server, the warning shows up inline.
Uninstall is a single command. Keys under `~/.xaip/` persist; delete them manually to wipe.
## Links
- npm: https://www.npmjs.com/package/xaip-claude-hook — `npm install -g xaip-claude-hook`
- Repo: https://github.com/xkumakichi/xaip-protocol
- Hook source: https://github.com/xkumakichi/xaip-protocol/tree/main/clients/claude-code-hook
- Live Trust API: https://xaip-trust-api.kuma-github.workers.dev/v1/trust/context7
- Aggregator: https://xaip-aggregator.kuma-github.workers.dev
Issues, scoring bugs, angry takes — all welcome on GitHub. If you maintain an MCP server and your score looks wrong, I want to hear about it first.