Pay Only For The Work That Actually Ships.

Justifying an AI coding agent's ROI to finance has always been the hard part. Cognition's new Devin guarantee — refund the gap in usage credits, up to $10M per contract — quietly rewrites the enterprise procurement conversation.

AI Navigate Editorial · 2026.07.04 · 6 min read

The Problem

"Will this pay for itself?" — no one could prove it on paper.

Over the last two years, AI coding agents have breezed through proof-of-concept budgets. But scaling from a pilot to a company-wide rollout hits the same wall every time: nobody can convince procurement that the dollars going in will produce a proportional amount of engineering value coming out.

Traditional software is easy — seats × price, capped losses, done. AI agents behave differently. They spin up autonomous work, they consume compute as a variable cost, and the "value produced" is notoriously hard to convert into dollars after the fact. CFOs ask, "For that $10M investment, how many dollars of engineering work were saved?" — and teams answer with proxy metrics: PRs opened, review hours reduced, lint fixes automated.

The problem with proxies is that they are soft. There is no industry-standard way to measure "value uniquely attributable to the agent" at a level of rigor that survives finance scrutiny. The bigger the buyer, the more cautious it gets, and the default commitment shrinks to three engineers, three months, one repo. The result: the very organizations that could benefit most from a fleet of Devin agents were the ones most stuck in permanent pilot mode.

What Changed

Cognition eats the gap.

Cognition returns the gap between contract value and measured engineering value as Devin usage credits.

Refund cap (credits)	$10M
Eligible segment	Enterprise
Evaluation window	Contract term
Refund form (not cash)	Credits

Note the shape of the promise: the refund is in usage credits, not cash. Cognition's real exposure is next year's inference cost, not lost MRR. Even so, publicly setting the cap at $10M in credits per contract is a deliberate signal: a declaration to enterprise procurement that Cognition is willing to stand behind Devin's output at a size finance actually notices.

The Mechanism

How the "value" gets measured.

The mechanic collapses into four steps: deploy, measure, compute the gap, refund in credits. Each step is designed to be legible to both Cognition and the customer's finance team.

Deploy — set the baseline

At contract signing, both sides agree on which teams, which repositories, and what counts as "engineering value" (merged PRs, tests added, refactor volume). The measurement surface is fixed up front — no post-hoc goalpost shifts.

Measure — track the work

Throughout the contract term, Devin's contributions are tracked automatically. Autonomous PRs are distinguished from human PRs, follow-up commits and review outcomes are folded in, and the number reflects net contribution, not gross activity.

Gap — do the math

Contract value is compared to measured value, both in dollars. If measured value falls short, the difference (the "gap") is locked in as the refundable amount. Only now does the phrase "you overpaid by X" have a defensible number attached.

Credit — return it

The gap is returned as Devin usage credits, applied against the next contract year's consumption. The cap is $10M per contract. Cash never leaves Cognition, but the customer keeps buying agent time on Cognition's dime until the gap is closed.
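
The four steps reduce to a small amount of arithmetic. The sketch below is illustrative only. The article does not say how Cognition converts tracked work into dollars, so `measured_value_usd` is treated as an already agreed input, and the names `Contract`, `value_gap`, `credit_refund`, and `apply_credits` are hypothetical. Only the $10M per-contract cap and the rule that credits offset next year's consumption come from the text above.

```python
from dataclasses import dataclass

# Per-contract refund cap stated in the article (paid in credits, not cash).
REFUND_CAP_USD = 10_000_000


@dataclass(frozen=True)
class Contract:
    """Hypothetical record of one enterprise contract, in whole dollars."""
    contract_value_usd: int
    # Assumption: both sides have already agreed on this dollar figure.
    measured_value_usd: int


def value_gap(contract: Contract) -> int:
    """Shortfall of measured value against contract value; zero if none."""
    return max(0, contract.contract_value_usd - contract.measured_value_usd)


def credit_refund(contract: Contract, cap: int = REFUND_CAP_USD) -> int:
    """Credits owed: the gap, limited by the per-contract cap."""
    return min(value_gap(contract), cap)


def apply_credits(credits: int, next_year_usage_usd: int) -> tuple[int, int]:
    """Offset next year's consumption; return (amount billed, credits left)."""
    used = min(credits, next_year_usage_usd)
    return next_year_usage_usd - used, credits - used


if __name__ == "__main__":
    small = Contract(contract_value_usd=8_000_000, measured_value_usd=6_500_000)
    large = Contract(contract_value_usd=25_000_000, measured_value_usd=12_000_000)
    print(credit_refund(small))  # 1500000 (gap is under the cap)
    print(credit_refund(large))  # 10000000 (13M gap, capped at 10M)
    print(apply_credits(credit_refund(small), 4_000_000))  # (2500000, 0)
```

Whether leftover credits carry past the next contract year is not stated in the article, so `apply_credits` simply reports the remainder instead of assuming a rollover policy.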

Who Feels It

Who this guarantee actually helps.

Enterprise procurement

The biggest winner. On an eight-figure request, one sentence about credit-backed downside protection cuts weeks of back-and-forth with finance. The CFO deck writes itself.

Mid-market buyers

Contracts rarely reach the $10M ceiling, so the cap itself is theoretical. But the signal — that Cognition is a vendor that publicly guarantees value — pulls forward vendor-selection decisions that were leaning "wait and see."

Solo devs and small teams

Out of scope. The guarantee is Enterprise-only; usage-based plans continue as before. Worth watching whether the tightening focus on enterprise leaves smaller customers behind on new features and support tiers.

Old vs New

The direction of risk just flipped.

Legacy AI adoption	Devin value guarantee
Pay for what you consume — the customer eats the risk	Under-deliver and the vendor eats the gap in credits
Start with a 3-person, 3-month pilot	Sign the enterprise-wide contract from day one
Justify ROI with proxy metrics (PR counts, hours saved)	Justify ROI with a measured value figure in dollars
Unused seats are sunk cost	Under-delivery becomes usage credit for next year

Under-deliver, and the credits come back.

Frontier

What "outcome-based pricing" means for the agent industry.

This is one of the earliest genuine outcome-based pricing structures in the AI agent industry. A market conditioned on SaaS seat licenses and usage meters is being nudged toward a very different premise: charge for value produced, not compute consumed.

It is a bold posture for Cognition. If Devin under-delivers, a slice of revenue silently rolls into next year as credit obligation. But turn the argument around: the fact that Cognition can offer such a cap at all is a message to competitors — "we are confident enough in Devin's output to put our own margins at risk." That kind of signal reshapes vendor selection, whether or not any refund is ever paid out.

Whether this becomes an industry norm depends on the next few quarters. If even a handful of Fortune 500 buyers start writing "value guarantee" into their vendor RFPs, agent pricing will drift toward outcome-based structures over the next three to five years. The word procurement uses will slowly change — from "seats" to "value delivered." A quiet tectonic shift, one guarantee at a time.