Show Dev: Here's how we made AI 2x faster at integrating APIs

Dev.to / 4/3/2026

💬 Opinion · Tools & Practical Usage

Key Points

  • An internal experiment found that when Cursor was tasked with integrating the PayPal API, AI-generated attempts often relied on deprecated SDKs/docs and never used the current official PayPal Server SDK.
  • The article argues that relying on web search and model memory alone causes frequent API-integration mistakes because API docs and SDKs change and the AI must infer many implementation details.
  • To address this, the team built “Context Plugins” that take an OpenAPI spec and generate up-to-date SDKs plus an MCP server exposing structured, LLM-optimized API integration context.
  • The proposed workflow involves providers uploading OpenAPI specs to APIMatic, generating multi-language SDKs and an MCP server, and installing it in IDE coding assistants so they can query the correct auth flows, interfaces, and patterns.
  • The key insight is that OpenAPI specs alone don’t provide enough context for AI agents; supplying SDK context reduces the inference chain needed to produce working integrations.

We ran an experiment with our team:

Each of us asked Cursor to integrate the PayPal API into an e-commerce app multiple times.

Here are the results across all of our attempts:

  • 13% of the attempts pulled in a deprecated PayPal SDK
  • 87% of the attempts generated API calls based on deprecated PayPal documentation
  • 0% of the attempts used the current, official PayPal Server SDK

Here's the interesting part: PayPal provides official API docs and SDKs for its APIs. The AI just never used them. Instead, it cobbled together code from blog posts, Stack Overflow answers, and stale training data (which is abundant, since PayPal is such a well-known API).

This isn't just a PayPal problem. There are millions of APIs out there. Their docs change. SDKs evolve. New API versions come out.

When an AI assistant tries to integrate an API using web search or model memory alone, mistakes are almost inevitable. Developers end up spending more time debugging AI output than they saved by using AI in the first place.

How we solved this problem

We built Context Plugins: given an OpenAPI spec, we generate SDKs and an MCP server that exposes structured API context to AI coding assistants.

This gives tools like Cursor access to comprehensive, up-to-date API context (including SDK documentation and API integration patterns), instead of relying on outdated training data or code scraped from GitHub.
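To make the input side concrete, here is a toy sketch (not APIMatic's actual pipeline; the two-endpoint spec and all names in it are hypothetical) of the kind of information a generator has to pull out of an OpenAPI spec before it can emit an SDK and the matching MCP context: the security schemes and the list of operations.

```python
import json

# A hypothetical, heavily trimmed OpenAPI spec, for illustration only.
SPEC = json.loads("""
{
  "openapi": "3.0.0",
  "info": {"title": "Toy Payments API", "version": "2.0.0"},
  "components": {"securitySchemes": {"oauth2": {"type": "oauth2"}}},
  "paths": {
    "/orders": {"post": {"operationId": "createOrder"}},
    "/orders/{id}/capture": {"post": {"operationId": "captureOrder"}}
  }
}
""")

def extract_generator_inputs(spec: dict) -> dict:
    """Collect the pieces an SDK/MCP generator needs: auth schemes and operations."""
    auth = list(spec.get("components", {}).get("securitySchemes", {}))
    operations = [
        (method.upper(), path, op.get("operationId"))
        for path, methods in spec.get("paths", {}).items()
        for method, op in methods.items()
    ]
    return {"api_version": spec["info"]["version"], "auth": auth, "operations": operations}

print(extract_generator_inputs(SPEC))
```

Note that even this tiny spec already leaves auth token acquisition, pagination, and error handling unstated — which is exactly the gap the generated SDK context fills.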

Here's how it works:

  1. An API provider uploads their OpenAPI spec to APIMatic
  2. We generate high-quality SDKs in multiple programming languages
  3. We generate an MCP server with tools and prompts that expose language-specific SDK context, optimized for LLMs
  4. Developers install the MCP server in their IDE (Cursor, Claude Code, or GitHub Copilot)
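Step 4 typically amounts to a one-file config change. As an illustration, a Cursor-style `.cursor/mcp.json` entry might look like the following (the server name and launch command here are placeholders, not APIMatic's actual distribution):

```json
{
  "mcpServers": {
    "paypal-context": {
      "command": "npx",
      "args": ["-y", "example-context-plugin-server"]
    }
  }
}
```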

When a developer asks to integrate an API, the coding assistant queries the MCP server, retrieves the required context (auth flows, integration patterns, latest SDK version, SDK interfaces), and generates code using the official SDK — not guesswork.
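The retrieval step can be pictured with a toy stand-in for a single MCP tool (illustrative Python, not the real MCP SDK or APIMatic's tool surface; the catalog entries, version number, and method signature below are all hypothetical): the assistant asks for context by API and language, and gets back structured facts instead of guessing them.

```python
# Hypothetical catalog of SDK context an MCP server might serve to an assistant.
SDK_CONTEXT = {
    ("paypal", "csharp"): {
        "latest_sdk_version": "2.0.0",  # placeholder version
        "auth_flow": "OAuth 2.0 client credentials",
        "interfaces": {
            # placeholder signature, not the real PayPal Server SDK surface
            "capture_order": "OrdersController.CaptureOrderAsync(id)",
        },
        "patterns": ["initialize one client per app", "use sandbox credentials in dev"],
    },
}

def get_sdk_context(api: str, language: str) -> dict:
    """Toy MCP-tool body: return structured, current SDK context for an API."""
    key = (api.lower(), language.lower())
    if key not in SDK_CONTEXT:
        raise KeyError(f"no context plugin published for {api}/{language}")
    return SDK_CONTEXT[key]

ctx = get_sdk_context("PayPal", "csharp")
print(ctx["latest_sdk_version"], ctx["auth_flow"])
```

The point of the shape: everything the agent previously had to infer (version, auth flow, method names) arrives as data it can quote, not prose it has to interpret.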

Here is a key insight we picked up along the way:

OpenAPI specs or API Reference docs alone aren't enough context for AI agents. They describe endpoints/operations and schemas, but the AI still has to infer how to handle authentication, pagination, error handling, and then translate all of that into working code. That's a long chain of inference, and every extra step is another place things could go wrong.

SDKs and SDK context cut that chain short — much of the complexity is already wrapped in the library, so the model figures out which method to call and how to wire it up, instead of writing the entire integration from scratch.
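The two inference chains can be sketched side by side (everything here is hypothetical — the host, endpoint shape, and client class are made up, and a fake transport stands in for the network): without an SDK the model must assemble the URL, auth header, HTTP method, and error handling itself; with an SDK-shaped client, the integration collapses to picking the right method.

```python
import json
from urllib.request import Request

BASE_URL = "https://api.example.com"  # placeholder host

def capture_order_raw(order_id: str, token: str, send) -> dict:
    # Without an SDK, the model must infer every step below: URL shape,
    # auth header format, HTTP method, and error handling.
    req = Request(
        f"{BASE_URL}/v2/orders/{order_id}/capture",
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )
    status, body = send(req)
    if status >= 400:
        raise RuntimeError(f"capture failed with HTTP {status}")
    return json.loads(body)

class OrdersClient:
    """SDK-shaped stand-in: auth, URLs, and errors live inside the library."""
    def __init__(self, token: str, send):
        self._token, self._send = token, send

    def capture_order(self, order_id: str) -> dict:
        # With an SDK, the model only has to choose this one method.
        return capture_order_raw(order_id, self._token, self._send)

# Fake transport so the sketch runs offline.
def fake_send(req):
    return 200, '{"id": "ORD-1", "status": "COMPLETED"}'

print(OrdersClient("tok", fake_send).capture_order("ORD-1"))
```

Both paths end in the same HTTP call; the difference is how many of those details the model has to reinvent (and can get wrong) versus how many are fixed inside the library.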


We've just launched a pilot with PayPal; it's live on the PayPal Developer Portal. Check it out!

The benchmarks

We ran 4 controlled experiments on two real-world .NET applications: nopCommerce (a mature e-commerce platform) and eShop (Microsoft's .NET reference app).

We ran the same task across the same IDE and models — with and without Context Plugins.

Integrating APIs with context plugins

Integrating APIs without context plugins

Aggregate results across all 4 experiments:

Metric            Without Plugin   With Plugin   Improvement
Errors            16 avg           1.5 avg       ↓ 91%
Prompts needed    34 avg           15.5 avg      ↓ 54%
Tokens consumed   57M avg          20M avg       ↓ 65%
Manual fixes      2.75 avg         0 avg         ↓ 100%

What went wrong without the plugin (real examples):

  • Hallucinated that SDK classes didn't exist — the agent decided OAuthAuthorizationController wasn't in the SDK and wrote a custom replacement
  • 29+ compile errors from guessed model shapes, enums, and namespaces
  • Security vulnerabilities — a URL-injection flaw allowed order capture on an ID mismatch, and a `?paid=1` query param triggered false payment approvals
  • Selected deprecated SDK versions despite being told to use v2.0.0
  • 33% hallucination rate when relying on web search for API knowledge

What happened with the plugin:

  • Zero hallucinations across all experiments
  • Zero manual fixes required to the generated code
  • Agent verified its knowledge proactively via MCP tool calls before writing code
  • Clean compilation from the first attempt in 3 of 4 experiments
  • Eliminated 200+ lines of manual HTTP boilerplate in the eShop integration
  • AI completed the integration 2x faster on average

The full case study with detailed experiment logs is here: Context Plugins Case Study

Try it yourself

We've published Context Plugins for a few APIs in our product showcase; this is the quickest way to try them out:

Try the Context Plugins Showcase

Pick an API (PayPal, Twilio, Stripe, Google Maps, Spotify, Slack, Adyen, and more), install the MCP server in your IDE, and start building.

What we learned from developers trying this

A few patterns emerged from our benchmarks and early user tests:

Trust follows grounding. Developers feel significantly better about AI output when the model is visibly pulling SDK method names and library docs, versus when it looks like it's piecing things together from web search or training data. The source of context matters as much as the correctness of the output.

The "dual responsibility" problem is real. When an AI agent has to simultaneously research an API and implement the integration, the quality of both suffers. Context Plugins separate those concerns — the MCP server brings authoritative API knowledge, the agent handles the coding. This division of labour consistently produces better results.

Developers have already hacked together solutions to this problem. They write their own skills and AGENTS.md files to try to get their AI agents to produce correct integration code for the APIs they use. This takes trial and error, and the local context has to be updated continuously as the API evolves.

I'd love to hear from the community:

  • What's the worst AI-generated API integration bug you've encountered? We're collecting failure patterns to improve our context coverage. Horror stories are especially welcome 🙃.
  • How are you currently handling API context for your coding assistants? AGENTS.md files? Custom MCP servers? Something else?