AI supply chain attacks don’t even require malware…just post poisoned documentation
A proof-of-concept attack on Context Hub suggests there's not much content santization
A new service that helps coding agents stay up to date on their API calls could be dialing in a massive supply chain vulnerability.
Two weeks ago, Andrew Ng, an AI entrepreneur and adjunct professor at Stanford, launched Context Hub, a service for supplying coding agents with API documentation.
"Coding agents often use outdated APIs and hallucinate parameters," Ng wrote in a LinkedIn post. "For example, when I ask Claude Code to call OpenAI's GPT-5.2, it uses the older chat completions API instead of the newer responses API, even though the newer one has been out for a year. Context Hub solves this."
Perhaps so. But at the same time, the service appears to provide a way to dupe coding agents by simplifying software supply chain attacks: The documentation portal can be used to poison AI agents with malicious instructions.
Mickey Shmueli, creator of an alternative curated service called lap.sh, has published a proof-of-concept attack that demonstrates the risk.
"Context Hub delivers documentation to AI agents through an MCP server," Shmueli wrote in an explanatory blog post. "Contributors submit docs as GitHub pull requests, maintainers merge them, and agents fetch the content on demand. The pipeline has zero content sanitization at every stage."
It's been known for some time in the developer community that AI models sometimes hallucinate package names, a shortcoming that security experts have shown can be exploited by uploading malicious code under the invented package name.
Shmueli's PoC cuts out the hallucination step by suggesting fake dependencies in documentation that coding agents then incorporate into configuration files (e.g. requirements.txt) and generated code.
The attacker simply creates a pull request – a submitted change to the repo – and if it gets accepted, the poisoning is complete. Currently, the chance of that happening appears to be pretty good. Among 97 closed PRs, 58 were merged.
- Age checks creep into Linux as systemd gets a DOB field
- HP's AI fly on the wall can record your in-person meetings to summarize later
- Meta cuts about 700 jobs as it shifts spending to AI
- Google unleashes Gemini AI agents on the dark web
Shmueli told The Register in an email, "The review process appears to prioritize documentation volume over security review. Doc PRs merge quickly, some by core team members themselves. I didn't find any evidence in the GitHub repo of automated scanning for executable instructions or package references in submitted docs, though I can't say for certain what happens internally."
He said he didn't submit a PR to test how Content Hub responded "because the public record showed security contributions weren't being engaged." And he pointed to several open issues and pull requests dealing with security concerns as evidence.
Ng did not immediately respond to a request for comment.
"The agent fetches documentation from [Context Hub], reads the poisoned content, and builds the project," Shmueli said in his post. "The response looks completely normal. Working code. Clean instructions. No warnings."
None of this is particularly surprising given that it's simply a variation on the unsolved risk of AI models – indirect prompt injection. When AI models process content, they cannot reliably distinguish between data and system instructions.
For the PoC, two poisoned documents were created, one for Plaid Link and one for Stripe Checkout, each of which contained a fake PyPI package name.
In 40 runs, Anthropopic's Haiku model wrote the malicious package cited in the docs into the project's requirement.txt file every time, without any mention of that in its output. The company's Sonnet model did better, issuing warnings in 48 percent of the runs (19/40) but still wrote the malicious library into requirements.txt 53 percent of the time (21/40). The AI biz's top-of-the-line Opus model did better still, issuing warnings 75 percent of the time (30/40) and didn't end up writing the bad dependency to the requirements.txt file or code.
Shmueli said Opus "is trained better, on more packages, and it's more sophisticated."
So while higher-end commercial models appear to be capable of catching fabulated dependencies, the problem is broader than just Context Hub. According to Shmueli, all the other systems for making community-authored documentation available to AI models fall short when it comes to content sanitization.
Exposure to untrusted content is one of the three risks cited by developer Simon Willison in his lethal trifecta AI security model. So given unvetted documentation as the status quo, you'd be well-advised to ensure either that your AI agent has no network access, or at the very least no access to private data. ®
More about
More about
Narrower topics
- 2FA
- Accessibility
- AdBlock Plus
- Advanced persistent threat
- AIOps
- App
- Application Delivery Controller
- Audacity
- Authentication
- BEC
- Black Hat
- BSides
- Bug Bounty
- Center for Internet Security
- CHERI
- CISO
- Common Vulnerability Scoring System
- Confluence
- Cybercrime
- Cybersecurity
- Cybersecurity and Infrastructure Security Agency
- Cybersecurity Information Sharing Act
- Database
- Data Breach
- Data Protection
- Data Theft
- DDoS
- DeepSeek
- DEF CON
- Devops
- Digital certificate
- Encryption
- End Point Protection
- Exploit
- Firewall
- FOSDEM
- FOSS
- Gemini
- Google AI
- Google Project Zero
- GPT-3
- GPT-4
- Grab
- Graphics Interchange Format
- Hacker
- Hacking
- Hacktivism
- IDE
- Identity Theft
- Image compression
- Incident response
- Infosec
- Infrastructure Security
- Jenkins
- Kenna Security
- Large Language Model
- Legacy Technology
- LibreOffice
- Machine Learning
- Map
- MCubed
- Microsoft 365
- Microsoft Office
- Microsoft Teams
- Mobile Device Management
- NCSAM
- NCSC
- Neural Networks
- NLP
- OpenOffice
- Palo Alto Networks
- Password
- Personally Identifiable Information
- Phishing
- Programming Language
- QR code
- Quantum key distribution
- Ransomware
- Remote Access Trojan
- Retrieval Augmented Generation
- Retro computing
- REvil
- RSA Conference
- Search Engine
- Software Bill of Materials
- Software bug
- Software License
- Spamming
- Spyware
- Star Wars
- Surveillance
- Tensor Processing Unit
- Text Editor
- TLS
- TOPS
- Trojan
- Trusted Platform Module
- User interface
- Visual Studio
- Visual Studio Code
- Vulnerability
- Wannacry
- WebAssembly
- Web Browser
- WordPress
- Zero trust
Broader topics
More about
More about
More about
Narrower topics
- 2FA
- Accessibility
- AdBlock Plus
- Advanced persistent threat
- AIOps
- App
- Application Delivery Controller
- Audacity
- Authentication
- BEC
- Black Hat
- BSides
- Bug Bounty
- Center for Internet Security
- CHERI
- CISO
- Common Vulnerability Scoring System
- Confluence
- Cybercrime
- Cybersecurity
- Cybersecurity and Infrastructure Security Agency
- Cybersecurity Information Sharing Act
- Database
- Data Breach
- Data Protection
- Data Theft
- DDoS
- DeepSeek
- DEF CON
- Devops
- Digital certificate
- Encryption
- End Point Protection
- Exploit
- Firewall
- FOSDEM
- FOSS
- Gemini
- Google AI
- Google Project Zero
- GPT-3
- GPT-4
- Grab
- Graphics Interchange Format
- Hacker
- Hacking
- Hacktivism
- IDE
- Identity Theft
- Image compression
- Incident response
- Infosec
- Infrastructure Security
- Jenkins
- Kenna Security
- Large Language Model
- Legacy Technology
- LibreOffice
- Machine Learning
- Map
- MCubed
- Microsoft 365
- Microsoft Office
- Microsoft Teams
- Mobile Device Management
- NCSAM
- NCSC
- Neural Networks
- NLP
- OpenOffice
- Palo Alto Networks
- Password
- Personally Identifiable Information
- Phishing
- Programming Language
- QR code
- Quantum key distribution
- Ransomware
- Remote Access Trojan
- Retrieval Augmented Generation
- Retro computing
- REvil
- RSA Conference
- Search Engine
- Software Bill of Materials
- Software bug
- Software License
- Spamming
- Spyware
- Star Wars
- Surveillance
- Tensor Processing Unit
- Text Editor
- TLS
- TOPS
- Trojan
- Trusted Platform Module
- User interface
- Visual Studio
- Visual Studio Code
- Vulnerability
- Wannacry
- WebAssembly
- Web Browser
- WordPress
- Zero trust
