Semantic Chameleon: Corpus-Dependent Poisoning Attacks and Defenses in RAG Systems
arXiv cs.AI / 3/20/2026
Key Points
- The paper examines gradient-guided corpus poisoning attacks in Retrieval-Augmented Generation (RAG) systems, showing attackers can manipulate the retrieval corpus to bias model outputs.
- It introduces dual-document poisoning, in which a sleeper document and a trigger document are optimized with Greedy Coordinate Gradient (GCG). Across 50 attack attempts on a 67,941-document Security Stack Exchange corpus, the pair achieves a 38.0% co-retrieval rate under pure vector retrieval.
- A simple defense, hybrid retrieval that combines BM25 with vector similarity, lowers attack success from 38% to 0% without modifying the LLM or retraining the retriever. Attackers can still partially circumvent it if their payloads target both sparse and dense signals.
- Cross-model evaluation across GPT-5.3, GPT-4o, Claude Sonnet 4.6, Llama 4, and GPT-4o-mini shows attack success ranging from 46.7% to 93.3%, while cross-corpus experiments on FEVER yield 0% attack success across configurations, suggesting that attack success, and therefore how robust the defense appears, depends on both the dataset and the model.
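
To make the hybrid-retrieval idea in the key points concrete, here is a minimal sketch of fusing BM25 with vector similarity. It is not the paper's implementation: the fusion rule (min-max normalization plus a weighted sum with a hypothetical `alpha`), the whitespace tokenization, the four toy documents, and the hand-made 2-D "embeddings" are all assumptions chosen for illustration. Document 2 stands in for a poisoned passage whose embedding sits close to the query but whose text shares no terms with it.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized document for the query."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            length_norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def min_max(xs):
    """Rescale scores to [0, 1] so sparse and dense scores are comparable."""
    lo, hi = min(xs), max(xs)
    return [0.0 if hi == lo else (x - lo) / (hi - lo) for x in xs]


def hybrid_rank(query_tokens, query_vec, docs_tokens, doc_vecs, alpha=0.5, top_k=2):
    """Return indices of the top_k documents under a weighted sparse+dense score.

    alpha is an assumed fusion weight: 0.0 is pure vector retrieval,
    1.0 is pure BM25.
    """
    sparse = min_max(bm25_scores(query_tokens, docs_tokens))
    dense = min_max([cosine(query_vec, v) for v in doc_vecs])
    fused = [alpha * s + (1 - alpha) * d for s, d in zip(sparse, dense)]
    order = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
    return order[:top_k]


# Toy corpus (hypothetical). Document 2 plays the poisoned passage.
docs = [
    "how to rotate ssh keys safely",
    "rotate ssh keys on linux servers",
    "zq xv lorem token soup",
    "baking bread at home",
]
docs_tokens = [d.split() for d in docs]
doc_vecs = [[0.8, 0.6], [0.7, 0.7], [1.0, 0.05], [0.0, 1.0]]  # hand-made vectors
query_tokens = "rotate ssh keys".split()
query_vec = [1.0, 0.0]

dense_only = hybrid_rank(query_tokens, query_vec, docs_tokens, doc_vecs, alpha=0.0)
hybrid = hybrid_rank(query_tokens, query_vec, docs_tokens, doc_vecs, alpha=0.5)
print("dense only:", dense_only)  # [2, 0]: the poisoned document ranks first
print("hybrid:    ", hybrid)      # [0, 1]: the poisoned document drops out
```

In this toy setting the poisoned document wins under pure vector retrieval because only its embedding was tuned, and it falls out of the top results once the lexical score counts for half of the ranking. A payload that also matched the query terms would score well on both signals, which matches the caveat above about attacks that target sparse and dense retrieval together.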