Adaptive Defense Orchestration for RAG: A Sentinel-Strategist Architecture against Multi-Vector Attacks

arXiv cs.AI / 4/25/2026


Key Points

  • The paper highlights that RAG systems used in sensitive domains (e.g., healthcare and law) face security risks such as membership inference, data poisoning, and unintended content leakage.
  • It finds that enabling a full, always-on defense stack severely hurts RAG utility: in the authors' experiments, contextual recall drops by more than 40%, which points to retrieval degradation as the primary failure mode.
  • To address this security–utility trade-off, the authors propose the Sentinel-Strategist architecture for adaptive defense orchestration (ADO), in which a Sentinel detects anomalous retrieval behavior and a Strategist then deploys only the defenses warranted by each query's context.
  • Across three benchmark datasets and five orchestration models, ADO eliminates MBA-style membership-inference leakage while recovering retrieval utility to near the undefended baseline. Under data poisoning, the strongest ADO variants drive attack success to near zero while restoring contextual recall to more than 75% of the undefended baseline, although robustness remains sensitive to the choice of orchestration model.
  • Overall, the results suggest that adaptive, query-aware defense orchestration can substantially reduce the security–utility trade-off that static, always-on defenses impose.
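
The Sentinel/Strategist split described above can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the defense names, the two risk signals, the thresholds, and the rule-based Strategist are hypothetical and do not come from the paper, which evaluates five orchestration models in that role and does not list its defense stack here.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Defense(Enum):
    # Hypothetical defense names; the source does not enumerate the real stack.
    RETRIEVAL_FILTER = auto()
    OUTPUT_REDACTION = auto()


@dataclass
class RiskReport:
    membership_probe: float  # 0..1, query looks like a membership-inference probe
    poisoning: float         # 0..1, retrieved set looks manipulated


def sentinel(similarities: list[float]) -> RiskReport:
    """Toy anomaly detector over retrieval similarity scores (assumed heuristics)."""
    if not similarities:
        return RiskReport(0.0, 0.0)
    ranked = sorted(similarities, reverse=True)
    top, rest = ranked[0], ranked[1:]
    # Assumed signal 1: a near-exact top match may indicate a verbatim probe.
    probe = 1.0 if top >= 0.95 else 0.0
    # Assumed signal 2: one hit far above all others may indicate an injected passage.
    mean_rest = sum(rest) / len(rest) if rest else top
    poison = 1.0 if top - mean_rest >= 0.5 else 0.0
    return RiskReport(probe, poison)


def strategist(report: RiskReport, threshold: float = 0.5) -> set[Defense]:
    """Rule-based stand-in for the Strategist: enable only the flagged defenses."""
    selected: set[Defense] = set()
    if report.membership_probe >= threshold:
        selected.add(Defense.OUTPUT_REDACTION)
    if report.poisoning >= threshold:
        selected.add(Defense.RETRIEVAL_FILTER)
    return selected


def orchestrate(similarities: list[float]) -> set[Defense]:
    """Run the Sentinel, then let the Strategist pick defenses for this query."""
    return strategist(sentinel(similarities))
```

In this sketch a benign query with unremarkable scores, such as `orchestrate([0.6, 0.55, 0.5])`, enables no defenses at all, which is the mechanism by which an adaptive scheme would avoid the retrieval cost of an always-on stack.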

Abstract

Retrieval-augmented generation (RAG) systems are increasingly deployed in sensitive domains such as healthcare and law, where they rely on private, domain-specific knowledge. This capability introduces significant security risks, including membership inference, data poisoning, and unintended content leakage. A straightforward mitigation is to enable all relevant defenses simultaneously, but doing so incurs a substantial utility cost. In our experiments, an always-on defense stack reduces contextual recall by more than 40%, indicating that retrieval degradation is the primary failure mode. To mitigate this trade-off in RAG systems, we propose the Sentinel-Strategist architecture, a context-aware framework for risk analysis and defense selection. A Sentinel detects anomalous retrieval behavior, after which a Strategist selectively deploys only the defenses warranted by the query context. Evaluated across three benchmark datasets and five orchestration models, this adaptive defense orchestration (ADO) is shown to eliminate MBA-style membership inference leakage while substantially recovering retrieval utility relative to a fully static defense stack, approaching undefended baseline levels. Under data poisoning, the strongest ADO variants reduce attack success to near zero while restoring contextual recall to more than 75% of the undefended baseline, although robustness remains sensitive to model choice. Overall, these findings show that adaptive, query-aware defense can substantially reduce the security-utility trade-off in RAG systems.
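
The recall figures in the abstract are stated relative to an undefended baseline. The short Python sketch below shows how such a relative figure could be computed. It assumes a simplified set-based definition of contextual recall (the share of relevant passages that were retrieved); the abstract does not say how the paper computes the metric, and the document IDs are made up for illustration, not taken from the paper's data.

```python
def contextual_recall(retrieved: set[str], relevant: set[str]) -> float:
    """Simplified, assumed definition: share of relevant passages retrieved."""
    if not relevant:
        raise ValueError("relevant set must not be empty")
    return len(retrieved & relevant) / len(relevant)


def relative_recall(defended: float, baseline: float) -> float:
    """Recall under a defense, expressed as a fraction of the undefended baseline."""
    if baseline <= 0:
        raise ValueError("baseline recall must be positive")
    return defended / baseline


# Illustrative, made-up document IDs (not results from the paper).
relevant = {"d1", "d2", "d3", "d4"}
baseline = contextual_recall({"d1", "d2", "d3", "d4"}, relevant)     # undefended
static = contextual_recall({"d1", "d2", "x1"}, relevant)             # always-on stack
adaptive = contextual_recall({"d1", "d2", "d3", "x1"}, relevant)     # adaptive selection

static_drop = 1.0 - relative_recall(static, baseline)
adaptive_share = relative_recall(adaptive, baseline)
```

With these toy inputs the static stack loses half of the baseline recall and the adaptive variant retains three quarters of it, which is the kind of comparison the abstract's "more than 40%" and "more than 75% of the undefended baseline" statements describe.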