Supercharging Federated Intelligence Retrieval

arXiv cs.CL / 3/27/2026


Key Points

  • The paper addresses a key limitation of conventional retrieval-augmented generation (RAG) by enabling retrieval when documents are distributed across private silos rather than centrally accessible.
  • It proposes a secure Federated RAG architecture using Flower, where each silo performs local retrieval while server-side aggregation and text generation occur inside an attested confidential compute environment.
  • The design targets confidential remote LLM inference even under honest-but-curious or potentially compromised servers by leveraging enclave-style attestation and protected execution.
  • It introduces a cascading inference method that can use a non-confidential third-party model (e.g., Amazon Nova) as auxiliary context without compromising the confidentiality guarantees.

Abstract

RAG typically assumes centralized access to documents, which breaks down when knowledge is distributed across private data silos. We propose a secure Federated RAG system built using Flower that performs local silo retrieval, while server-side aggregation and text generation run inside an attested, confidential compute environment, enabling confidential remote LLM inference even in the presence of honest-but-curious or compromised servers. We also propose a cascading inference approach that incorporates a non-confidential third-party model (e.g., Amazon Nova) as auxiliary context without weakening confidentiality.