Salesforce AI Research Releases VoiceAgentRAG: A Dual-Agent Memory Router that Cuts Voice RAG Retrieval Latency by 316x

MarkTechPost / 3/30/2026



Key Points

Salesforce AI Research introduced VoiceAgentRAG, a voice-focused Retrieval-Augmented Generation approach designed to meet a ~200ms response-time budget for natural conversations.
The system uses a dual-agent “memory router” to decide how each retrieval or memory query is served, reportedly cutting voice RAG retrieval latency by a factor of 316 compared with typical vector database queries.
The work targets a key bottleneck in voice assistants: vector retrieval latency that is acceptable in text chat but problematic for real-time speech interactions.
By optimizing the retrieval step instead of focusing only on the LLM generation loop, VoiceAgentRAG aims to improve perceived responsiveness and conversational quality in production voice AI deployments.
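
The summary above does not describe how the dual-agent router works internally, so the following is only a rough sketch of one way such routing could look, not Salesforce's implementation. It assumes a foreground path that answers from a small in-memory cache and a background path that prefetches results for likely follow-up questions, with a fallback to the slower vector database on a miss. Every name here (`MemoryRouter`, `prefetch`, `slow_retrieve`, the `threshold` value) is hypothetical, and word-overlap similarity stands in for real embeddings.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


def tokens(text: str) -> frozenset:
    """Lowercased word set; a stand-in for a real embedding (assumption)."""
    return frozenset(text.lower().split())


def jaccard(a: frozenset, b: frozenset) -> float:
    """Set-overlap similarity in [0, 1]."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


@dataclass
class MemoryRouter:
    """Hypothetical router: serve from a local cache first, vector DB second."""

    slow_retrieve: Callable[[str], List[str]]  # e.g. a remote vector DB query
    threshold: float = 0.5                     # illustrative value, not from the source
    cache: List[Tuple[frozenset, List[str]]] = field(default_factory=list)

    def prefetch(self, predicted_query: str) -> None:
        """Background role: fetch documents for a likely follow-up ahead of time."""
        docs = self.slow_retrieve(predicted_query)
        self.cache.append((tokens(predicted_query), docs))

    def retrieve(self, query: str) -> Tuple[str, List[str]]:
        """Foreground role: return (source, docs), preferring the cache."""
        q = tokens(query)
        best_score, best_docs = 0.0, []
        for key, docs in self.cache:
            score = jaccard(q, key)
            if score > best_score:
                best_score, best_docs = score, docs
        if best_score >= self.threshold:
            return "cache", best_docs
        # Cache miss: pay the slow retrieval cost once, then remember the result.
        docs = self.slow_retrieve(query)
        self.cache.append((q, docs))
        return "vector_db", docs
```

In this sketch the latency saving comes entirely from how often the foreground path hits the cache, so the quality of the background predictions matters more than the speed of the vector database itself.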
