Resolving the Robustness-Precision Trade-off in Financial RAG through Hybrid Document-Routed Retrieval

arXiv cs.CL / 3/31/2026

💬 OpinionIdeas & Deep AnalysisModels & Research

共有:

Key Points

Financial RAG systems that use chunk-based similarity retrieval can suffer from cross-document chunk confusion in structurally similar corpora like regulatory filings.
Semantic File Routing (SFR) improves robustness by routing queries to whole documents with LLM-structured outputs, but it reduces the precision achievable with targeted chunk retrieval.
An evaluation on the FinDER benchmark finds a clear robustness–precision trade-off: SFR has higher average scores and fewer failures, while chunk-based retrieval achieves more perfect answers.
The paper introduces Hybrid Document-Routed Retrieval (HDRR), a two-stage approach that uses SFR to filter to relevant documents and then applies chunk retrieval within those scoped documents.
HDRR outperforms both baselines across all metrics, reaching the highest average score and precision while achieving the lowest failure rate, indicating it resolves the trade-off effectively.

Abstract

Retrieval-Augmented Generation (RAG) systems for financial document question answering typically follow a chunk-based paradigm: documents are split into fragments, embedded into vector space, and retrieved via similarity search. While effective in general settings, this approach suffers from cross-document chunk confusion in structurally homogeneous corpora such as regulatory filings. Semantic File Routing (SFR), which uses LLM structured output to route queries to whole documents, reduces catastrophic failures but sacrifices the precision of targeted chunk retrieval. We identify this robustness-precision trade-off through controlled evaluation on the FinDER benchmark (1,500 queries across five groups): SFR achieves higher average scores (6.45 vs. 6.02) and fewer failures (10.3% vs. 22.5%), while chunk-based retrieval (CBR) yields more perfect answers (13.8% vs. 8.5%). To resolve this trade-off, we propose Hybrid Document-Routed Retrieval (HDRR), a two-stage architecture that uses SFR as a document filter followed by chunk-based retrieval scoped to the identified document(s). HDRR eliminates cross-document confusion while preserving targeted chunk precision. Experimental results demonstrate that HDRR achieves the best performance on every metric: an average score of 7.54 (25.2% above CBR, 16.9% above SFR), a failure rate of only 6.4%, a correctness rate of 67.7% (+18.7 pp over CBR), and a perfect-answer rate of 20.1% (+6.3 pp over CBR, +11.6 pp over SFR). HDRR resolves the trade-off by simultaneously achieving the lowest failure rate and the highest precision across all five experimental groups.

💡 Insights using this article

This article is featured in our daily AI news digest — key takeaways and action items at a glance.

📅 3/31DailyView insight →

[D] How does distributed proof of work computing handle the coordination needs of neural network training?

Reddit r/MachineLearning

BYOK is not just a pricing model: why it changes AI product trust

Dev.to

AI Citation Registries and Identity Persistence Across Records

Dev.to

Building Real-Time AI Voice Agents with Google Gemini 3.1 Flash Live and VideoSDK

Dev.to

Your Knowledge, Your Model: A Method for Deterministic Knowledge Externalization

Dev.to

Resolving the Robustness-Precision Trade-off in Financial RAG through Hybrid Document-Routed Retrieval

Key Points

Abstract

💡 Insights using this article

Related Articles

[D] How does distributed proof of work computing handle the coordination needs of neural network training?

BYOK is not just a pricing model: why it changes AI product trust

AI Citation Registries and Identity Persistence Across Records

Building Real-Time AI Voice Agents with Google Gemini 3.1 Flash Live and VideoSDK

Your Knowledge, Your Model: A Method for Deterministic Knowledge Externalization

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer