AI Navigate

FGTR: Fine-Grained Multi-Table Retrieval via Hierarchical LLM Reasoning

arXiv cs.CL / 3/16/2026

💬 Opinion / Models & Research

Key Points

  • FGTR is a hierarchical, fine-grained multi-table retrieval method for LLM-based table tasks. It targets two weaknesses of prior single-table approaches: coarse whole-table encoding and poor efficiency on large tables.
  • The method first identifies relevant schema elements and then retrieves the corresponding cell contents to construct a concise sub-table aligned with the query.
  • On two new benchmark datasets built from Spider and BIRD, FGTR improves the F_2 metric by 18% on Spider and 21% on BIRD over previous state-of-the-art methods.
  • The approach demonstrates potential to enhance end-to-end performance on table-based downstream tasks by enabling more accurate, fine-grained retrieval across multiple tables.
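
For context on the reported metric: F_2 is the F-beta score with beta = 2, which weights recall more heavily than precision. The sketch below computes it over sets of retrieved and gold items. Treating (table, column) pairs as the unit of comparison, the function name `f_beta`, and the toy data are all assumptions made for illustration, not details taken from the paper.

```python
def f_beta(retrieved: set, gold: set, beta: float = 2.0) -> float:
    """F-beta score over two sets; beta > 1 favors recall over precision."""
    true_positives = len(retrieved & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(retrieved)
    recall = true_positives / len(gold)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical gold annotation: the schema elements a query actually needs.
gold = {("singer", "name"), ("singer", "age"), ("concert", "year")}

# Case A: everything needed is retrieved, plus one irrelevant column.
extra_item = gold | {("concert", "venue")}

# Case B: nothing irrelevant is retrieved, but one needed column is missed.
missing_item = {("singer", "name"), ("singer", "age")}

score_extra = f_beta(extra_item, gold)      # precision 0.75, recall 1.0
score_missing = f_beta(missing_item, gold)  # precision 1.0, recall 2/3
```

With beta = 2, the retrieval that misses a needed item scores lower than the one that includes an extra item. That matches the goal of a retriever feeding a downstream task, where omitted data is usually costlier than a little surplus.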

Abstract

With the rapid advancement of large language models (LLMs), growing efforts have been devoted to LLM-based table retrieval. However, existing studies typically focus on single-table queries and implement retrieval by similarity matching after encoding the entire table. These methods usually yield low accuracy because their coarse-grained encoding incorporates much query-irrelevant data. They are also inefficient on large tables and fail to fully utilize the reasoning capabilities of LLMs. Further, multi-table queries are under-explored in retrieval tasks. To this end, we propose a hierarchical LLM-based multi-table query method: Fine-Grained Multi-Table Retrieval (FGTR), a new retrieval paradigm that employs a human-like reasoning strategy. Through hierarchical reasoning, FGTR first identifies relevant schema elements and then retrieves the corresponding cell contents, ultimately constructing a concise and accurate sub-table that aligns with the given query. To comprehensively evaluate the performance of FGTR, we construct two new benchmark datasets based on Spider and BIRD. Experimental results show that FGTR outperforms previous state-of-the-art methods, improving the F_2 metric by 18% on Spider and 21% on BIRD, demonstrating its effectiveness in enhancing fine-grained retrieval and its potential to improve end-to-end performance on table-based downstream tasks.
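
The two-stage flow described in the abstract (schema elements first, then cell contents) can be sketched roughly as follows. This is a toy illustration, not the paper's implementation: FGTR uses LLM reasoning at each stage, whereas the hypothetical functions `select_schema` and `select_cells` below substitute plain keyword matching so the example runs on its own. The database layout and all names are assumptions.

```python
import re
from typing import Dict, List

Row = Dict[str, object]
Database = Dict[str, List[Row]]


def tokens(text: str) -> set:
    """Lowercased word tokens of a query."""
    return set(re.findall(r"\w+", text.lower()))


def select_schema(query: str, db: Database) -> Dict[str, List[str]]:
    """Stage 1 stand-in: keep columns whose name appears in the query.
    (An LLM would reason over the schema here; this is keyword matching.)"""
    words = tokens(query)
    picked = {}
    for table, rows in db.items():
        columns = list(rows[0]) if rows else []
        matched = [c for c in columns if c.lower() in words]
        if matched:
            picked[table] = matched
    return picked


def select_cells(query: str, db: Database, schema: Dict[str, List[str]]) -> Database:
    """Stage 2 stand-in: keep rows containing a value mentioned in the query,
    falling back to all rows, then project onto the picked columns."""
    words = tokens(query)
    sub_table = {}
    for table, columns in schema.items():
        rows = db[table]
        hits = [r for r in rows if any(str(v).lower() in words for v in r.values())]
        chosen = hits or rows
        sub_table[table] = [{c: r[c] for c in columns} for r in chosen]
    return sub_table


def retrieve_subtable(query: str, db: Database) -> Database:
    """Run both stages and return the query-aligned sub-table."""
    return select_cells(query, db, select_schema(query, db))


# Hypothetical two-table database.
db = {
    "singer": [
        {"singer_id": 1, "name": "Joe", "age": 52},
        {"singer_id": 2, "name": "Ann", "age": 29},
    ],
    "concert": [
        {"concert_id": 10, "singer_id": 1, "year": 2014, "venue": "Arena"},
        {"concert_id": 11, "singer_id": 2, "year": 2015, "venue": "Hall"},
    ],
}

query = "What is the name of the singer whose concert year is 2014?"
sub_table = retrieve_subtable(query, db)
```

Here the result keeps only `singer.name` and the single `concert.year` cell equal to 2014, rather than both full tables. The keyword stand-in also drops the `singer_id` join key, which shows why simple matching is only a placeholder for the reasoning step.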