Towards Robust Real-World Spreadsheet Understanding with Multi-Agent Multi-Format Reasoning

arXiv cs.CL / 4/15/2026

Key Points

The paper argues that existing LLM-based spreadsheet understanding often fails by treating tables as plain text and missing key layout and visual semantics needed for real-world auditing and reporting.
It introduces SpreadsheetAgent, a two-stage multi-agent framework that performs incremental, localized reading using multiple modalities (e.g., code execution results, images, and LaTeX table content) rather than ingesting entire large spreadsheets at once.
In the first stage, SpreadsheetAgent builds a structural “sketch” with row/column summaries, and in the second stage it executes task-driven reasoning over this intermediate representation.
To improve reliability, the system includes a verification module that performs targeted inspections to validate extracted structures and reduce downstream error propagation.
Experiments on two datasets show improved benchmark performance: with GPT-OSS-120B, SpreadsheetAgent reaches 38.16% on Spreadsheet Bench versus 35.27% for a ChatGPT Agent baseline, a gain of 2.89 absolute points. The authors have released the code publicly.
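
The two-stage flow described in the points above can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Sketch`, `build_sketch`, `verify_sketch`, and `sum_column`, the header heuristic, and the windowed scan are all assumptions made for illustration, and no LLM or agent is involved. It only shows the shape of the idea: summarize the sheet incrementally, check the summary with a targeted inspection, then answer a task from the intermediate representation.

```python
from dataclasses import dataclass, field


@dataclass
class Sketch:
    """Hypothetical intermediate representation of one sheet."""
    n_rows: int
    n_cols: int
    header_row: int  # index of the row guessed to hold column names
    column_summaries: dict = field(default_factory=dict)


def _kind(cell):
    """Classify a cell as 'empty', 'numeric', or 'text'."""
    if cell is None or cell == "":
        return "empty"
    if isinstance(cell, (int, float)) and not isinstance(cell, bool):
        return "numeric"
    return "text"


def build_sketch(grid, window=50):
    """Stage 1 (illustrative): summarize columns one row window at a time."""
    n_rows = len(grid)
    n_cols = max((len(row) for row in grid), default=0)
    # Assumed heuristic: the header is the first row made only of non-empty strings.
    header_row = next(
        (i for i, row in enumerate(grid)
         if row and all(isinstance(c, str) and c for c in row)),
        0,
    )
    header = grid[header_row] if grid else []
    sketch = Sketch(n_rows, n_cols, header_row)
    for start in range(header_row + 1, n_rows, window):
        for row in grid[start:start + window]:  # only this window is read
            for j, cell in enumerate(row):
                name = header[j] if j < len(header) else f"col{j}"
                counts = sketch.column_summaries.setdefault(
                    name, {"numeric": 0, "text": 0, "empty": 0}
                )
                counts[_kind(cell)] += 1
    return sketch


def verify_sketch(grid, sketch):
    """Targeted inspection: re-read only the header row and check row coverage."""
    problems = []
    header = grid[sketch.header_row] if grid else []
    expected = sketch.n_rows - sketch.header_row - 1
    for name, counts in sketch.column_summaries.items():
        if name not in header:
            problems.append(f"{name!r} is not in the header row")
        covered = sum(counts.values())
        if covered != expected:
            problems.append(f"{name!r} covers {covered} of {expected} data rows")
    return problems


def sum_column(grid, sketch, column):
    """Stage 2 (illustrative): use the sketch to locate a column, then total it."""
    j = grid[sketch.header_row].index(column)
    return sum(
        row[j] for row in grid[sketch.header_row + 1:]
        if j < len(row) and _kind(row[j]) == "numeric"
    )
```

A caller would typically run `build_sketch`, proceed to `sum_column` only if `verify_sketch` returns an empty list, and otherwise re-inspect the flagged columns. Ragged rows, for example, show up as a coverage mismatch rather than silently skewing the total.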

Abstract

Spreadsheets are central to real-world applications such as enterprise reporting, auditing, and scientific data management. Despite their ubiquity, existing large language model (LLM)-based approaches typically treat tables as plain text, overlooking critical layout cues and visual semantics. Moreover, real-world spreadsheets are often massive in scale, exceeding the input length that LLMs can efficiently process. To address these challenges, we propose SpreadsheetAgent, a two-stage multi-agent framework for spreadsheet understanding that adopts a step-by-step reading and reasoning paradigm. Instead of loading the entire spreadsheet at once, SpreadsheetAgent incrementally interprets localized regions through multiple modalities, including code execution results, images, and LaTeX tables. The method first constructs a structural sketch and row/column summaries, and then performs task-driven reasoning over this intermediate representation in the Solving Stage. To further enhance reliability, we design a verification module that validates extracted structures via targeted inspections, reducing error propagation and ensuring trustworthy inputs for downstream reasoning. Extensive experiments on two spreadsheet datasets demonstrate the effectiveness of our approach. With GPT-OSS-120B, SpreadsheetAgent achieves 38.16% on Spreadsheet Bench, outperforming the ChatGPT Agent baseline (35.27%) by 2.89 absolute points. These results highlight the potential of SpreadsheetAgent to advance robust and scalable spreadsheet understanding in real-world applications. Code is available at https://github.com/renhouxing/SpreadsheetAgent.git.