Towards Robust Real-World Spreadsheet Understanding with Multi-Agent Multi-Format Reasoning
arXiv cs.CL / 4/15/2026
Key Points
- The paper argues that existing LLM-based spreadsheet understanding often fails by treating tables as plain text and missing key layout and visual semantics needed for real-world auditing and reporting.
- It introduces SpreadsheetAgent, a two-stage multi-agent framework that performs incremental, localized reading using multiple modalities (e.g., code execution results, images, and LaTeX table content) rather than ingesting entire large spreadsheets at once.
- In the first stage, SpreadsheetAgent builds a structural “sketch” with row/column summaries, and in the second stage it executes task-driven reasoning over this intermediate representation.
- To improve reliability, the system includes a verification module that performs targeted inspections to validate extracted structures and reduce downstream error propagation.
- Experiments on two datasets show improved benchmark performance: on Spreadsheet Bench, SpreadsheetAgent reaches 38.16% versus 35.27% for a ChatGPT Agent baseline, a gain of 2.89 percentage points. The authors release their code publicly.
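
The two-stage flow described above (sketch first, then task-driven localized reading with a verification step) can be illustrated with a minimal sketch. This is not the authors' code: every name here (`Sketch`, `build_sketch`, `verify_header`, `read_region`, `sum_column`) is hypothetical, the sheet is assumed to be a small rectangular grid of Python values, and simple deterministic rules stand in for the LLM agents and the image, code-execution, and LaTeX modalities the paper uses.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

Cell = Union[str, int, float, None]
Grid = List[List[Cell]]  # assumed rectangular: every row has the same length


def is_number(cell: Cell) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(cell, (int, float)) and not isinstance(cell, bool)


@dataclass
class Sketch:
    """Hypothetical stage-1 output: per-row and per-column summaries."""
    n_rows: int
    n_cols: int
    row_stats: List[Tuple[int, int]]  # (filled cells, numeric cells) per row
    col_stats: List[Tuple[int, int]]  # (filled cells, numeric cells) per column


def stats(cells: List[Cell]) -> Tuple[int, int]:
    filled = [c for c in cells if c is not None]
    return len(filled), sum(is_number(c) for c in filled)


def build_sketch(grid: Grid) -> Sketch:
    """Stage 1: summarize structure without keeping the cell contents."""
    n_cols = len(grid[0]) if grid else 0
    rows = [stats(r) for r in grid]
    cols = [stats([r[j] for r in grid]) for j in range(n_cols)]
    return Sketch(len(grid), n_cols, rows, cols)


def read_region(grid: Grid, top: int, left: int, bottom: int, right: int) -> Grid:
    """Localized read: return only the requested rectangle (end-exclusive)."""
    return [row[left:right] for row in grid[top:bottom]]


def verify_header(grid: Grid, sketch: Sketch, row: int) -> bool:
    """Targeted inspection: a header row is all text and sits above numeric data."""
    if row + 1 >= sketch.n_rows:
        return False
    header = read_region(grid, row, 0, row + 1, sketch.n_cols)[0]
    below = read_region(grid, row + 1, 0, row + 2, sketch.n_cols)[0]
    return all(isinstance(c, str) for c in header) and any(is_number(c) for c in below)


def sum_column(grid: Grid, column_name: str) -> float:
    """Stage 2: answer a task using the sketch plus small, verified reads."""
    sketch = build_sketch(grid)
    for row, (filled, numeric) in enumerate(sketch.row_stats):
        if filled == 0 or numeric > 0:
            continue  # the sketch says this row cannot be a header
        if not verify_header(grid, sketch, row):
            continue  # candidate rejected, e.g. a title row above the table
        col = grid[row].index(column_name)
        region = read_region(grid, row + 1, col, sketch.n_rows, col + 1)
        return float(sum(r[0] for r in region if is_number(r[0])))
    raise ValueError("no verified header row found")
```

For a sheet whose first row is a title and whose second row is the real header, the verification step rejects the title row before any values are read, which is the kind of error propagation the verification module is meant to prevent:

```python
grid = [
    ["Q1 Report", None, None],
    ["Region", "Units", "Revenue"],
    ["North", 10, 250.0],
    ["South", 4, 100.0],
]
total = sum_column(grid, "Revenue")
```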




