TableNet A Large-Scale Table Dataset with LLM-Powered Autonomous

arXiv cs.AI / 4/16/2026

Key Points

The paper introduces TableNet, a large-scale table structure recognition (TSR) dataset created from multiple sources to address limitations in current TSR dataset scale and quality.
It proposes a first-of-its-kind LLM-powered autonomous multi-agent system that generates table images using controllable visual, structural, and semantic parameters while producing coherent annotations at scale.
For model training, the authors apply a diversity-based active learning strategy that selects the most informative tables across sources to fine-tune a TSR model while reducing the number of required training samples.
Reported results indicate competitive performance on the TableNet test set and stronger generalization to web-crawled real-world tables than models trained on the prevailing table datasets.
The authors claim this is the first work to apply active learning to TSR, arguing that tables vary widely in row and column counts, merged cells, and cell contents, which makes them a good fit for a diversity-based selection strategy.
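To make the idea of diversity-based selection concrete, here is a minimal sketch using greedy farthest-point (k-center) sampling. This is a common diversity heuristic chosen for illustration, not the authors' confirmed method; the function name `k_center_greedy` and the three-number table descriptors (rows, columns, merged cells) are hypothetical.

```python
from math import dist


def k_center_greedy(features, budget, seed_index=0):
    """Greedily pick up to `budget` indices, each farthest from those already picked.

    Illustrative only: the summary does not specify the paper's selection rule.
    """
    if not features or budget <= 0:
        return []
    budget = min(budget, len(features))
    selected = [seed_index]
    # Distance from every point to its nearest selected point.
    min_dist = [dist(f, features[seed_index]) for f in features]
    while len(selected) < budget:
        # The least-covered point is the one farthest from the current selection.
        next_index = max(range(len(features)), key=min_dist.__getitem__)
        if min_dist[next_index] == 0:
            break  # Everything left duplicates an already selected point.
        selected.append(next_index)
        min_dist = [
            min(d, dist(f, features[next_index]))
            for d, f in zip(min_dist, features)
        ]
    return selected


# Hypothetical descriptors: (rows, columns, merged cells).
tables = [(3, 3, 0), (3, 4, 0), (4, 3, 0), (20, 8, 5), (21, 8, 5), (6, 12, 2)]
picked = k_center_greedy(tables, budget=3)
print(picked)
```

With a budget of three, the selection skips the near-duplicate small tables and spreads across a small table, a large table with merged cells, and a wide table. In practice the descriptors would more plausibly be learned embeddings than hand-picked counts.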

Abstract

Table Structure Recognition (TSR) requires the logical reasoning ability of large language models (LLMs) to handle complex table layouts, but current datasets are limited in scale and quality, hindering effective use of this reasoning capacity. We thus present the TableNet dataset, a new table structure recognition dataset collected and generated from multiple sources. Central to our approach is the first LLM-powered autonomous multi-agent system for table generation and recognition, which we developed. The generation part of our system integrates controllable visual, structural, and semantic parameters into the synthesis of table images. It facilitates the creation of a wide array of semantically coherent tables, adaptable to user-defined configurations and accompanied by annotations, thereby supporting large-scale and detailed dataset construction. This capability enables a comprehensive and nuanced table image annotation taxonomy, potentially advancing research in table-related domains. In contrast to traditional data collection methods, this approach facilitates the theoretically infinite, domain-agnostic, and style-flexible generation of table images, ensuring both efficiency and precision. The recognition part of our system is a diversity-based active learning paradigm that utilizes tables from multiple sources and selectively samples the most informative data to fine-tune a model. It achieves competitive performance on the TableNet test set while reducing training samples by a large margin compared with baselines, and much higher performance on web-crawled real-world tables than models trained on predominant table datasets. To the best of our knowledge, this is the first work that applies active learning to the structure recognition of tables. Tables are diverse in the number of rows and columns, merged cells, cell contents, and so on, which makes them a good fit for diversity-based active learning.
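As a rough illustration of parameter-controlled synthesis with annotations produced alongside the table, the sketch below renders a small HTML table from a specification and emits one annotation record per cell. Everything here is an assumption made for illustration: `TableSpec`, `build_table`, the merge format, and the annotation fields are hypothetical, and no LLM or image rendering is involved, unlike the system the abstract describes.

```python
from dataclasses import dataclass, field


@dataclass
class TableSpec:
    rows: int                                   # structural parameter
    cols: int                                   # structural parameter
    # Merged regions as (row, col, rowspan, colspan); hypothetical format.
    merges: list = field(default_factory=list)
    font_family: str = "serif"                  # visual parameter
    topic: str = "cell"                         # stand-in for semantic content


def build_table(spec):
    """Return (html, cells): the table markup and per-cell structure annotations."""
    spans, covered = {}, set()
    for r, c, rs, cs in spec.merges:
        spans[(r, c)] = (rs, cs)
        # Grid positions swallowed by a merged cell emit no <td> of their own.
        for i in range(r, r + rs):
            for j in range(c, c + cs):
                if (i, j) != (r, c):
                    covered.add((i, j))

    html_rows, cells = [], []
    for r in range(spec.rows):
        tds = []
        for c in range(spec.cols):
            if (r, c) in covered:
                continue
            rs, cs = spans.get((r, c), (1, 1))
            text = f"{spec.topic} {r},{c}"
            attrs = ""
            if rs > 1:
                attrs += f' rowspan="{rs}"'
            if cs > 1:
                attrs += f' colspan="{cs}"'
            tds.append(f"<td{attrs}>{text}</td>")
            cells.append(
                {"row": r, "col": c, "rowspan": rs, "colspan": cs, "text": text}
            )
        html_rows.append("<tr>" + "".join(tds) + "</tr>")

    html = (
        f'<table style="font-family:{spec.font_family}">'
        + "".join(html_rows)
        + "</table>"
    )
    return html, cells


# A 2x3 grid whose first two header positions are merged into one cell.
spec = TableSpec(rows=2, cols=3, merges=[(0, 0, 1, 2)])
html, cells = build_table(spec)
```

Because the markup and the annotations come from the same specification, they cannot drift apart, which is the property that makes synthetic generation attractive for building structure-recognition labels at scale.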