Agentar-Fin-OCR
arXiv cs.CV / 3/12/2026
Key Points
- Agentar-Fin-OCR is introduced as a document parsing system tailored to financial-domain documents that converts ultra-long PDFs into semantically consistent, structured outputs with auditing-grade provenance.
- It combines Cross-page Contents Consolidation, which restores continuity across page breaks, with Document-level Heading Hierarchy Reconstruction, which builds a globally consistent table of contents (TOC) for structure-aware retrieval. It also uses a difficulty-adaptive curriculum learning strategy and a CellBBoxRegressor that localizes table cells from decoder states without external detectors.
- The work introduces FinDocBench, a benchmark covering six financial document categories, with metrics such as TocEDS, cross-page TEDS, and Table Cell IoU for evaluating table parsing across financial documents.
- Experiments report state-of-the-art results on FinDocBench, and the authors position Agentar-Fin-OCR as a practical foundation for reliable downstream financial document applications.
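
The summary names Table Cell IoU as a metric but does not define it. As a rough illustration only, the sketch below computes the standard intersection-over-union between predicted and reference cell boxes and averages it over a table. The function names (`box_iou`, `mean_cell_iou`), the `(x_min, y_min, x_max, y_max)` box layout, and the assumption that cells are already matched one-to-one by index are all hypothetical; the benchmark's actual matching and aggregation rules may differ.

```python
from typing import List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in page coordinates.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if any.
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero so disjoint boxes yield no intersection area.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_cell_iou(pred: List[Box], gold: List[Box]) -> float:
    """Average IoU over cells, assuming pred[i] corresponds to gold[i].

    This index-based pairing is an assumption made for illustration;
    it is not taken from the paper.
    """
    if len(pred) != len(gold):
        raise ValueError("pred and gold must contain the same number of cells")
    if not pred:
        return 0.0
    return sum(box_iou(p, g) for p, g in zip(pred, gold)) / len(pred)
```

For example, `box_iou((0, 0, 2, 2), (1, 1, 3, 3))` has an intersection of 1 and a union of 7, so the score is 1/7; identical boxes score 1.0 and disjoint boxes score 0.0.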