A Benchmark of State-Space Models vs. Transformers and BiLSTM-based Models for Historical Newspaper OCR
arXiv cs.CV / 4/2/2026
Key Points
- The paper benchmarks linear-time State-Space Models (SSMs) using Mamba-based OCR architectures against Transformer- and BiLSTM-based recognizers for historical newspaper transcription, addressing long-sequence and degraded-layout challenges.
- It introduces (to the authors’ knowledge) the first OCR architecture based on SSMs, pairing a CNN visual encoder with bidirectional and autoregressive Mamba sequence modeling and evaluating multiple decoding strategies (CTC, autoregressive, non-autoregressive).
- Experiments on a newly released Luxembourg newspaper dataset with >99% verified gold-standard transcriptions, plus cross-dataset tests on Fraktur/Antiqua, show all neural systems reaching roughly 2% CER, so computational efficiency becomes the key differentiator.
- Mamba-based models remain competitive in accuracy while cutting inference time roughly in half and scaling memory more favorably; even at paragraph level under severe degradation they stay close to the Transformer-based DAN (6.07% vs. 5.24% CER).
- The authors release code, trained models, and standardized evaluation protocols to support reproducible, large-scale cultural heritage OCR development.
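The accuracy figures above are character error rates (CER): the character-level Levenshtein edit distance between the predicted and reference transcriptions, normalized by the reference length. A minimal sketch of the metric (the function names here are illustrative, not from the paper's released code):

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance over characters,
    # keeping only the previous row to stay O(min memory).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def cer(prediction: str, reference: str) -> float:
    # Character error rate: edit distance normalized by reference length.
    # A CER of ~0.02 corresponds to the ~2% figure reported above.
    return levenshtein(prediction, reference) / len(reference)
```

For example, `cer("Zeitnng", "Zeitung")` is 1/7 ≈ 0.143, i.e. one substitution in a seven-character reference.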