Zero-Shot Synthetic-to-Real Handwritten Text Recognition via Task Analogies

arXiv cs.CV / 4/14/2026

Key Points

  • The paper addresses a fully zero-shot synthetic-to-real handwritten text recognition problem, aiming to work on target languages without using any real target-domain handwriting data.
  • It learns how model parameters should change from synthetic to real handwriting across one or more source languages, then transfers these “corrections” to new target languages.
  • When multiple source languages are used, the method weights each source’s contribution based on linguistic similarity to better guide the transfer.
  • Experiments across five languages and six model architectures show consistent gains over synthetic-only baselines, and the transferred corrections help even target languages that are linguistically unrelated to the sources.
  • The contribution is primarily a research method for robust HTR generalization that reduces or eliminates the need for costly target-domain real data adaptation.

Abstract

Handwritten Text Recognition (HTR) models trained on synthetic handwriting often struggle to generalize to real text, and existing adaptation methods still require real samples from the target domain. In this work, we tackle the fully zero-shot synthetic-to-real generalization setting, where no real data from the target language is available. Our approach learns how model parameters change when moving from synthetic to real handwriting in one or more source languages and transfers this learned correction to new target languages. When using multiple sources, we rely on linguistic similarity to weight their contributions when combining them. Experiments across five languages and six architectures show consistent improvements over synthetic-only baselines and reveal that the transferred corrections benefit even languages unrelated to the sources.
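
The abstract does not give the exact update rule, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that the "correction" for a source language is the plain parameter difference between a model trained on real data and one trained on synthetic data, that all models share one architecture and initialization so the differences are comparable, and that similarity scores are non-negative numbers normalized to sum to 1. The names `correction`, `normalize`, `transfer`, the `scale` factor, and the language keys `src_a` and `src_b` are hypothetical and do not come from the paper.

```python
from typing import Dict, List

# Parameter name -> flattened weights. A toy stand-in for a real model checkpoint.
Params = Dict[str, List[float]]


def correction(synthetic: Params, real: Params) -> Params:
    """Per-parameter difference (real - synthetic) for one source language."""
    return {
        name: [r - s for r, s in zip(real[name], synthetic[name])]
        for name in synthetic
    }


def normalize(similarities: Dict[str, float]) -> Dict[str, float]:
    """Scale non-negative similarity scores so that they sum to 1 (assumed scheme)."""
    total = sum(similarities.values())
    if total <= 0:
        raise ValueError("similarity scores must have a positive sum")
    return {lang: s / total for lang, s in similarities.items()}


def transfer(target_synthetic: Params,
             corrections: Dict[str, Params],
             similarities: Dict[str, float],
             scale: float = 1.0) -> Params:
    """Add the similarity-weighted sum of source corrections to the target model."""
    weights = normalize({lang: similarities[lang] for lang in corrections})
    adapted: Params = {}
    for name, values in target_synthetic.items():
        delta = [0.0] * len(values)
        for lang, corr in corrections.items():
            for i, d in enumerate(corr[name]):
                delta[i] += weights[lang] * d
        adapted[name] = [v + scale * d for v, d in zip(values, delta)]
    return adapted


# Toy example with two hypothetical source languages and one parameter tensor.
corrections = {
    "src_a": correction({"w": [1.0, 2.0]}, {"w": [2.0, 4.0]}),
    "src_b": correction({"w": [0.0, 0.0]}, {"w": [4.0, -4.0]}),
}
similarity_to_target = {"src_a": 3.0, "src_b": 1.0}  # illustrative scores only
adapted = transfer({"w": [10.0, 20.0]}, corrections, similarity_to_target)
print(adapted)  # {'w': [11.75, 20.5]}
```

In this toy run the more similar source (`src_a`, weight 0.75) dominates the combined correction. In practice the same arithmetic would be applied tensor by tensor to full model checkpoints, and the actual weighting and combination used in the paper may differ from this sketch.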