Update: I fine-tuned Qwen3.5-0.8B for OCR and it outperforms my previous 2B release [GGUF]

Reddit r/LocalLLaMA / 4/14/2026


Key Points

  • The author has released an updated OCR fine-tune of Qwen3.5, replacing the earlier 2B release with a Qwen3.5-0.8B version that reportedly performs better on English archival and document OCR tasks.
  • The model is trained for markdown-first OCR output, including structured HTML tables, LaTeX for formulas, image tags for figures, and specialized chart extraction syntax.
  • The update emphasizes improved formatting and layout handling, including stronger preservation of reading order and support for more complex document layouts.
  • A public Hugging Face model link is provided, and the author plans to release additional language versions soon, including Arabic and broader RTL document OCR support.
  • Community feedback is requested, particularly for testing on messy scans and edge cases to validate robustness.

Hey everyone,

A while ago I shared my fine-tuned Qwen3.5-2B OCR model. Since then I kept working on the pipeline and just released a new version based on Qwen3.5-0.8B.

This one uses improved training samples and better output formatting, and it outperforms my previous 2B release on English archival and document OCR tasks.

It’s trained for markdown-first OCR output with HTML tables, LaTeX for formulas, [image] tags for figures/images, and [chart: ...] extraction for chart content. It also does a better job preserving reading order and more complex layouts.
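Since the transcript mixes markdown with structured tags, downstream code will usually want to pull those elements apart. Here is a minimal sketch of one way to do that with the standard library; the sample text and the exact tag spellings (`[image]`, `[chart: ...]`, inline `<table>` HTML) are assumptions based on the format described above, not output from the actual model:

```python
import re

# Hypothetical sample of a markdown-first OCR transcript, illustrating
# the tag formats described in the post (not real model output).
sample = """# Annual Report

Revenue grew as $r = p \\cdot q$ over the period.

[image]

[chart: bar chart of quarterly revenue, Q1-Q4]

<table><tr><th>Quarter</th><th>Revenue</th></tr>
<tr><td>Q1</td><td>10</td></tr></table>
"""

def extract_ocr_elements(text):
    """Split the structured tags out of a markdown-first OCR transcript."""
    charts = re.findall(r"\[chart:\s*(.*?)\]", text)          # chart descriptions
    image_count = len(re.findall(r"\[image\]", text))         # figure placeholders
    tables = re.findall(r"<table>.*?</table>", text, flags=re.DOTALL)
    return {"charts": charts, "image_count": image_count, "tables": tables}

elements = extract_ocr_elements(sample)
print(elements["charts"])       # extracted chart descriptions
print(elements["image_count"])  # number of figure placeholders
print(len(elements["tables"]))  # number of HTML tables found
```

A regex pass like this is only a starting point; nested tables or brackets inside a chart description would need a proper HTML parser or a small grammar instead.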

Model link: loay/English-Document-OCR-Qwen3.5-0.8B

I’m planning to release versions for other languages soon as well, including Arabic and broader RTL document OCR support.

If you test it on messy scans or edge cases, I’d love to hear how it performs.

submitted by /u/Other-Confusion2974