AtlasOCR: Building the First Open-Source Darija OCR Model with Vision Language Models
arXiv cs.CV / 4/10/2026
Key Points
- The paper presents AtlasOCR, described as the first open-source OCR model tailored specifically for Darija (Moroccan Arabic), built by fine-tuning a 3B-parameter Vision Language Model.
- It details a data pipeline combining Darija-specific dataset curation with synthetic text generation (via the authors’ OCRSmith library) plus carefully sourced real-world samples.
- The authors fine-tune Qwen2.5-VL 3B with QLoRA, a parameter-efficient fine-tuning method, using Unsloth, and run ablation studies to tune the training hyperparameters.
- AtlasOCR is evaluated on a new benchmark (AtlasOCRBench) and the established KITAB-Bench, where it reportedly achieves state-of-the-art results and demonstrates strong generalization across Darija and standard Arabic OCR tasks.
- The work positions the model as competitive with larger OCR systems, emphasizing robustness and transferability rather than relying solely on scale.
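
To give a rough sense of why QLoRA makes fine-tuning a 3B-parameter model cheap, here is a minimal back-of-the-envelope sketch. It is not the authors' code: the layer dimensions and the rank are hypothetical values chosen for illustration, and the memory estimate covers only the stored base weights (it ignores quantization constants, activations, and optimizer state). It relies on two general facts about the technique: LoRA trains a low-rank update B·A instead of the full weight matrix, and QLoRA keeps the frozen base weights in 4-bit precision.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the low-rank update B @ A, where B is (d_out, rank) and A is (rank, d_in)."""
    return rank * (d_out + d_in)


def weight_memory_gb(n_params: int, bits_per_param: int) -> float:
    """Approximate storage for the weights alone, in gigabytes (10**9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


# Hypothetical layer shape and rank, chosen only for illustration.
d_out, d_in, rank = 2048, 2048, 16
full = d_out * d_in
lora = lora_trainable_params(d_out, d_in, rank)
print(f"full matrix: {full:,} params, LoRA update: {lora:,} params ({lora / full:.2%})")

# Nominal 3B-parameter base model, as stated in the summary.
n_base = 3_000_000_000
print(f"16-bit base weights: {weight_memory_gb(n_base, 16):.1f} GB")
print(f"4-bit base weights:  {weight_memory_gb(n_base, 4):.1f} GB")
```

Under these assumed numbers, the trainable update is under 2% of the size of the matrix it adapts, and storing the frozen base weights in 4 bits takes a quarter of the space needed at 16 bits.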



