Fanar 2.0: Arabic Generative AI Stack
arXiv cs.CL / 3/18/2026
Key Points
- Fanar 2.0 is the second generation of Qatar's Arabic-centric Generative AI platform, designed and operated entirely in-house at QCRI with sovereignty as a core principle.
- It runs on 256 NVIDIA H100 GPUs and uses a data-quality-first strategy with targeted continual pre-training and model merging to achieve gains while using 8x fewer pre-training tokens than Fanar 1.0.
- The core Fanar-27B model is continually pre-trained from a Gemma-3-27B backbone on a curated corpus of 120 billion high-quality tokens across three data recipes. It improves benchmark scores by 9.1 points on Arabic knowledge, 7.3 points on language, 3.5 points on dialects, and 7.6 points on English capability.
- The Fanar 2.0 stack adds several capabilities: FanarGuard moderation, Aura long-form ASR, Oryx Arabic-aware image/video understanding and generation, an agentic tool-calling framework for multi-step workflows, Fanar-Sadiq for Islamic content, Fanar-Diwan for classical Arabic poetry generation, FanarShaheen bilingual translation, and a redesigned multi-layer orchestrator for intent-aware routing and safety validation. Taken together, the authors present these as evidence that sovereign, resource-constrained AI development can rival larger-scale systems.
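
The key points mention model merging but do not say which merging method Fanar 2.0 uses. As a rough illustration only, the sketch below shows one common form, a weighted average of checkpoint parameters. The function name `merge_checkpoints`, the toy parameter names, and the idea of merging one checkpoint per data recipe are all assumptions made for this example, not details taken from the paper.

```python
from typing import Dict, List

# A toy "checkpoint": parameter name -> flat list of weights.
# Real checkpoints hold tensors; plain lists keep the sketch dependency-free.
Weights = Dict[str, List[float]]


def merge_checkpoints(checkpoints: List[Weights], coeffs: List[float]) -> Weights:
    """Hypothetical helper: weighted average of same-shaped checkpoints.

    This is linear weight averaging, one common merging technique. It is
    not claimed to be the method used for Fanar-27B.
    """
    if not checkpoints or len(checkpoints) != len(coeffs):
        raise ValueError("need one coefficient per checkpoint")
    total = sum(coeffs)
    if total <= 0:
        raise ValueError("coefficients must sum to a positive value")
    # Normalize so the merged weights stay on the same scale as the inputs.
    norm = [c / total for c in coeffs]

    merged: Weights = {}
    for name, reference in checkpoints[0].items():
        acc = [0.0] * len(reference)
        for weight, ckpt in zip(norm, checkpoints):
            if name not in ckpt or len(ckpt[name]) != len(reference):
                raise ValueError(f"parameter mismatch for {name!r}")
            for i, value in enumerate(ckpt[name]):
                acc[i] += weight * value
        merged[name] = acc
    return merged


if __name__ == "__main__":
    # Assumed scenario: one toy checkpoint per data recipe, merged equally.
    recipe_a = {"layer.weight": [1.0, 2.0]}
    recipe_b = {"layer.weight": [3.0, 4.0]}
    print(merge_checkpoints([recipe_a, recipe_b], [1.0, 1.0]))
```

Unequal coefficients would let one checkpoint dominate the merge, which is the usual knob for trading off, say, dialect coverage against English capability in schemes of this kind.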