QU-NLP at QIAS 2026: Multi-Stage QLoRA Fine-Tuning for Arabic Islamic Inheritance Reasoning

arXiv cs.CL / April 21, 2026


Key Points

  • QU-NLP is the submitted system for the QIAS 2026 shared task on Arabic Islamic inheritance (ilm al-mawarith) reasoning, targeting structured multi-step legal analysis and fractional calculations.
  • The method uses a multi-stage fine-tuning pipeline on Qwen3-4B: domain adaptation on 3,166 fatwa records followed by task-specific training on 12,000 structured inheritance cases to generate JSON-formatted answers.
  • Training leverages 4-bit NF4 quantization with rank-128 QLoRA adapters, aiming to reduce compute while maintaining reasoning quality.
  • The model reportedly reaches a 90% MIR-E score on the test set and is described as competitive with commercial systems such as Gemini-2.5-flash.
  • The work suggests that domain-specific pre-adaptation combined with structured-output training can make small language models effective for complex legal reasoning tasks.
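
The quantization and adapter setup described in the bullets above can be sketched with the Hugging Face stack (transformers, peft, bitsandbytes). This is a minimal illustration, not the authors' code: the NF4 quantization type and rank-128 adapters come from the paper, while the alpha, dropout, target modules, and other hyperparameters are assumptions.

```python
# Sketch of 4-bit NF4 quantization with rank-128 QLoRA adapters on Qwen3-4B.
# Only the quant type and rank are taken from the paper; everything else is
# an illustrative assumption.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NF4 quantization, as in the paper
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,         # assumed
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B",
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=128,                                  # rank-128 adapters, as reported
    lora_alpha=256,                         # assumed; not stated in the summary
    lora_dropout=0.05,                      # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed targets
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

In this configuration only the low-rank adapter weights are trained while the 4-bit base model stays frozen, which is what keeps the compute footprint small.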

Abstract

Islamic inheritance law (ilm al-mawarith) presents a challenging domain for evaluating large language models' structured reasoning capabilities, requiring multi-step legal analysis, rule-based blocking decisions, and precise fractional calculations. We present QU-NLP's submission to the QIAS 2026 shared task on Arabic Islamic inheritance reasoning. Our approach employs a multi-stage Quantized Low-Rank Adaptation (QLoRA) fine-tuning strategy on Qwen3-4B: (1) domain adaptation on 3,166 Islamic fatwa records to acquire inheritance terminology and jurisprudential reasoning patterns, followed by (2) task-specific training on 12,000 structured inheritance cases to optimize JSON-formatted output generation. Using 4-bit NF4 quantization with rank-128 LoRA adapters, our model achieves a 90% MIR-E (Mawarith Inheritance Reasoning Evaluation) score on the test set, demonstrating competitive performance while requiring minimal computational resources. Our results show that domain-specific pre-adaptation combined with structured output training enables small language models to perform complex legal reasoning tasks effectively, performing comparably to commercial systems such as Gemini-2.5-flash.
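
To give a sense of the fractional calculations involved (this worked example is ours, not the paper's), consider a standard case: a deceased survived by a wife, a mother, and a son. With descendants present, the wife's fixed share is 1/8 and the mother's is 1/6; the son, as residuary heir, takes the remainder. Exact rational arithmetic avoids floating-point error:

```python
# Hypothetical worked inheritance (mawarith) example using exact rational
# arithmetic; shares follow the standard fixed-share rules for this case.
from fractions import Fraction

estate = Fraction(1)             # the whole estate
wife = Fraction(1, 8)            # fixed share: wife, with descendants present
mother = Fraction(1, 6)          # fixed share: mother, with descendants present
son = estate - wife - mother     # residuary heir takes the remainder

print(son)                       # 17/24
assert wife + mother + son == estate
```

The residue works out to 1 - 3/24 - 4/24 = 17/24, and evaluating chains of exact fractions like this, alongside blocking rules, is precisely what the MIR-E benchmark tests.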