An End-to-End Ukrainian RAG for Local Deployment. Optimized Hybrid Search and Lightweight Generation

arXiv cs.CL / 4/27/2026


Key Points

  • The paper introduces an end-to-end, locally deployable Ukrainian Retrieval-Augmented Generation (RAG) system for document question answering that secured 2nd place in the UNLP 2026 Shared Task.
  • It uses a custom two-stage hybrid search pipeline to retrieve relevant document pages, then generates grounded answers using a Ukrainian language model fine-tuned on synthetic data.
  • The authors compress the model to enable lightweight deployment, aiming to maintain answer quality while reducing compute requirements.
  • Experiments under strict computational limits show that verifiable, high-quality AI QA can run locally on resource-constrained hardware without sacrificing accuracy.
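
As a rough illustration of what a two-stage hybrid search can look like, the Python sketch below fuses a lexical ranking and a dense (embedding-based) ranking with reciprocal rank fusion, then reranks the fused candidates with a second scorer. The summary does not describe the authors' actual pipeline, so the fusion method, the function names `reciprocal_rank_fusion` and `two_stage_search`, the parameters `k`, `pool`, and `top_n`, and the `rerank_score` callback are all assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of page ids into one list, best first.

    Each page gets 1 / (k + rank) from every list it appears in.
    This is an assumed fusion method, not one confirmed by the paper.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, page_id in enumerate(ranking, start=1):
            scores[page_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by page id for determinism.
    return sorted(scores, key=lambda page_id: (-scores[page_id], page_id))


def two_stage_search(lexical_ranking, dense_ranking, rerank_score,
                     top_n=3, pool=10):
    """Hypothetical two-stage retrieval over document pages.

    Stage 1: fuse the lexical and dense rankings into a candidate pool.
    Stage 2: reorder the pool with a (typically costlier) scoring function.
    """
    candidates = reciprocal_rank_fusion([lexical_ranking, dense_ranking])[:pool]
    reranked = sorted(candidates, key=lambda page_id: (-rerank_score(page_id), page_id))
    return reranked[:top_n]
```

In this sketch, `lexical_ranking` and `dense_ranking` are lists of page ids already ordered by a keyword scorer and an embedding scorer, and `rerank_score` maps a page id to a relevance score for the current question. Keeping the second stage limited to a small pool is what makes a heavier reranker affordable on constrained hardware.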

Abstract

This paper presents a highly efficient Retrieval-Augmented Generation (RAG) system built specifically for Ukrainian document question answering, which achieved 2nd place in the UNLP 2026 Shared Task. Our solution features a custom two-stage search pipeline that retrieves relevant document pages, paired with a specialized Ukrainian language model fine-tuned on synthetic data to generate accurate, grounded answers. Finally, we compress the model for lightweight deployment. Evaluated under strict computational limits, our architecture demonstrates that high-quality, verifiable AI question answering can be achieved locally on resource-constrained hardware without sacrificing accuracy.
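
The abstract does not say which compression method was used. One common option for shrinking a language model is weight quantization, and the Python sketch below shows symmetric per-tensor int8 quantization on a plain list of floats to make the idea concrete. The choice of technique and the function names `quantize_int8` and `dequantize` are assumptions for illustration, not the authors' implementation.

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization (assumed technique).

    Returns integer codes in [-127, 127] and the scale needed to
    map them back to approximate float values.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Avoid division by zero when all weights are zero or the list is empty.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from int8 codes."""
    return [c * scale for c in codes]
```

Each weight is stored in one byte instead of four (for float32), at the cost of a rounding error of at most half the scale per weight. Whether that error is acceptable for answer quality is an empirical question that has to be checked on the task itself.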