Syntax as a Rosetta Stone: Universal Dependencies for In-Context Coptic Translation

arXiv cs.CL / 4/22/2026


Key Points

  • The paper introduces an in-context learning method for low-resource machine translation from Coptic to English, emphasizing that low-resource settings need different strategies than high-resource ones.
  • It augments model inputs with Universal Dependencies (UD) parses, experimenting with raw parser outputs, plain-English verbalizations of parses, and targeted instructions for difficult constructions.
  • The study finds that syntax alone is less effective than dictionary-based glosses, but combining retrieved dictionary items with syntactic information produces substantial improvements.
  • The proposed approach achieves new state-of-the-art translation results for Coptic, with gains observed across multiple model sizes.
  • Overall, the work positions UD-based syntactic augmentation as a practical way to improve translation quality when direct supervision or large parallel corpora are limited.
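
As a rough illustration of what a "plain-English verbalization" of a UD parse could look like, the sketch below reads a CoNLL-U block and emits one sentence per dependency. The sentence templates, the `Token` type, and the function names are assumptions made for this example, not the paper's actual format, and the toy parse is English rather than Coptic so that the output is easy to check.

```python
from typing import NamedTuple


class Token(NamedTuple):
    idx: int      # CoNLL-U ID column
    form: str     # surface form
    upos: str     # universal part-of-speech tag
    head: int     # ID of the syntactic head, 0 for the root
    deprel: str   # UD dependency relation


def parse_conllu(block: str) -> list[Token]:
    """Read one CoNLL-U sentence (10 tab-separated columns per token line)."""
    tokens = []
    for line in block.strip().splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. 1-2), empty nodes (e.g. 1.1),
        # and lines without a numeric head.
        if len(cols) != 10 or not cols[0].isdigit() or not cols[6].isdigit():
            continue
        tokens.append(Token(int(cols[0]), cols[1], cols[3], int(cols[6]), cols[7]))
    return tokens


def verbalize(tokens: list[Token]) -> list[str]:
    """Turn each dependency into a plain-English statement (hypothetical template)."""
    by_id = {t.idx: t for t in tokens}
    sentences = []
    for t in tokens:
        if t.head == 0:
            sentences.append(f'"{t.form}" ({t.upos}) is the root of the sentence.')
        else:
            head = by_id[t.head]
            sentences.append(
                f'"{t.form}" ({t.upos}) attaches to "{head.form}" with relation {t.deprel}.'
            )
    return sentences


# Toy English parse, built with explicit tabs so the columns survive copy-paste.
EXAMPLE = "\n".join("\t".join(row) for row in [
    ("1", "the", "the", "DET", "_", "_", "2", "det", "_", "_"),
    ("2", "dog", "dog", "NOUN", "_", "_", "3", "nsubj", "_", "_"),
    ("3", "sees", "see", "VERB", "_", "_", "0", "root", "_", "_"),
    ("4", "a", "a", "DET", "_", "_", "5", "det", "_", "_"),
    ("5", "cat", "cat", "NOUN", "_", "_", "3", "obj", "_", "_"),
])

if __name__ == "__main__":
    for sentence in verbalize(parse_conllu(EXAMPLE)):
        print(sentence)
```

The raw-parser-output variant mentioned above would correspond to passing the CoNLL-U block itself into the prompt unchanged; the verbalized variant trades compactness for text that is closer to what a language model typically sees.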

Abstract

Low-resource machine translation requires methods that differ from those used for high-resource languages. This paper proposes a novel in-context learning approach to support low-resource machine translation from Coptic to English, with syntactic augmentation from Universal Dependencies parses of input sentences. Building on existing work that uses bilingual dictionaries to support inference for vocabulary items, we add several representations of syntactic analyses to our inputs, specifically exploring the inclusion of raw parser outputs, verbalizations of parses in plain English, and targeted instructions on difficult constructions identified in sub-trees and how they can be translated. Our results show that while syntactic information alone is not as useful as dictionary-based glosses, combining retrieved dictionary items with syntactic information yields significant gains across model sizes and sets new state-of-the-art translation results for Coptic.
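
To make the combination of dictionary items and syntactic information more concrete, here is a minimal sketch of how such an augmented prompt might be assembled. The section headings, the `build_prompt` function, and its parameters are hypothetical choices for illustration; the abstract does not specify the prompt template, and the inputs shown are placeholders rather than real Coptic data.

```python
from typing import Mapping, Sequence


def build_prompt(
    source: str,
    glosses: Mapping[str, str],
    syntax_notes: Sequence[str] = (),
    construction_hints: Sequence[str] = (),
) -> str:
    """Assemble a translation prompt from optional augmentation sections.

    All section labels below are assumptions made for this sketch.
    """
    parts = [
        "Translate the following Coptic sentence into English.",
        f"Sentence: {source}",
    ]
    if glosses:
        # Retrieved bilingual dictionary entries for words in the sentence.
        parts.append("Dictionary entries:")
        parts.extend(f"- {form}: {gloss}" for form, gloss in glosses.items())
    if syntax_notes:
        # Raw parser output lines or plain-English verbalizations of the parse.
        parts.append("Syntactic analysis:")
        parts.extend(f"- {note}" for note in syntax_notes)
    if construction_hints:
        # Targeted instructions for constructions flagged as difficult.
        parts.append("Notes on difficult constructions:")
        parts.extend(f"- {hint}" for hint in construction_hints)
    parts.append("English translation:")
    return "\n".join(parts)


if __name__ == "__main__":
    # Placeholder inputs only; substitute real sentence, glosses, and parse notes.
    print(build_prompt(
        source="<source sentence>",
        glosses={"<word 1>": "<gloss 1>", "<word 2>": "<gloss 2>"},
        syntax_notes=['"<word 1>" is the subject of "<word 2>".'],
    ))
```

Keeping each augmentation in its own optional section makes it straightforward to run the kind of comparison the abstract describes: dictionary only, syntax only, or both together.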