Open Machine Translation for Esperanto

arXiv cs.CL / 4/1/2026


Key Points

  • The paper provides what it describes as the first comprehensive evaluation of open-source machine translation systems for Esperanto, comparing rule-based approaches, encoder-decoder models, and LLMs across different model sizes.
  • It evaluates translation quality across six directions involving English, Spanish, Catalan, and Esperanto using both automatic metrics and human judgments.
  • Results indicate the NLLB model family delivers the best overall performance across language pairs, with compact trained models and a fine-tuned general-purpose LLM close behind.
  • Human evaluation largely agrees with the automatic metrics, showing NLLB preferred in roughly half of pairwise comparisons, while still exhibiting noticeable translation errors.
  • The authors release code and the best-performing models publicly, supporting further open and collaborative research on Esperanto MT.
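The "preferred in roughly half of pairwise comparisons" figure comes from aggregating human judgments over system pairs. As a minimal illustrative sketch (the paper's exact protocol and system names are not specified here; the judgment format and labels below are assumptions), such judgments can be turned into per-system preference shares like this:

```python
from collections import Counter

def preference_shares(judgments):
    """Given pairwise human judgments as (winner, loser) system-name
    tuples, return each system's share of wins among the comparisons
    it took part in."""
    wins = Counter()
    appearances = Counter()
    for winner, loser in judgments:
        wins[winner] += 1
        appearances[winner] += 1
        appearances[loser] += 1
    return {system: wins[system] / appearances[system] for system in appearances}

# Hypothetical judgments; system labels are illustrative, not the paper's data.
judgments = [
    ("nllb", "compact"), ("nllb", "llm"), ("compact", "nllb"),
    ("llm", "compact"), ("nllb", "compact"), ("compact", "llm"),
]
shares = preference_shares(judgments)
# e.g. shares["nllb"] here is 0.75: preferred in 3 of the 4 comparisons it appears in
```

A share near 0.5 for the top system, as reported for NLLB, means it wins about half of the head-to-head comparisons it is involved in rather than dominating all of them.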

Abstract

Esperanto is a widespread constructed language, known for its regular grammar and productive word formation. Despite having substantial resources available thanks to its online community, it remains relatively underexplored in the context of modern machine translation (MT) approaches. In this work, we present the first comprehensive evaluation of open-source MT systems for Esperanto, comparing rule-based systems, encoder-decoder models, and LLMs across model sizes. We evaluate translation quality across six language directions involving English, Spanish, Catalan, and Esperanto using multiple automatic metrics as well as human evaluation. Our results show that the NLLB family achieves the best performance in all language pairs, followed closely by our trained compact models and a fine-tuned general-purpose LLM. Human evaluation confirms this trend, with NLLB translations preferred in approximately half of the comparisons, although noticeable errors remain. In line with Esperanto's tradition of openness and international collaboration, we release our code and best-performing models publicly.