Left Behind: Cross-Lingual Transfer as a Bridge for Low-Resource Languages in Large Language Models

arXiv cs.CL / 3/24/2026


Key Points

  • The paper benchmarks eight large language models in English, Kazakh, and Mongolian across five experimental conditions, using 50 hand-crafted questions, and evaluates 2,000 responses on accuracy, fluency, and completeness.
  • Results show a consistent 13.8–16.7 percentage point gap relative to English: models often preserve surface-level fluency in the low-resource languages but generate substantially less accurate outputs.
  • Cross-lingual transfer prompting (reason in English, then translate back) yields selective improvements for bilingual architectures (+2.2pp to +4.3pp) but offers no benefit for English-dominant models.
  • The study concludes that current LLMs systematically under-serve low-resource language communities and that mitigation effectiveness depends on model architecture rather than a single universal prompting strategy.

Abstract

We investigate how large language models perform on low-resource languages by benchmarking eight LLMs across five experimental conditions in English, Kazakh, and Mongolian. Using 50 hand-crafted questions spanning factual, reasoning, technical, and culturally grounded categories, we evaluate 2,000 responses (8 models × 5 conditions × 50 questions) on accuracy, fluency, and completeness. We find a consistent performance gap of 13.8–16.7 percentage points between English and the low-resource-language conditions, with models maintaining surface-level fluency while producing significantly less accurate content. Cross-lingual transfer (prompting models to reason in English before translating back) yields selective gains for bilingual architectures (+2.2pp to +4.3pp) but provides no benefit to English-dominant models. Our results demonstrate that current LLMs systematically underserve low-resource language communities, and that effective mitigation strategies are architecture-dependent rather than universal.
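
To make the cross-lingual transfer condition concrete, the sketch below shows the two-step prompt flow the paper describes: reason in English first, then translate the answer back into the target language. This is a minimal illustration under stated assumptions, not the paper's released code: the generate callable stands in for whatever LLM API is actually used, and the prompt wording is hypothetical.

```python
# A minimal sketch of cross-lingual transfer prompting, assuming a generic
# `generate(prompt) -> str` LLM call. Prompt text and function names are
# illustrative placeholders, not the paper's implementation.

from typing import Callable


def cross_lingual_transfer(
    question: str,
    language: str,
    generate: Callable[[str], str],
) -> str:
    """Answer a low-resource-language question by reasoning in English first."""
    # Step 1: ask the model to restate the question in English and reason there,
    # where its training data is densest.
    english_reasoning = generate(
        f"The following question is written in {language}:\n\n{question}\n\n"
        "Translate it to English, then answer it in English, "
        "showing your reasoning step by step."
    )
    # Step 2: translate the English answer back into the original language,
    # so the final response is usable by the end user.
    return generate(
        f"Translate the following answer into {language}, preserving all "
        f"facts and reasoning:\n\n{english_reasoning}"
    )


# Example usage with a stubbed model call (replace with a real LLM client):
if __name__ == "__main__":
    def fake_generate(prompt: str) -> str:
        return f"[model output for a prompt of {len(prompt)} characters]"

    print(
        cross_lingual_transfer(
            "Монгол Улсын нийслэл хаана вэ?", "Mongolian", fake_generate
        )
    )
```

Per the paper's findings, whether this two-step strategy helps depends on the model: bilingual architectures gain a few percentage points, while English-dominant models see no benefit.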