Budget-Xfer: Budget-Constrained Source Language Selection for Cross-Lingual Transfer to African Languages
arXiv cs.CL / 3/31/2026
Key Points
- The paper introduces Budget-Xfer, a framework for selecting multiple source languages and allocating a fixed annotation budget for cross-lingual transfer to low-resource African languages.
- By modeling source selection as a budget-constrained resource allocation problem, the study aims to disentangle language-selection effects from the confounding impact of total training data.
- Experiments on named entity recognition (NER) and sentiment analysis for Hausa, Yoruba, and Swahili (288 runs across two multilingual models) show that multi-source transfer substantially outperforms single-source transfer, with Cohen’s d ranging from 0.80 to 1.98.
- The authors find that, among multi-source allocation strategies, performance differences are generally modest and statistically non-significant.
- They also report that the value of embedding similarity as a selection proxy is task-dependent: for NER, random source selection actually performs better, and for sentiment analysis, similarity-based selection offers no clear advantage.
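The multi- vs. single-source comparison above is reported via Cohen’s d, a standardized effect size: the difference in group means divided by the pooled standard deviation. As a minimal illustration (the helper and the toy F1 scores below are hypothetical, not from the paper), d can be computed as:

```python
from statistics import mean, stdev

def cohens_d(a, b):
    """Cohen's d with pooled standard deviation.

    Hypothetical helper for illustration; the paper reports
    d in [0.80, 1.98] for multi- vs. single-source transfer.
    """
    na, nb = len(a), len(b)
    # Pooled (sample) standard deviation over both groups
    pooled = (((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2)
              / (na + nb - 2)) ** 0.5
    return (mean(a) - mean(b)) / pooled

# Toy F1 scores for multi-source vs. single-source runs (made up):
multi = [78.0, 79.5, 80.1, 77.8]
single = [74.2, 75.0, 73.9, 74.8]
print(round(cohens_d(multi, single), 2))
```

By the usual rule of thumb, d ≈ 0.8 is already a large effect, so the reported 0.80–1.98 range indicates a consistently strong advantage for multi-source transfer.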