Language on Demand, Knowledge at Core: Composing LLMs with Encoder-Decoder Translation Models for Extensible Multilinguality
arXiv cs.CL / 4/1/2026
Key Points
- The paper argues that current LLMs hold substantial cross-lingual knowledge in a shared semantic space, but reliably using it for low-resource or unseen languages is still a major weakness.
- It proposes XBridge, a compositional encoder–LLM–decoder architecture that delegates multilingual understanding and generation to pretrained translation models while keeping the LLM as an English-centric reasoning core (a minimal structural sketch follows this list).
- To fix the representation misalignment between the LLM and the translation models, XBridge adds lightweight cross-model mapping layers plus an optimal-transport-based alignment objective that enforces semantic consistency (sketched after this list).
- Experiments across four LLMs on multiple tasks (multilingual understanding, reasoning, summarization, and generation) show that XBridge improves over strong baselines, with the largest gains on low-resource and previously unseen languages, all without retraining the LLM.
- The work suggests a scalable pathway for extending LLM multilinguality by composing them with translation systems rather than treating multilingual capability as a monolithic model property.
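To make the composition concrete, here is a minimal, hypothetical sketch of the encoder → mapper → LLM → mapper → decoder pipeline described above, written in PyTorch style. The class and method names (CrossModelMapper, ComposedMultilingualModel, etc.) are illustrative assumptions, not the paper's actual API; the frozen components are treated as generic callables that return hidden states.

```python
import torch
import torch.nn as nn

class CrossModelMapper(nn.Module):
    """Lightweight projection from one model's hidden size to another's."""
    def __init__(self, src_dim: int, tgt_dim: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(src_dim, tgt_dim),
            nn.GELU(),
            nn.Linear(tgt_dim, tgt_dim),
        )

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        return self.proj(hidden)

class ComposedMultilingualModel(nn.Module):
    """Compose a frozen multilingual encoder, a frozen English-centric LLM,
    and a frozen multilingual decoder; only the two mappers are trained."""
    def __init__(self, mt_encoder, llm, mt_decoder,
                 enc_dim: int, llm_dim: int, dec_dim: int):
        super().__init__()
        self.mt_encoder = mt_encoder      # pretrained translation encoder (frozen)
        self.llm = llm                    # English-centric reasoning core (frozen)
        self.mt_decoder = mt_decoder      # pretrained translation decoder (frozen)
        self.in_mapper = CrossModelMapper(enc_dim, llm_dim)
        self.out_mapper = CrossModelMapper(llm_dim, dec_dim)

    def forward(self, src_tokens, tgt_lang_id):
        # 1. Encode the (possibly low-resource) source-language input.
        enc_states = self.mt_encoder(src_tokens)          # [B, T, enc_dim]
        # 2. Map encoder states into the LLM's representation space.
        llm_inputs = self.in_mapper(enc_states)           # [B, T, llm_dim]
        # 3. Let the frozen LLM reason over the mapped representations.
        llm_states = self.llm(llm_inputs)                 # [B, T, llm_dim]
        # 4. Map back out and decode in the requested target language.
        dec_inputs = self.out_mapper(llm_states)          # [B, T, dec_dim]
        return self.mt_decoder(dec_inputs, tgt_lang_id)
```

Because only the mapping layers carry trainable parameters in this sketch, extending coverage to a new language would reduce to swapping or fine-tuning the translation components, which matches the "extensible multilinguality" framing of the paper.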
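The optimal-transport alignment objective can also be illustrated with a short sketch. The assumption here is a standard entropic-OT (Sinkhorn) cost between mapped encoder states and reference LLM states for a parallel English sentence; the paper's exact formulation, marginals, and weighting may differ.

```python
import torch

def sinkhorn_ot_loss(x: torch.Tensor, y: torch.Tensor,
                     epsilon: float = 0.1, n_iters: int = 50) -> torch.Tensor:
    """Entropic optimal-transport cost between two token-level representation sets.

    x: [n, d] mapped encoder states; y: [m, d] reference LLM states.
    """
    # Cost matrix: squared Euclidean distance between every pair of tokens.
    cost = torch.cdist(x, y, p=2) ** 2                     # [n, m]
    n, m = cost.shape
    # Uniform marginals over the two token sets.
    mu = torch.full((n,), 1.0 / n, device=x.device)
    nu = torch.full((m,), 1.0 / m, device=x.device)
    # Sinkhorn iterations in log space for numerical stability.
    log_K = -cost / epsilon                                # log of the Gibbs kernel
    log_u = torch.zeros(n, device=x.device)
    log_v = torch.zeros(m, device=x.device)
    for _ in range(n_iters):
        log_u = torch.log(mu) - torch.logsumexp(log_K + log_v[None, :], dim=1)
        log_v = torch.log(nu) - torch.logsumexp(log_K + log_u[:, None], dim=0)
    # Transport plan and its expected cost under the cost matrix.
    plan = torch.exp(log_u[:, None] + log_K + log_v[None, :])   # [n, m]
    return (plan * cost).sum()
```

Minimizing such a loss alongside the task objective would pull the mapped encoder representations toward the LLM's semantic space, which is the kind of cross-model consistency the key points describe; treating the uniform marginals and squared-Euclidean cost as defaults is an assumption of this sketch.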