ReasonXL: Shifting LLM Reasoning Language Without Sacrificing Performance
arXiv cs.CL / 4/15/2026
Key Points
- The paper highlights a persistent gap in multilingual LLMs: even for non-English tasks, models often generate reasoning traces in English, leaving a mismatch between the user's language and the language of the model's visible reasoning.
- It introduces ReasonXL, a large cross-domain parallel corpus of reasoning traces across five European languages (EN/DE/FR/IT/ES), with 2M+ aligned samples per language covering prompts, reasoning traces, and final outputs (a hypothetical record layout is sketched after this list).
- Using ReasonXL, the authors show that language-specific reasoning can be achieved via a two-stage pipeline of supervised fine-tuning (SFT) followed by reinforcement learning with verifiable rewards (RLVR), while matching or improving baseline task performance and incurring minimal general-knowledge degradation (see the reward sketch after this list).
- Representational analysis finds that early network layers form an "activation bottleneck" that causally governs the language identity of the reasoning trace, while later layers absorb most adaptation-driven changes (an activation-patching sketch follows below).
- RLVR can produce greater behavioral divergence from the base model than SFT despite smaller parameter updates, suggesting it reroutes representations toward target-language reasoning more efficiently (the final sketch below shows one way to measure both quantities).
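For concreteness, here is a minimal sketch of what one aligned ReasonXL record and a cross-language pairing helper might look like in Python. The field names and layout are illustrative assumptions, not the paper's published schema.

```python
from dataclasses import dataclass

# Hypothetical record layout for one ReasonXL sample. Field names are
# assumptions for illustration; the released corpus may differ.
@dataclass
class ReasonXLSample:
    sample_id: str        # shared across the five language versions
    language: str         # one of "en", "de", "fr", "it", "es"
    prompt: str           # task input, written in `language`
    reasoning_trace: str  # chain of thought, also in `language`
    final_answer: str     # verifiable final output

def aligned_pairs(samples: list[ReasonXLSample], lang_a: str, lang_b: str):
    """Yield the same sample's versions in two languages."""
    by_key = {(s.sample_id, s.language): s for s in samples}
    for (sid, lang), s in by_key.items():
        if lang == lang_a and (sid, lang_b) in by_key:
            yield s, by_key[(sid, lang_b)]
```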
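The RLVR stage depends on rewards that can be checked programmatically. The sketch below shows one plausible shape for such a reward: exact-match correctness on the final answer plus a bonus when the trace is in the target language. The 0.8/0.2 weighting, the exact-match check, and the use of the off-the-shelf langdetect package are assumptions, not the paper's recipe.

```python
from langdetect import detect  # any language-ID classifier would do here

def verifiable_reward(trace: str, final_answer: str,
                      gold_answer: str, target_lang: str) -> float:
    """RLVR-style reward: verifiable correctness plus a target-language
    bonus. The 0.8/0.2 split is an illustrative choice."""
    correct = float(final_answer.strip() == gold_answer.strip())
    in_target = float(detect(trace) == target_lang)
    return 0.8 * correct + 0.2 * in_target

# e.g. a correct answer with a French trace should score 1.0:
# verifiable_reward("Deux plus deux font quatre.", "4", "4", "fr")
```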
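Because the "activation bottleneck" claim is causal, the natural test is activation patching: cache an early layer's hidden states from a prompt in one language, inject them while decoding a prompt in another, and check whether the output language flips. Below is a minimal PyTorch sketch. The model name, the layer index, and the LLaMA-style module path model.model.layers are all assumptions; this is not the paper's code.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-3.1-8B"   # placeholder decoder-only LM
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
LAYER = 4  # an "early" layer, where the paper locates the bottleneck

cache = {}

def _hidden(output):
    # Decoder layers return either a tensor or a tuple led by hidden states.
    return output[0] if isinstance(output, tuple) else output

def save_hook(module, inputs, output):
    cache["h"] = _hidden(output).detach()

def patch_hook(module, inputs, output):
    h = _hidden(output)
    if h.shape[1] > 1:  # patch only the prefill pass, not 1-token decode steps
        n = min(h.shape[1], cache["h"].shape[1])
        h = h.clone()
        h[:, :n] = cache["h"][:, :n]
        return (h,) + output[1:] if isinstance(output, tuple) else h
    return output

layer = model.model.layers[LAYER]        # LLaMA-style path; adjust per model

handle = layer.register_forward_hook(save_hook)
with torch.no_grad():
    model(**tok("Denke Schritt für Schritt nach.", return_tensors="pt"))
handle.remove()

handle = layer.register_forward_hook(patch_hook)
out = model.generate(**tok("Think step by step.", return_tensors="pt"),
                     max_new_tokens=40)
handle.remove()
# If the early-layer bottleneck causally sets language identity, the
# continuation should come out in German despite the English prompt.
print(tok.decode(out[0], skip_special_tokens=True))
```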
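The last point contrasts two measurable quantities: how far the parameters moved and how far the behavior moved. One way to compute both is the relative L2 norm of the weight delta and a per-token KL divergence between the two models' next-token distributions; both metric choices here are ours, not necessarily the paper's.

```python
import torch
import torch.nn.functional as F

def relative_update_norm(base_model, tuned_model) -> float:
    """||theta_tuned - theta_base||_2 / ||theta_base||_2 over all
    parameters: a simple measure of how far the weights moved."""
    base = dict(base_model.named_parameters())
    num = den = 0.0
    for name, p in tuned_model.named_parameters():
        num += (p.detach() - base[name].detach()).norm().item() ** 2
        den += base[name].detach().norm().item() ** 2
    return (num / den) ** 0.5

def behavioral_divergence(base_model, tuned_model, input_ids) -> float:
    """Mean per-position KL(tuned || base) over next-token distributions:
    one way to quantify behavioral divergence."""
    with torch.no_grad():
        p = F.log_softmax(tuned_model(input_ids).logits, dim=-1)  # tuned
        q = F.log_softmax(base_model(input_ids).logits, dim=-1)   # base
    # kl_div(input=q, target=p, log_target=True) computes KL(p || q).
    kl = F.kl_div(q, p, log_target=True, reduction="none").sum(-1)
    return kl.mean().item()
```

In these terms, the paper's finding would read: RLVR reaches a larger behavioral_divergence at a smaller relative_update_norm than SFT.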