Less Languages, Less Tokens: An Efficient Unified Logic Cross-lingual Chain-of-Thought Reasoning Framework

arXiv cs.CL / 4/23/2026


Key Points

  • The paper introduces UL-XCoT, an efficient unified-logic framework for cross-lingual chain-of-thought reasoning that reduces both latency and redundant token use compared with costly full-trajectory sampling methods.
  • UL-XCoT improves efficiency by selecting a small candidate set of languages per query within a language-invariant unified logic space, rather than evaluating many languages broadly.
  • During decoding, it prunes low-quality reasoning paths by monitoring trajectory dynamics in the logic space, and then aggregates the remaining high-quality trajectories using voting.
  • Experiments on PolyMath (18 languages) and MMLU-ProX-Lite (29 languages) using DeepSeek-R1-Distill-Qwen-7B show competitive accuracy while cutting decoding token costs by over 50% versus prior sampling baselines.
  • The method provides more stable gains for low-resource languages, where standard XCoT self-consistency approaches often fail to deliver consistent improvements.
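The three stages above (per-query language selection in a shared logic space, coherence-based trajectory pruning, and answer voting) can be sketched in miniature. This is a toy illustration, not the paper's implementation: the function names, the use of cosine similarity over hand-made vectors, and the fixed coherence threshold are all assumptions standing in for the framework's learned language-invariant representations.

```python
from collections import Counter

def cosine(u, v):
    # Plain cosine similarity; stands in for distance in the unified logic space.
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

def select_languages(query_vec, lang_vecs, k=3):
    # Stage 1 (illustrative): keep only the k candidate languages whose
    # logic-space representation is closest to the query, instead of
    # sampling full trajectories in every language.
    ranked = sorted(lang_vecs, key=lambda l: cosine(query_vec, lang_vecs[l]),
                    reverse=True)
    return ranked[:k]

def prune_and_vote(trajectories, min_coherence=0.5):
    # Stages 2-3 (illustrative): drop any trajectory whose successive
    # logic-space steps drift apart (low step-to-step similarity), then
    # majority-vote over the surviving answers.
    kept = []
    for answer, steps in trajectories:
        sims = [cosine(a, b) for a, b in zip(steps, steps[1:])]
        if sims and min(sims) >= min_coherence:
            kept.append(answer)
    return Counter(kept).most_common(1)[0][0] if kept else None

# Toy usage with 2-d stand-in embeddings.
langs = {"en": [1.0, 0.0], "zh": [0.9, 0.1], "sw": [0.0, 1.0]}
candidates = select_languages([1.0, 0.0], langs, k=2)

trajs = [
    ("42", [[1.0, 0.0], [1.0, 0.1], [0.9, 0.2]]),   # coherent
    ("42", [[1.0, 0.0], [0.95, 0.1]]),              # coherent
    ("7",  [[1.0, 0.0], [0.0, 1.0]]),               # drifts, gets pruned
]
final = prune_and_vote(trajs)
```

In this toy run the two candidate languages closest to the query are kept, the drifting trajectory is pruned mid-stream, and the vote returns the majority answer among survivors.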

Abstract

Cross-lingual chain-of-thought (XCoT) with self-consistency markedly enhances multilingual reasoning, yet existing methods remain costly due to extensive sampling of full trajectories across languages. Moreover, multilingual LLM representations vary strongly by language, hindering direct feature comparisons and effective pruning. Motivated by this, we introduce UL-XCoT, the first efficient unified logic cross-lingual reasoning framework that minimizes redundancy in token usage and latency, yielding the greatest efficiency under limited sampling budgets during inference. Specifically, UL-XCoT (1) achieves "less languages" by selecting, per query, a small candidate language set in a language-invariant unified logic space, (2) enables "less tokens" by monitoring logic-space trajectory dynamics during decoding to prune low-quality reasoning paths, and (3) aggregates the remaining high-quality trajectories via voting. Experiments on PolyMath across 18 languages and MMLU-ProX-Lite across 29 languages with DeepSeek-R1-Distill-Qwen-7B demonstrate that UL-XCoT achieves competitive accuracy while cutting decoding token costs by over 50% versus prior sampling baselines. UL-XCoT also delivers more stable gains on low-resource languages, underscoring consistently superior robustness where standard XCoT self-consistency methods fail.
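To see where the claimed >50% token saving comes from, a back-of-the-envelope cost model helps: full-trajectory self-consistency decodes every language to completion, whereas restricting to a few candidate languages and truncating pruned paths scales the budget down on both axes. The numbers below are purely illustrative assumptions, not figures from the paper.

```python
def token_cost(num_langs, samples_per_lang, avg_tokens, completion_frac=1.0):
    # Total decoding tokens: languages x samples x average trajectory length,
    # scaled by the fraction of each trajectory decoded before pruning stops it.
    return num_langs * samples_per_lang * avg_tokens * completion_frac

# Baseline XCoT self-consistency: sample, say, 10 languages to completion.
baseline = token_cost(num_langs=10, samples_per_lang=1, avg_tokens=2000)

# UL-XCoT-style budget (hypothetical numbers): 3 candidate languages, and
# early pruning means only ~70% of trajectory tokens are decoded on average.
pruned = token_cost(num_langs=3, samples_per_lang=1, avg_tokens=2000,
                    completion_frac=0.7)

savings = 1 - pruned / baseline
```

Under these made-up settings the saving is 79%, comfortably past the 50% mark; the real figure depends on how many languages the baseline samples and how early pruning triggers.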