Multilingual KokoroChat: A Multi-LLM Ensemble Translation Method for Creating a Multilingual Counseling Dialogue Dataset
arXiv cs.CL / 3/25/2026
Key Points
- Multilingual KokoroChat is a new dataset that translates KokoroChat, a large, manually authored Japanese counseling dialogue corpus, into English and Chinese to address the scarcity of high-quality public counseling data.
- Because the best-performing translation model varies with the input and no single LLM is consistently best, the authors introduce a multi-LLM ensemble translation pipeline designed for high-fidelity output in a sensitive domain.
- The method generates diverse translation hypotheses using multiple distinct LLMs, then uses a separate LLM to select and refine the final translation by analyzing strengths and weaknesses across the hypotheses.
- In human preference studies, translations from the ensemble approach were preferred over those produced by any individual state-of-the-art LLM, indicating improved fidelity.
- The dataset is released publicly on GitHub, enabling researchers to build and evaluate multilingual counseling dialogue systems using higher-quality training material.
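
The generate-then-select-and-refine flow described in the key points can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `ensemble_translate`, `build_refiner_prompt`, `Translator`, and `Refiner` are hypothetical, the prompt wording is invented for the example, and the stub functions stand in for real LLM calls.

```python
from typing import Callable, Dict

# Hypothetical type: any function mapping (source_text, target_lang) to a translation.
Translator = Callable[[str, str], str]
# Hypothetical type: takes the source, target language, and all hypotheses,
# and returns a single final translation.
Refiner = Callable[[str, str, Dict[str, str]], str]


def build_refiner_prompt(source: str, target_lang: str, hypotheses: Dict[str, str]) -> str:
    """Assemble an illustrative prompt asking a separate model to compare candidates.

    The wording is an assumption made for this sketch, not the paper's prompt.
    """
    lines = [
        f"Source utterance (Japanese): {source}",
        f"Target language: {target_lang}",
        "Candidate translations:",
    ]
    for name, text in hypotheses.items():
        lines.append(f"- [{name}] {text}")
    lines.append(
        "Analyze the strengths and weaknesses of each candidate, "
        "then output one refined translation."
    )
    return "\n".join(lines)


def ensemble_translate(
    source: str,
    target_lang: str,
    translators: Dict[str, Translator],
    refiner: Refiner,
) -> str:
    """Collect one hypothesis per translator, then let the refiner produce the final text."""
    hypotheses = {name: fn(source, target_lang) for name, fn in translators.items()}
    return refiner(source, target_lang, hypotheses)


# Stub translators standing in for distinct LLMs (placeholders, not real models).
stub_translators: Dict[str, Translator] = {
    "model_a": lambda text, lang: f"[{lang}] draft A of: {text}",
    "model_b": lambda text, lang: f"[{lang}] longer draft B of: {text}",
}


def stub_refiner(source: str, target_lang: str, hypotheses: Dict[str, str]) -> str:
    """Placeholder refiner: a real one would send build_refiner_prompt(...) to an LLM.

    Here it simply returns the longest hypothesis so the sketch runs offline.
    """
    return max(hypotheses.values(), key=len)


result = ensemble_translate("最近眠れません。", "en", stub_translators, stub_refiner)
print(result)
```

In practice each entry in `stub_translators` would wrap a call to a different model, and `stub_refiner` would pass the prompt from `build_refiner_prompt` to a separate model; keeping both behind plain callables makes it easy to swap models per language pair.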