Multiple-Debias: A Full-process Debiasing Method for Multilingual Pre-trained Language Models

arXiv cs.CL / 4/6/2026


Key Points

  • The paper presents Multiple-Debias, a full-process debiasing approach for multilingual pre-trained language models targeting sensitive-attribute biases such as gender, race, and religion.
  • It combines multilingual counterfactual data augmentation and multilingual Self-Debias across both pre-processing and post-processing stages, along with parameter-efficient fine-tuning to reduce bias.
  • Experiments report significant bias reductions across three sensitive attributes in four languages, using an extended CrowS-Pairs benchmark for German, Spanish, Chinese, and Japanese.
  • Results indicate that multilingual debiasing outperforms monolingual methods and that transferring debiasing signals across languages improves fairness.
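One of the building blocks named above, counterfactual data augmentation (CDA), can be sketched as swapping sensitive-attribute terms in training sentences so the model sees balanced contexts. The word lists and languages below are illustrative assumptions, not the paper's actual resources:

```python
# Illustrative sketch of counterfactual data augmentation (CDA):
# replace each sensitive-attribute term with its counterpart so that
# both the original and the counterfactual sentence can be used for
# fine-tuning. The pairs below are toy examples, not the paper's lists.

GENDER_PAIRS = {
    "en": [("he", "she"), ("man", "woman"), ("father", "mother")],
    "es": [("él", "ella"), ("hombre", "mujer")],
}

def counterfactual(sentence: str, lang: str = "en") -> str:
    """Swap each attribute term with its counterpart (whole-word, lowercase)."""
    swap = {}
    for a, b in GENDER_PAIRS[lang]:
        swap[a] = b
        swap[b] = a
    return " ".join(swap.get(tok, tok) for tok in sentence.split())

print(counterfactual("he is a father"))  # → "she is a mother"
```

A real multilingual CDA pipeline would need morphology-aware substitution (gendered agreement in Spanish or German cannot be handled by simple word swaps), which is exactly why extending CDA across languages is nontrivial.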

Abstract

Multilingual Pre-trained Language Models (MPLMs) have become essential tools for natural language processing. However, they often exhibit biases related to sensitive attributes such as gender, race, and religion. In this paper, we introduce a comprehensive multilingual debiasing method named Multiple-Debias to address these issues across multiple languages. By incorporating multilingual counterfactual data augmentation and multilingual Self-Debias across both pre-processing and post-processing stages, alongside parameter-efficient fine-tuning, we significantly reduce biases in MPLMs across three sensitive attributes in four languages. We also extend CrowS-Pairs to German, Spanish, Chinese, and Japanese, validating our full-process multilingual debiasing method for gender, racial, and religious bias. Our experiments show that (i) multilingual debiasing methods surpass monolingual approaches in effectively mitigating biases, and (ii) integrating debiasing information from different languages notably improves the fairness of MPLMs.
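The CrowS-Pairs evaluation referenced in the abstract scores a model on pairs of minimally different sentences, one stereotypical and one anti-stereotypical, and reports the percentage of pairs where the model prefers the stereotypical one; 50% indicates no measured bias. A minimal sketch of that metric, with toy numbers standing in for real model pseudo-log-likelihoods:

```python
# Hedged sketch of a CrowS-Pairs-style bias metric. Each pair holds
# (stereotypical score, anti-stereotypical score); a higher score means
# the model finds that sentence more likely. 50% = no measured bias.

def bias_score(pairs):
    """Percentage of pairs where the stereotypical sentence scores higher."""
    preferred = sum(1 for stereo, anti in pairs if stereo > anti)
    return 100.0 * preferred / len(pairs)

# Toy pseudo-log-likelihood values, not outputs of any real model.
scores = [(-10.2, -11.5), (-9.8, -9.1), (-12.0, -12.4)]
print(f"bias score: {bias_score(scores):.1f}%")  # → bias score: 66.7%
```

Extending this benchmark to German, Spanish, Chinese, and Japanese, as the paper does, requires translating and culturally adapting each sentence pair, not just the metric.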