Attention Sinks in Massively Multilingual Neural Machine Translation: Discovery, Analysis, and Mitigation

arXiv cs.LG / 5/5/2026


Key Points

  • The paper identifies a systematic artifact in cross-attention analysis for the NLLB-200 multilingual NMT model: “attention sinks” where non-content tokens (EOS tokens, language tags, and punctuation) absorb 83%–91% of total cross-attention mass (see the measurement sketch after this list).
  • Because these sinks skew attention distributions, raw cross-attention metrics can severely underestimate content-level similarity by nearly half (36.7% raw vs. 70.7% after filtering), making many uncorrected interpretability studies unreliable.
  • The authors trace the effect to a causal mechanism rooted in vocabulary design rather than positional bias, extending prior LLM attention-sink findings to NMT.
  • They validate a content-only filtering and renormalization method, showing the artifact is universal across African and non-African language benchmarks and that corrected analyses recover meaningful signals (mode gaps, language-family clustering, and a “Somali paradox”).
  • The study releases a filtering toolkit and corrected datasets to enable reproducible, more trustworthy interpretability research for multilingual NMT.
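
The headline figures above are fractions of cross-attention mass landing on non-content source tokens. Below is a minimal sketch of how such a sink-mass fraction could be computed; the `is_non_content` heuristic (EOS/pad tokens, NLLB-style language tags such as `eng_Latn`, and pure punctuation) and the tensor layout are illustrative assumptions, not the paper's released toolkit.

```python
import string
import numpy as np

def is_non_content(token: str) -> bool:
    """Heuristic for 'sink' tokens: EOS/pad, NLLB language tags, punctuation.

    Illustrative assumption; the paper's exact token categories may differ.
    """
    if token in {"</s>", "<pad>"}:
        return True
    if len(token) == 8 and token[3] == "_":  # crude match for tags like "eng_Latn"
        return True
    stripped = token.strip("\u2581")  # drop the SentencePiece word-boundary marker
    return bool(stripped) and all(ch in string.punctuation for ch in stripped)

def sink_mass(cross_attn: np.ndarray, src_tokens: list[str]) -> float:
    """Fraction of cross-attention mass absorbed by non-content source tokens.

    cross_attn: (layers, heads, tgt_len, src_len); each row over src_len is a
    softmax distribution summing to 1.
    """
    mask = np.array([is_non_content(t) for t in src_tokens])
    # Sum the mass each target position sends to non-content source positions,
    # then average over layers, heads, and target positions.
    return float(cross_attn[..., mask].sum(axis=-1).mean())
```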

Abstract

Cross-attention patterns in neural machine translation (NMT) are widely used to study how multilingual models align linguistic structure. We report a systematic artifact in cross-attention analysis of NLLB-200 (600M): non-content tokens (primarily end-of-sequence tokens, language tags, and punctuation) capture 83 percent to 91 percent of total cross-attention mass. We term these "attention sinks," extending findings from LLMs [Xiao et al., 2023] to NMT cross-attention and identifying a causal mechanism rooted in vocabulary design rather than position bias. This artifact causes raw metrics to underestimate content-level similarity by nearly half (36.7 percent raw vs. 70.7 percent filtered), rendering uncorrected analyses unreliable. To address this, we validate a content-only filtering methodology that removes non-content tokens and renormalizes the distribution. Applying this to 1,000 parallel sentences across African languages (Swahili, Kikuyu, Somali, Luo) and non-African benchmarks (German, Turkish, Chinese, Hindi), we confirm the artifact is universal and recover masked linguistic signals: a 16.9 percentage-point gap between teacher-forcing and generation modes, clear language-family clustering in attention entropy, and a hidden Somali paradox linking SOV word order to monotonic alignment. We release our filtering toolkit and corrected datasets to support reproducible interpretability research on multilingual NMT.
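
The correction described in the abstract (drop non-content source positions, then renormalize each target token's distribution over the remaining content tokens) can be sketched as follows, reusing the hypothetical `is_non_content` helper from the earlier sketch. The entropy helper shows the kind of per-row attention entropy used for the language-family clustering analysis, though the paper's released toolkit may compute it differently.

```python
import numpy as np

def filter_and_renormalize(cross_attn: np.ndarray,
                           src_tokens: list[str]) -> np.ndarray:
    """Content-only filtering: drop sink columns, renormalize the rest.

    cross_attn: (layers, heads, tgt_len, src_len) with rows summing to 1.
    Returns (layers, heads, tgt_len, n_content) with rows re-summing to 1.
    """
    # is_non_content is the hypothetical heuristic from the earlier sketch.
    keep = np.array([not is_non_content(t) for t in src_tokens])
    filtered = cross_attn[..., keep]
    return filtered / filtered.sum(axis=-1, keepdims=True)

def mean_attention_entropy(attn: np.ndarray) -> float:
    """Mean Shannon entropy (in bits) of per-target attention distributions."""
    p = np.clip(attn, 1e-12, 1.0)
    return float(-(p * np.log2(p)).sum(axis=-1).mean())
```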