Attribution-Guided Masking for Robust Cross-Domain Sentiment Classification

arXiv cs.LG / 5/6/2026


Key Points

  • Pre-trained Transformer models often lose substantial accuracy when sentiment classifiers are transferred to out-of-domain data, and the work links this to dependence on domain-specific spurious tokens.
  • The paper shows that simply checking token-level attribution drift after the fact does not reliably predict the generalization gap, motivating a new training-time method.
  • It proposes Attribution-Guided Masking (AGM), which dynamically identifies and penalizes highly attributed spurious tokens during fine-tuning using a gradient-based masking loss, optionally with a counterfactual contrastive loss.
  • In strict zero-shot transfer across four sentiment domains (eight random seeds), AGM delivers competitive results on the hardest Sentiment140 transfer compared with several strong baselines and provides token-level interpretability about what drives failures.
  • Ablation experiments indicate that the attribution-guided masking component is essential, since removing it or using random token selection leads to worse performance on challenging transfers.

Abstract

While pre-trained Transformer models achieve high accuracy on in-domain sentiment classification, they frequently experience severe performance degradation when transferring to out-of-domain data. We hypothesize that this generalization gap is driven by reliance on domain-specific spurious tokens. After demonstrating that post-hoc token-level attribution drift fails to predict this gap, we propose Attribution-Guided Masking (AGM), a training-time intervention that dynamically detects and penalizes highly attributed spurious tokens during fine-tuning. AGM's core component is a gradient-based attribution masking loss ($\mathcal{L}_{\text{mask}}$), which can optionally be combined with a counterfactual contrastive loss to enforce domain-invariant representations, all without requiring target-domain labels or human annotation. Evaluated in a strict zero-shot transfer setting across four diverse domains with eight random seeds, AGM achieves competitive generalization compared to five strong baselines on the hardest transfer (Sentiment140): $\Delta = 0.244$ versus DANN (0.264), DRO (0.248), Fish (0.247), and IRM (0.238), while uniquely providing token-level interpretability into which features drive the generalization gap. Our qualitative analysis confirms that AGM suppresses attribution on domain-specific tokens such as @mentions, hashtags, and slang, shifting reliance toward domain-invariant sentiment markers. Our ablation study further confirms that attribution-guided masking is the critical component: removing it or replacing it with random token selection consistently degrades performance on difficult transfers.
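To make the core idea concrete, here is a minimal sketch of what a gradient-based attribution masking loss could look like. This is an illustrative reconstruction, not the paper's actual implementation: the toy model, the gradient-times-input attribution, the `top_k` token-selection heuristic, and the 0.1 loss weight are all assumptions introduced for the example.

```python
# Hypothetical sketch of an attribution-guided masking loss (not the
# paper's code). Attribution = gradient x input on token embeddings;
# the most-attributed tokens are treated as candidate spurious tokens
# and their attribution mass is penalized alongside the task loss.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinySentimentModel(nn.Module):
    """Toy embedding + mean-pool classifier standing in for a Transformer."""

    def __init__(self, vocab_size=100, dim=16, num_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, token_ids):
        embeds = self.emb(token_ids)            # (batch, seq, dim)
        logits = self.head(embeds.mean(dim=1))  # (batch, num_classes)
        return logits, embeds


def attribution_mask_loss(logits, embeds, labels, top_k=2):
    """Penalize attribution mass on the top-k most-attributed tokens.

    Uses gradient-times-input as the token attribution; create_graph=True
    makes the attribution differentiable so the penalty can be trained.
    """
    ce = F.cross_entropy(logits, labels)
    grads = torch.autograd.grad(ce, embeds, create_graph=True)[0]
    attributions = (grads * embeds).sum(dim=-1).abs()  # (batch, seq)
    top_vals, _ = attributions.topk(top_k, dim=-1)
    return top_vals.mean()


# One illustrative training step on random data.
model = TinySentimentModel()
tokens = torch.randint(0, 100, (4, 10))
labels = torch.randint(0, 2, (4,))

logits, embeds = model(tokens)
l_mask = attribution_mask_loss(logits, embeds, labels)
total = F.cross_entropy(logits, labels) + 0.1 * l_mask  # weight is a guess
total.backward()
```

In this sketch the penalty simply shrinks attribution on whichever tokens the model currently relies on most; the paper's method additionally targets *spurious* (domain-specific) tokens, which would require a selection criterion beyond raw attribution magnitude.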