Revisiting Cross-Attention Mechanisms: Leveraging Beneficial Noise for Domain-Adaptive Learning
arXiv cs.CV / 3/19/2026
Key Points
- The paper introduces beneficial noise to regularize cross-attention in unsupervised domain adaptation, encouraging the model to ignore style distractions and focus on content (a minimal sketch follows this list).
- It proposes the Domain-Adaptive Transformer (DAT) to disentangle domain-shared content from domain-specific style (see the second sketch below).
- It also introduces the Cross-Scale Matching (CSM) module to align features across multiple resolutions while preserving semantic consistency (see the third sketch below).
- The combined framework, DACSM, achieves state-of-the-art performance across VisDA-2017, Office-Home, and DomainNet, including a +2.3% improvement over CDTrans on VisDA-2017 and a +5.9% gain on the 'truck' class.
- The work demonstrates that combining domain translation, beneficial-noise-enhanced attention, and scale-aware alignment can yield robust, content-consistent representations for cross-domain learning.
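The summary does not specify how the beneficial noise is injected, so the following is a minimal sketch, assuming Gaussian perturbation of the attention logits during training; the module name, `noise_std`, and all shapes are illustrative assumptions, not the authors' API.

```python
import torch
import torch.nn as nn

class NoisyCrossAttention(nn.Module):
    """Cross-attention with Gaussian noise on the logits (illustrative only).

    One plausible reading of "beneficial noise": perturbing the attention
    logits at train time discourages brittle, style-specific attention
    patterns; the paper's actual scheme may differ.
    """
    def __init__(self, dim: int, num_heads: int = 8, noise_std: float = 0.1):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        self.noise_std = noise_std  # assumed hyperparameter
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, query: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # query: (B, Nq, D), e.g. target-domain tokens; context: (B, Nk, D) source tokens
        B, Nq, D = query.shape
        q = self.q_proj(query).view(B, Nq, self.num_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(context).view(B, -1, self.num_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(context).view(B, -1, self.num_heads, self.head_dim).transpose(1, 2)

        logits = (q @ k.transpose(-2, -1)) * self.scale  # (B, H, Nq, Nk)
        if self.training and self.noise_std > 0:
            # Inject noise only during training; inference stays deterministic.
            logits = logits + torch.randn_like(logits) * self.noise_std

        attn = logits.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, Nq, D)
        return self.out_proj(out)
```

Because the noise is gated on `self.training`, it acts purely as a train-time regularizer, much like dropout: under `model.eval()`, `NoisyCrossAttention(256)(x, x)` behaves as plain cross-attention.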
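The key points give only DAT's goal (splitting content from style), not its architecture. As a speculative sketch, one standard way to make a content branch domain-invariant is a gradient-reversal objective in the style of DANN; everything below (`ContentStyleSplit`, the two heads, the domain classifier) is a hypothetical illustration, not the paper's design.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass, negated gradient on the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return x

    @staticmethod
    def backward(ctx, grad):
        return -grad

class ContentStyleSplit(nn.Module):
    def __init__(self, dim: int, num_domains: int = 2):
        super().__init__()
        self.content_head = nn.Linear(dim, dim)  # intended: domain-shared content
        self.style_head = nn.Linear(dim, dim)    # intended: domain-specific style
        self.domain_clf = nn.Linear(dim, num_domains)

    def forward(self, tokens: torch.Tensor):
        # tokens: (B, D) pooled transformer features
        content = self.content_head(tokens)
        style = self.style_head(tokens)
        # Reversed gradients push the content branch to fool the domain
        # classifier, i.e. toward domain invariance.
        domain_logits = self.domain_clf(GradReverse.apply(content))
        return content, style, domain_logits
```

A domain-classification loss on `domain_logits` trains the classifier normally while, through the reversed gradients, encouraging `content` to shed domain-specific cues and leave them to the style branch.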
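Likewise, CSM's internals are not described in this summary; the sketch below shows one generic form of scale-aware alignment, assuming matched lists of source and target feature maps and a simple first-moment matching loss per scale. The function name and loss choice are assumptions.

```python
import torch
import torch.nn.functional as F

def cross_scale_alignment_loss(source_feats, target_feats):
    """Align multi-resolution features across domains (illustrative only).

    source_feats / target_feats: lists of (B, C, H_i, W_i) maps, one per scale.
    """
    loss = torch.zeros((), device=source_feats[0].device)
    for s, t in zip(source_feats, target_feats):
        # Pool each scale to a (B, C) descriptor, then match the per-scale
        # mean descriptor across the two domains.
        s_desc = s.mean(dim=(2, 3)).mean(dim=0)
        t_desc = t.mean(dim=(2, 3)).mean(dim=0)
        loss = loss + F.mse_loss(s_desc, t_desc)
    return loss / len(source_feats)
```

In a training loop this term would be added with a small weight to the task loss, so that each resolution contributes its own alignment signal rather than collapsing everything into one global statistic.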