
Noise-aware few-shot learning through bi-directional multi-view prompt alignment

arXiv cs.CV / March 13, 2026


Key Points

  • NA-MVP introduces noise-aware few-shot learning through bi-directional multi-view prompt alignment to improve robustness of vision-language models under noisy supervision.
  • The approach uses unbalanced optimal transport to achieve fine-grained patch-to-prompt correspondence and suppress unreliable regions.
  • It features a bi-directional prompt design that captures complementary clean-oriented and noise-aware cues to emphasize stable semantics.
  • An alignment-guided selective refinement strategy uses optimal transport to correct only mislabeled samples while retaining reliable data.
  • Experiments on synthetic and real-world noisy benchmarks show state-of-the-art improvements.
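The unbalanced-optimal-transport idea behind the first bullet can be sketched in a few lines. The snippet below is a minimal, generic illustration (not the paper's implementation): it runs entropic unbalanced Sinkhorn between toy "patch" and "prompt" embeddings, where relaxing the marginal constraints lets the plan assign little mass to unreliable patches instead of forcing every patch to match a prompt. All names (`unbalanced_sinkhorn`, the toy data, the cosine cost) are assumptions for illustration.

```python
import numpy as np

def unbalanced_sinkhorn(cost, a, b, eps=0.1, tau=1.0, n_iters=200):
    """Entropic unbalanced OT via Sinkhorn-style scaling iterations.

    cost : (n, m) cost matrix between patches and prompts
    a, b : reference marginals (relaxed, not strictly enforced)
    eps  : entropic regularization strength
    tau  : marginal-relaxation strength (tau -> inf recovers balanced OT)
    """
    K = np.exp(-cost / eps)
    fi = tau / (tau + eps)  # soft-marginal exponent from the KL relaxation
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = (a / (K @ v)) ** fi
        v = (b / (K.T @ u)) ** fi
    # transport plan: entry (i, j) = mass sent from patch i to prompt j
    return u[:, None] * K * v[None, :]

# toy example: 4 "patch" features aligned against 3 "prompt" features
rng = np.random.default_rng(0)
patches = rng.normal(size=(4, 8))
prompts = rng.normal(size=(3, 8))

# cosine-distance cost between L2-normalized features
pn = patches / np.linalg.norm(patches, axis=1, keepdims=True)
qn = prompts / np.linalg.norm(prompts, axis=1, keepdims=True)
cost = 1.0 - pn @ qn.T

a = np.full(4, 1 / 4)
b = np.full(3, 1 / 3)
plan = unbalanced_sinkhorn(cost, a, b)
# patches whose rows carry little total mass are effectively down-weighted,
# which is how unbalanced OT can suppress unreliable regions
```

Row sums of `plan` then serve as per-patch reliability weights: a patch that aligns poorly with every prompt simply transports less mass, rather than being forced into a bad match as in balanced OT.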

Abstract

Vision-language models offer strong few-shot capability through prompt tuning but remain vulnerable to noisy labels, which can corrupt prompts and degrade cross-modal alignment. Existing approaches struggle because they often lack the ability to model fine-grained semantic cues and to adaptively separate clean from noisy signals. To address these challenges, we propose NA-MVP, a framework for Noise-Aware few-shot learning through bi-directional Multi-View Prompt alignment. NA-MVP is built upon a key conceptual shift: robust prompt learning requires moving from global matching to region-aware alignment that explicitly distinguishes clean cues from noisy ones. To realize this, NA-MVP employs (1) multi-view prompts combined with unbalanced optimal transport to achieve fine-grained patch-to-prompt correspondence while suppressing unreliable regions; (2) a bi-directional prompt design that captures complementary clean-oriented and noise-aware cues, enabling the model to focus on stable semantics; and (3) an alignment-guided selective refinement strategy that uses optimal transport to correct only mislabeled samples while retaining reliable data. Experiments on synthetic and real-world noisy benchmarks demonstrate that NA-MVP consistently outperforms state-of-the-art baselines, confirming its effectiveness in enabling robust few-shot learning under noisy supervision.
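The selective-refinement step described in the abstract — correcting only samples flagged as mislabeled while leaving reliable ones untouched — can be illustrated with a generic sketch. Everything here is a hypothetical stand-in: `sim[i, c]` plays the role of an OT-derived alignment score between sample `i` and the prompts of class `c`, and the threshold rule is a simple proxy for the paper's alignment-guided criterion.

```python
import numpy as np

def selective_refinement(sim, labels, thresh=0.5):
    """Hypothetical sketch of alignment-guided selective refinement.

    sim    : (n_samples, n_classes) alignment scores, e.g. total
             transported mass from an OT plan per class
    labels : current (possibly noisy) integer labels
    thresh : samples whose score under their current label falls
             below this are treated as mislabeled and re-assigned
    """
    scores = sim[np.arange(len(labels)), labels]  # score of the given label
    suspect = scores < thresh                     # flag likely noisy samples
    refined = labels.copy()
    refined[suspect] = sim[suspect].argmax(axis=1)  # correct only those
    return refined, suspect

sim = np.array([[0.9, 0.1],
                [0.2, 0.8],
                [0.3, 0.7]])
labels = np.array([0, 1, 0])  # third sample is weakly aligned with its label
refined, suspect = selective_refinement(sim, labels)
# refined -> [0, 1, 1]: only the flagged sample is relabeled;
# the two well-aligned samples keep their original labels
```

The design point the abstract emphasizes is the selectivity: reliable samples are retained as-is, so refinement cannot corrupt clean supervision the way blanket relabeling could.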