CUE: Concept-Aware Multi-Label Expansion to Mitigate Concept Confusion in Long-Tailed Learning

arXiv cs.CV / 5/5/2026


Key Points

  • The paper highlights that long-tailed learning suffers not only from class imbalance but also from “concept confusion,” where disrupted relationships between classes hurt inter-class discriminability.
  • It attributes this issue to the mutual exclusivity assumption of single-label supervision under long-tailed distributions, which suppresses feature sharing among related classes and favors head classes.
  • To mitigate concept confusion, the authors propose CUE (Concept-aware mUlti-label Expansion), which adds multi-label concept signals to better preserve inter-class relationships.
  • CUE builds concept sets using instance-level visual cues from zero-shot CLIP and class-level semantic cues generated by an LLM, then trains with separately weighted Binary Logit-Adjustment auxiliary losses alongside the baseline Logit-Adjustment loss.
  • Experiments on multiple long-tailed benchmarks show that CUE achieves more balanced and stronger performance than recent state-of-the-art approaches, and the code is publicly available.
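As a rough illustration of the concept-set construction described above, the sketch below derives instance-level concept targets from zero-shot CLIP similarity scores (top-k classes kept as extra positives) and class-level targets from an LLM-provided map of related classes. All names and the top-k heuristic are assumptions for illustration; the paper's actual construction may differ.

```python
import numpy as np

def build_concept_targets(clip_sims, gt_labels, class_concepts, k=3):
    """Expand single labels into multi-label concept targets (illustrative).

    clip_sims:      (N, C) zero-shot CLIP image-to-class-prompt similarities
    gt_labels:      (N,) ground-truth class indices
    class_concepts: dict mapping a class index to a set of semantically
                    related class indices (e.g. generated by an LLM)
    k:              number of top CLIP classes kept as instance-level cues
    """
    n, c = clip_sims.shape
    inst = np.zeros((n, c), dtype=np.float32)  # instance-level (visual) cues
    sem = np.zeros((n, c), dtype=np.float32)   # class-level (semantic) cues
    topk = np.argsort(-clip_sims, axis=1)[:, :k]
    for i in range(n):
        inst[i, topk[i]] = 1.0                 # top-k CLIP classes as concepts
        inst[i, gt_labels[i]] = 1.0            # ground truth is always positive
        for j in class_concepts.get(int(gt_labels[i]), ()):
            sem[i, j] = 1.0                    # LLM-related classes as concepts
        sem[i, gt_labels[i]] = 1.0
    return inst, sem
```

The two target sets are kept separate so that, as the paper states, each can be supervised by its own weighted auxiliary loss.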

Abstract

Long-tailed distributions are common in real-world recognition tasks, where a few head classes have many samples while most tail classes have very few. Recently, fine-tuning foundation models for long-tailed learning has gained attention due to their excellent performance. However, most existing methods focus solely on mitigating long-tailed distribution bias while overlooking the concept confusion caused by the long-tailed distribution. In this paper, we study this problem and attribute it to the mutual exclusivity of single-label supervision under long-tailed distributions, which suppresses feature sharing among related classes and amplifies the dominance of head classes, leading to disrupted inter-class discriminability. To address this, we propose CUE, Concept-aware mUlti-label Expansion, which introduces multi-label concept signals to preserve disrupted inter-class relationships. Specifically, CUE constructs concept sets by (i) extracting instance-level visual cues from zero-shot CLIP and (ii) generating class-level semantic cues with an LLM; the two cues are incorporated via separately weighted Binary Logit-Adjustment (BLA) auxiliary losses and jointly optimized with the baseline Logit-Adjustment (LA) loss. In experiments on several long-tailed benchmarks, CUE achieves balanced and strong performance, surpassing recent state-of-the-art methods. Code is available at: https://github.com/zhangruichi/CUE.
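The joint objective described in the abstract can be sketched as a baseline Logit-Adjustment cross-entropy plus two separately weighted Binary Logit-Adjustment terms over multi-label concept targets. This is a minimal numpy sketch under the standard logit-adjustment formulation (shifting logits by tau times the log class prior); function names, the exact BLA adjustment, and the weights are assumptions, not the paper's implementation.

```python
import numpy as np

def la_loss(logits, labels, priors, tau=1.0):
    """Logit-Adjustment (LA) cross-entropy: shift logits by tau*log(prior)
    so head-class logits are penalized and tail-class logits boosted."""
    adj = logits + tau * np.log(priors)            # (N, C)
    adj = adj - adj.max(axis=1, keepdims=True)     # numerical stability
    logp = adj - np.log(np.exp(adj).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(labels)), labels].mean()

def bla_loss(logits, targets, priors, tau=1.0):
    """Binary Logit-Adjustment (BLA): per-class sigmoid BCE with the same
    prior-based shift, applied to multi-label concept targets."""
    adj = logits + tau * np.log(priors)
    p = 1.0 / (1.0 + np.exp(-adj))
    eps = 1e-12
    return -(targets * np.log(p + eps)
             + (1 - targets) * np.log(1 - p + eps)).mean()

def cue_total_loss(logits, labels, inst_t, sem_t, priors,
                   w_inst=0.5, w_sem=0.5):
    """Joint objective: baseline LA loss plus separately weighted BLA
    auxiliary losses for the two concept sets (weights are illustrative)."""
    return (la_loss(logits, labels, priors)
            + w_inst * bla_loss(logits, inst_t, priors)
            + w_sem * bla_loss(logits, sem_t, priors))
```

Keeping the two auxiliary terms separately weighted lets the visual (CLIP) and semantic (LLM) concept signals be balanced independently against the single-label baseline loss.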