AI Navigate

HELM: Hierarchical and Explicit Label Modeling with Graph Learning for Multi-Label Image Classification

arXiv cs.CV / 3/13/2026


Key Points

  • HELM introduces a hierarchical and explicit label modeling framework for multi-label image classification in remote sensing. It addresses complex label dependencies, including multi-path hierarchies where an instance belongs to multiple branches, and supports semi-supervised learning from unlabeled data.
  • The method uses hierarchy-specific class tokens within a Vision Transformer to capture nuanced interactions among labels.
  • A graph convolutional network explicitly encodes the hierarchical structure to generate hierarchy-aware embeddings.
  • A self-supervised branch enables the model to exploit unlabeled imagery, improving performance in low-label scenarios.
  • On four RSI datasets (UCM, AID, DFC-15, MLRSNet), HELM achieves state-of-the-art results in both supervised and semi-supervised settings, with particular strength when labels are scarce.
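The first point above, hierarchy-specific class tokens, can be sketched in a few lines: one learnable class token per hierarchy level is prepended to the patch sequence, so each level gets its own readout after the transformer encoder. The shapes below (14×14 patch grid, 64-dim embeddings, 3 hierarchy levels) are illustrative assumptions, not the paper's actual configuration, and the encoder itself is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

num_patches, dim = 196, 64  # assumed 14x14 patch grid, embedding size 64
num_levels = 3              # assumed 3-level label hierarchy

# Patch embeddings for one image, plus one learnable class token per level.
patches = rng.normal(size=(num_patches, dim))
level_tokens = rng.normal(size=(num_levels, dim))  # hierarchy-specific class tokens

# The transformer input is the concatenation of level tokens and patches.
tokens = np.concatenate([level_tokens, patches], axis=0)

# After the (omitted) encoder, rows 0..num_levels-1 would be the per-level
# representations, each fed to that level's multi-label classifier head.
level_repr = tokens[:num_levels]
print(level_repr.shape)  # (3, 64)
```

Giving each hierarchy level its own token lets the levels attend to the image independently, rather than sharing a single pooled representation.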

Abstract

Hierarchical multi-label classification (HMLC) is essential for modeling complex label dependencies in remote sensing. Existing methods, however, struggle with multi-path hierarchies where instances belong to multiple branches, and they rarely exploit unlabeled data. We introduce HELM (Hierarchical and Explicit Label Modeling), a novel framework that overcomes these limitations. HELM: (i) uses hierarchy-specific class tokens within a Vision Transformer to capture nuanced label interactions; (ii) employs graph convolutional networks to explicitly encode the hierarchical structure and generate hierarchy-aware embeddings; and (iii) integrates a self-supervised branch to effectively leverage unlabeled imagery. We perform a comprehensive evaluation on four remote sensing image (RSI) datasets (UCM, AID, DFC-15, MLRSNet). HELM achieves state-of-the-art performance, consistently outperforming strong baselines in both supervised and semi-supervised settings, demonstrating particular strength in low-label scenarios.
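To make point (ii) concrete, a single graph-convolution step over the label hierarchy can be sketched as follows. The toy 4-label tree, the initial embeddings, and the symmetric normalization ReLU(D^(-1/2)(A+I)D^(-1/2) X W) follow the standard GCN formulation and are assumptions for illustration, not HELM's exact design:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy label hierarchy: 0 is the root, 1 and 2 are its children, 3 is a child of 1.
num_labels, dim_in, dim_out = 4, 16, 8
edges = [(0, 1), (0, 2), (1, 3)]

# Symmetric adjacency matrix with self-loops (A + I).
A = np.eye(num_labels)
for parent, child in edges:
    A[parent, child] = A[child, parent] = 1.0

# Symmetric normalization: D^{-1/2} (A + I) D^{-1/2}.
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
A_hat = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# One GCN layer: propagate label features along hierarchy edges.
X = rng.normal(size=(num_labels, dim_in))  # initial label embeddings (e.g. label-name vectors)
W = rng.normal(size=(dim_in, dim_out))     # learnable projection weights
H = np.maximum(A_hat @ X @ W, 0.0)         # ReLU(A_hat X W): hierarchy-aware embeddings

print(H.shape)  # (4, 8)
```

Each label's output embedding mixes in its parents' and children's features, which is what makes the resulting embeddings "hierarchy-aware": related labels end up close in embedding space even before any image is seen.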