Hyperspectral Image Classification via Efficient Global Spectral Supertoken Clustering

arXiv cs.CV / 5/1/2026


Key Points

  • The paper addresses a key limitation of hyperspectral image classification with superpixel methods: clustering produces spatial regions, but many classifiers still predict at the pixel level, weakening region-level consistency and boundary alignment.
  • It introduces DSCC, an end-to-end dual-stage framework that decouples clustering from classification by forming boundary-preserving “spectral supertokens” via spectral-similarity and spatial-proximity constraints, then performing token-level prediction.
  • DSCC computes an image-level multi-criteria feature distance between pixels and centers and applies locality-aware assignment regularization to produce the boundary-preserving supertokens, while density-isolation-based center selection yields representative, well-separated centers, reducing redundancy and improving robustness to scale variation.
  • To handle tokens that contain a mix of land-cover classes, it proposes a soft-label scheme that encodes the class proportions within a token.
  • The method achieves CF1 = 0.728 at 197.75 FPS on the WHU-OHS dataset, which the authors describe as a superior accuracy-efficiency trade-off over state-of-the-art methods; the source code is available at https://github.com/laprf/DSCC.
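To make the supertoken-formation idea in the key points concrete, here is a minimal sketch of assigning pixels to centers using a distance that mixes spectral similarity with spatial proximity. It is not the DSCC implementation: the spectral-angle measure, the additive weighting `spatial_weight`, the hard `radius` cutoff standing in for the locality-aware regularization, and all function names are assumptions chosen for illustration.

```python
import math

def spectral_angle(a, b):
    """Angle in radians between two non-zero spectra (0 means same shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    cos = max(-1.0, min(1.0, dot / (na * nb)))  # clamp against rounding
    return math.acos(cos)

def assign_pixels(pixels, centers, spatial_weight=0.5, radius=2.0):
    """Return the index of the chosen center for every pixel.

    pixels, centers: lists of ((row, col), spectrum).
    Assumed distance: spectral angle + spatial_weight * Euclidean distance.
    Centers farther than `radius` are skipped (a crude locality constraint);
    if no center is in range, the globally closest one is used instead.
    """
    assignment = []
    for (pr, pc), pspec in pixels:
        best, best_d = None, math.inf
        fallback, fallback_d = None, math.inf
        for k, ((cr, cc), cspec) in enumerate(centers):
            spatial = math.hypot(pr - cr, pc - cc)
            d = spectral_angle(pspec, cspec) + spatial_weight * spatial
            if d < fallback_d:
                fallback, fallback_d = k, d
            if spatial <= radius and d < best_d:
                best, best_d = k, d
        assignment.append(best if best is not None else fallback)
    return assignment

# Two toy centers with orthogonal two-band spectra.
centers = [((0, 0), [1.0, 0.0]), ((0, 3), [0.0, 1.0])]
pixels = [
    ((0, 1), [1.0, 0.1]),  # near center 0, spectrally like center 0
    ((0, 2), [0.1, 1.0]),  # near center 1, spectrally like center 1
    ((0, 1), [0.1, 1.0]),  # near center 0, but spectrally like center 1
]
assignment = assign_pixels(pixels, centers)
```

In this toy setup the third pixel is assigned to center 1 even though center 0 is spatially closer, because the spectral term outweighs the spatial penalty. That is the intuition behind a boundary-preserving assignment: a pixel on the far side of a material boundary is not absorbed just because it is nearby.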

Abstract

Hyperspectral image classification demands spatially coherent predictions and precise boundary delineation. Yet prevailing superpixel-based methods face an inherent contradiction: clustering aggregates similar pixels into regions, but the subsequent classifier operates pixel-wise, undermining regional consistency. Consequently, existing approaches do not guarantee region-level, boundary-aligned classification. To address this limitation, we propose the Dual-stage Spectrum-Constrained Clustering-based Classifier (DSCC), an end-to-end framework that explicitly decouples clustering from classification by first grouping spectrally similar and spatially proximate pixels into spectral supertokens and then performing token-level prediction. At its core, DSCC computes an image-level multi-criteria feature distance between pixels and centers, followed by a locality-aware assignment regularization, enabling the generation of boundary-preserving spectral supertokens. A density-isolation-based center selection further yields representative, well-separated centers, reducing redundancy and improving robustness to scale variation. To accommodate mixed land-cover compositions within each token, we introduce a soft-label scheme that encodes class proportions and improves robustness for mixed-class tokens. DSCC attains a CF1 of 0.728 at 197.75 FPS on the WHU-OHS dataset, offering a superior accuracy-efficiency trade-off compared with state-of-the-art methods. Extensive experiments further validate the effectiveness and generality of the proposed dual-stage paradigm for hyperspectral image classification. The source code is available at https://github.com/laprf/DSCC.
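The soft-label scheme mentioned in the abstract can be illustrated with a small sketch that turns per-pixel ground-truth labels into a class-proportion vector for each token. The function name `soft_labels`, the flat list inputs, and the choice to normalize by the number of labeled pixels in a token are assumptions for illustration; the abstract only states that the soft labels encode class proportions.

```python
from collections import Counter

def soft_labels(token_ids, pixel_labels, num_classes):
    """Map each token id to a list of class proportions.

    token_ids[i] is the token that pixel i belongs to and
    pixel_labels[i] is that pixel's integer class label.
    """
    counts = {}
    for tok, lab in zip(token_ids, pixel_labels):
        counts.setdefault(tok, Counter())[lab] += 1
    result = {}
    for tok, c in counts.items():
        total = sum(c.values())
        result[tok] = [c.get(k, 0) / total for k in range(num_classes)]
    return result

# Token 0 mixes classes 2 and 1; token 1 is pure class 0.
token_ids = [0, 0, 0, 0, 1, 1]
pixel_labels = [2, 2, 2, 1, 0, 0]
labels = soft_labels(token_ids, pixel_labels, num_classes=3)
print(labels)  # {0: [0.0, 0.25, 0.75], 1: [1.0, 0.0, 0.0]}
```

A token-level classifier could then be trained against these vectors instead of a single hard label, so a token straddling two land-cover types is not forced into one class.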