3D Fourier-based Global Feature Extraction for Hyperspectral Image Classification

arXiv cs.CV / 3/18/2026

Key Points

  • The paper introduces HGFNet, a Hybrid GFNet architecture that combines localized 3D convolutional feature extraction with frequency-domain global filtering for hyperspectral image classification.
  • It proposes three frequency transforms (Spectral Fourier Transform, Spatial Fourier Transform, and Spatial-Spectral Fourier Transform) to model spectral dependencies, spatial dependencies, and their joint structure.
  • The architecture uses 3D convolutional layers for local spatial-spectral structures and Fourier-based modules for long-range dependencies and noise suppression.
  • To handle class imbalance in hyperspectral data, it introduces Adaptive Focal Loss that dynamically adjusts class-wise focusing and weighting.
  • The approach also addresses scalability concerns of transformer-based models by leveraging FFT-based global filtering as an efficient alternative.
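
The frequency-domain global filtering described in the points above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: it assumes a patch laid out as (bands, height, width), takes the spectral transform along the band axis, the spatial transform over the two spatial axes, and the joint transform over all three, and uses a plain array where a trained model would hold a learnable filter. The function name `global_filter` and the patch size are hypothetical.

```python
import numpy as np

def global_filter(x, axes, weight=None):
    """Filter a real array in the frequency domain over `axes`.

    x      : real array, assumed layout (bands, height, width).
    weight : complex array broadcastable to the spectrum; None means
             "no filtering", so the input is returned unchanged.
    """
    sizes = [x.shape[a] for a in axes]            # needed to invert odd lengths
    spectrum = np.fft.rfftn(x, axes=axes)         # forward real FFT
    if weight is not None:
        spectrum = spectrum * weight              # element-wise global filter
    return np.fft.irfftn(spectrum, s=sizes, axes=axes)

rng = np.random.default_rng(0)
patch = rng.standard_normal((30, 9, 9))           # hypothetical: 30 bands, 9x9 window

spectral = global_filter(patch, axes=(0,))        # 1D FFT along the band axis
spatial = global_filter(patch, axes=(1, 2))       # 2D FFT over height and width
joint = global_filter(patch, axes=(0, 1, 2))      # 3D FFT over all three axes

# A filter that keeps only the zero-frequency term along the band axis.
# Every output pixel then depends on all bands at once (here: their mean).
dc_only = np.zeros((30 // 2 + 1, 1, 1), dtype=complex)
dc_only[0] = 1.0
band_mean = global_filter(patch, axes=(0,), weight=dc_only)
```

Because one multiplication in the frequency domain mixes every position along the transformed axes, the cost is that of an FFT, O(N log N) in the number of elements N, rather than the O(N²) of pairwise self-attention.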

Abstract

Hyperspectral image classification (HSIC) has been significantly advanced by deep learning methods that exploit rich spatial-spectral correlations. However, existing approaches still face fundamental limitations: transformer-based models suffer from poor scalability due to the quadratic complexity of self-attention, while recent Fourier transform-based methods typically rely on 2D spatial FFTs and largely ignore critical inter-band spectral dependencies inherent to hyperspectral data. To address these challenges, we propose Hybrid GFNet (HGFNet), a novel architecture that integrates localized 3D convolutional feature extraction with frequency-domain global filtering via GFNet-style blocks for efficient and robust spatial-spectral representation learning. HGFNet introduces three complementary frequency transforms tailored to hyperspectral imagery: Spectral Fourier Transform (a 1D FFT along the spectral axis), Spatial Fourier Transform (a 2D FFT over spatial dimensions), and Spatial-Spectral Fourier Transform (a 3D FFT jointly over spectral and spatial dimensions), enabling comprehensive and high-dimensional frequency modeling. The 3D convolutional layers capture fine-grained local spatial-spectral structures, while the Fourier-based global filtering modules efficiently model long-range dependencies and suppress noise. To further mitigate the severe class imbalance commonly observed in HSIC, HGFNet incorporates an Adaptive Focal Loss (AFL) that dynamically adjusts class-wise focusing and weighting, improving discrimination for underrepresented classes.
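
The abstract does not give the formula for the Adaptive Focal Loss, so the following is only a sketch of the general idea: a focal loss whose weight `alpha` and focusing exponent `gamma` are set per class. The rule used here to derive them (inverse class frequency, computed once from the labels) is an assumption made for illustration, as are the function names and the `gamma_min`/`gamma_max` range. The paper's loss adjusts these terms dynamically and may differ substantially.

```python
import numpy as np

def class_wise_focal_loss(logits, labels, alpha, gamma):
    """Mean focal loss with a per-class weight alpha[c] and exponent gamma[c]."""
    shifted = logits - logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    log_pt = log_probs[np.arange(len(labels)), labels]      # log-prob of true class
    pt = np.exp(log_pt)
    # Confident samples (pt near 1) are down-weighted by (1 - pt) ** gamma.
    per_sample = -alpha[labels] * (1.0 - pt) ** gamma[labels] * log_pt
    return float(per_sample.mean())

def inverse_frequency_params(labels, num_classes, gamma_min=1.0, gamma_max=3.0):
    """Assumed heuristic: rarer classes get a larger weight and exponent."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    counts = np.maximum(counts, 1.0)                        # guard empty classes
    freq = counts / counts.sum()
    alpha = 1.0 / freq
    alpha = alpha * num_classes / alpha.sum()               # normalise to mean 1
    rarity = 1.0 - freq / freq.max()                        # 0 for the largest class
    gamma = gamma_min + (gamma_max - gamma_min) * rarity
    return alpha, gamma

# Hypothetical imbalanced batch: class 0 dominates, class 2 is rare.
labels = np.array([0, 0, 0, 0, 0, 0, 1, 1, 2])
rng = np.random.default_rng(0)
logits = rng.standard_normal((len(labels), 3))

alpha, gamma = inverse_frequency_params(labels, num_classes=3)
loss = class_wise_focal_loss(logits, labels, alpha, gamma)
```

With `alpha` set to ones and `gamma` set to zeros, the function reduces to ordinary cross-entropy, which is a convenient sanity check for any variant of this loss.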