Adaptive receptive field-based spatial-frequency feature reconstruction network for few-shot fine-grained image classification

arXiv cs.CV / 4/21/2026

Key Points

  • The paper identifies a key limitation in existing feature-reconstruction methods for few-shot fine-grained image classification (FSFGIC): selecting an appropriate receptive field size for extracting spatial and frequency descriptors from input images of different categories.
  • It proposes ARF-SFR-Net, which adaptively determines receptive field sizes to obtain spatial and frequency features, then fuses them for feature reconstruction and classification.
  • The method is designed to be easily integrated into episodic training, enabling end-to-end training from scratch.
  • Experiments across multiple FSFGIC benchmarks show that ARF-SFR-Net outperforms prior state-of-the-art approaches, and the authors provide publicly available code on GitHub.
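
To make the episodic training mentioned above concrete, here is a minimal sketch of how one N-way K-shot episode is typically sampled. It is not taken from the ARF-SFR-Net repository: the function name `sample_episode` and its parameters are hypothetical, and the defaults (5-way, 1-shot, 15 queries per class) are just a common few-shot setting.

```python
import random
from collections import defaultdict


def sample_episode(labels, n_way=5, k_shot=1, n_query=15, rng=None):
    """Sample one N-way K-shot episode (hypothetical helper, for illustration).

    labels: list of class labels, one per image (list index = image id).
    Returns (support, query): lists of (image_index, episode_class) pairs,
    where episode_class runs from 0 to n_way - 1.
    """
    rng = rng or random.Random()

    # Group image indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Only classes with enough images for support + query are eligible.
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= k_shot + n_query]
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query images")

    support, query = [], []
    for episode_class, c in enumerate(rng.sample(eligible, n_way)):
        picked = rng.sample(by_class[c], k_shot + n_query)
        # The first k_shot images form the support set, the rest the query set.
        support += [(i, episode_class) for i in picked[:k_shot]]
        query += [(i, episode_class) for i in picked[k_shot:]]
    return support, query
```

In episodic training, a model is updated on many such episodes, with the loss computed on the query images given only the support images of the same episode.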

Abstract

Feature reconstruction techniques are widely applied to few-shot fine-grained image classification (FSFGIC). Our research indicates that one of the main challenges facing existing feature-based FSFGIC methods is choosing the receptive field size used to extract feature descriptors (both spatial and frequency descriptors) from input images of different categories, in order to better perform FSFGIC tasks. To address this, we propose an adaptive receptive field-based spatial-frequency feature reconstruction network (ARF-SFR-Net). ARF-SFR-Net adaptively determines receptive field sizes for obtaining spatial and frequency features and fuses them effectively for reconstruction and FSFGIC tasks. It can be easily embedded into a given episodic training mechanism for end-to-end training from scratch. Extensive experiments on multiple FSFGIC benchmarks demonstrate the effectiveness and superiority of ARF-SFR-Net over state-of-the-art approaches. The code is available at: https://github.com/ICL-SUST/ARF-SFR-Net.git.
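
The abstract does not spell out the reconstruction step. As a rough illustration of what reconstruction-based few-shot classification generally involves, the sketch below scores a query image by how well its local descriptors can be rebuilt from each class's support descriptors. The closed-form ridge regression is an assumption (it is a common choice in this family of methods), not the confirmed ARF-SFR-Net formulation. The function name `reconstruction_logits`, the array shapes, and the regularization weight `lam` are all hypothetical.

```python
import numpy as np


def reconstruction_logits(query, support, lam=0.1):
    """Score one query image against each class by reconstruction error.

    Illustrative sketch only; not the ARF-SFR-Net implementation.

    query:   (r, d) array of r local descriptors of dimension d.
    support: (n_way, m, d) array holding m pooled descriptors per class.
    Returns an (n_way,) array of negative mean squared reconstruction errors,
    so the largest value marks the best-matching class.
    """
    logits = []
    for S in support:  # S has shape (m, d)
        # Closed-form ridge regression: W = Q S^T (S S^T + lam * I)^{-1}
        gram = S @ S.T + lam * np.eye(S.shape[0])
        W = np.linalg.solve(gram, S @ query.T).T  # shape (r, m)
        recon = W @ S                             # shape (r, d)
        logits.append(-np.mean(np.sum((query - recon) ** 2, axis=1)))
    return np.array(logits)
```

In a pipeline of this kind, the descriptors passed in would be whatever the backbone produces (here, presumably the fused spatial and frequency features), and the logits would feed a softmax cross-entropy loss over the query images of each episode.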