Adaptive receptive field-based spatial-frequency feature reconstruction network for few-shot fine-grained image classification

arXiv cs.CV / 4/21/2026

Key Points

The paper identifies a main challenge for existing feature-reconstruction methods in few-shot fine-grained image classification (FSFGIC): choosing the receptive field size used to extract spatial and frequency feature descriptors from input images of different categories.
It proposes ARF-SFR-Net, which adaptively determines receptive field sizes when extracting spatial and frequency features, then fuses the two for feature reconstruction and classification.
The method is designed to be easily integrated into episodic training, enabling end-to-end training from scratch.
Experiments across multiple FSFGIC benchmarks show that ARF-SFR-Net outperforms prior state-of-the-art approaches, and the authors provide publicly available code on GitHub.
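
The reconstruction idea behind this family of methods can be illustrated with a minimal NumPy sketch. This is not the ARF-SFR-Net implementation, since the summary gives no architectural details. It assumes the closed-form ridge-regression formulation used in earlier feature-reconstruction work: a query image's local descriptors are rebuilt from each class's support descriptors, and the class with the smallest reconstruction error is predicted. The function name `reconstruction_scores`, the penalty `lam`, and the toy shapes are all illustrative assumptions.

```python
import numpy as np


def reconstruction_scores(query, support, lam=0.1):
    """Score one query image against each class by reconstruction error.

    query:   (r, d) array of r local descriptors with d channels.
    support: dict mapping class label -> (m, d) array of support descriptors.
    lam:     ridge penalty (hypothetical default, not taken from the paper).
    Returns a dict mapping class label -> negative mean squared error,
    so a higher score means a better reconstruction.
    """
    scores = {}
    for label, s in support.items():
        # Ridge regression in closed form: W = Q S^T (S S^T + lam I)^-1.
        gram = s @ s.T + lam * np.eye(s.shape[0])        # (m, m), symmetric
        weights = np.linalg.solve(gram, s @ query.T).T   # (r, m)
        q_hat = weights @ s                              # (r, d) reconstruction
        err = np.mean(np.sum((query - q_hat) ** 2, axis=1))
        scores[label] = -float(err)
    return scores


# Toy usage: the query descriptors lie in the span of class "a".
rng = np.random.default_rng(0)
support = {"a": rng.normal(size=(5, 16)), "b": rng.normal(size=(5, 16))}
query = rng.normal(size=(3, 5)) @ support["a"]
scores = reconstruction_scores(query, support, lam=1e-3)
predicted = max(scores, key=scores.get)
```

In a method such as the one summarized above, the descriptors would come from a learned backbone (here, one combining spatial and frequency features) rather than from random arrays.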

Abstract

Feature reconstruction techniques are widely applied to few-shot fine-grained image classification (FSFGIC). Our research indicates that one of the main challenges facing existing feature-based FSFGIC methods is how to choose the size of the receptive field used to extract feature descriptors (both spatial and frequency feature descriptors) from input images of different categories, so as to better perform FSFGIC tasks. To address this, an adaptive receptive field-based spatial-frequency feature reconstruction network (ARF-SFR-Net) is proposed. ARF-SFR-Net adaptively determines receptive field sizes for obtaining spatial and frequency features and effectively fuses them for reconstruction and FSFGIC tasks. It can be easily embedded into a given episodic training mechanism for end-to-end training from scratch. Extensive experiments on multiple FSFGIC benchmarks demonstrate the effectiveness and superiority of the proposed ARF-SFR-Net over state-of-the-art approaches. The code is available at: https://github.com/ICL-SUST/ARF-SFR-Net.git.
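
For readers unfamiliar with the episodic training mechanism mentioned in the abstract, the following is a minimal sketch of how a single N-way K-shot episode is typically drawn from a labelled dataset. It uses only the Python standard library. The function name `sample_episode` and the 5-way 1-shot, 15-query setting in the usage lines are illustrative assumptions, not details taken from the paper.

```python
import random


def sample_episode(labels, n_way, k_shot, n_query, rng=None):
    """Draw one N-way K-shot episode.

    labels: list where labels[i] is the class of image i.
    Returns (support, query): two lists of (image_index, class) pairs
    that share the same n_way classes but no image index.
    """
    rng = rng or random.Random()
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # Only classes with enough images can fill both support and query sets.
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= k_shot + n_query]
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query images")

    support, query = [], []
    for c in rng.sample(eligible, n_way):
        picked = rng.sample(by_class[c], k_shot + n_query)
        support += [(i, c) for i in picked[:k_shot]]
        query += [(i, c) for i in picked[k_shot:]]
    return support, query


# Toy usage (assumed setting): 10 classes with 20 images each.
labels = [c for c in range(10) for _ in range(20)]
support, query = sample_episode(labels, n_way=5, k_shot=1, n_query=15,
                                rng=random.Random(0))
```

During training, each episode's support set defines the classes and its query set supplies the loss, so a model embedded in this loop can be optimized end to end.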

Related Articles

The ULTIMATE Guide to AI Voice Cloning: RVC WebUI (Zero to Hero)

Dev.to

Kiwi-chan Devlog #007: The Audit Never Sleeps (and Neither Does My GPU)

Dev.to

Second-Order Injection: Attacking the Evaluator in LLM Safety Monitors

Dev.to

Note the new recommended sampling parameters for Qwen3.6 27B

Reddit r/LocalLLaMA

Qwen3.6 35B + the right coding scaffold got my local setup to 9/10 on real Go tasks

Reddit r/LocalLLaMA
