Fast Model-guided Instance-wise Adaptation Framework for Real-world Pansharpening with Fidelity Constraints

arXiv cs.CV / 4/13/2026


Key Points

  • FMG-Pan is proposed as a fast, generalizable model-guided instance-wise adaptation framework for real-world pansharpening, targeting the poor cross-distribution generalization seen in many DL-based approaches.
  • The method uses a pretrained model to guide a lightweight adaptive network, jointly optimizing with spectral and physical fidelity constraints to preserve both spectral and spatial information.
  • A novel physical fidelity term is introduced to improve spatial detail preservation, addressing limitations of prior zero-shot methods that often produce weaker fusion quality.
  • Experiments on real-world datasets (intra- and cross-sensor) report state-of-the-art performance, including a reported WorldView-3 runtime of about 3 seconds for combined training and inference on a 512x512x8 image on an RTX 3090.
  • By combining rapid convergence/inference with cross-sensor generality, FMG-Pan is positioned as more suitable for practical deployment than existing zero-shot pansharpening methods with higher overhead.

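The two constraints in the second bullet can be sketched as simple loss terms. The box downsampling and the uniform band weights below are illustrative assumptions for the sketch, not the paper's actual degradation or spectral-response models:

```python
import numpy as np

def spectral_fidelity(hrms, lrms, scale=4):
    """Spectral fidelity: the fused HRMS, downsampled back to the LRMS
    grid (simple box averaging here, as a stand-in for the sensor's
    degradation model), should match the observed LRMS."""
    H, W, C = hrms.shape
    down = hrms.reshape(H // scale, scale, W // scale, scale, C).mean(axis=(1, 3))
    return float(np.mean((down - lrms) ** 2))

def physical_fidelity(hrms, pan, weights=None):
    """Physical fidelity: a band-weighted average of the fused HRMS
    should reproduce the PAN image (uniform weights here, standing in
    for a real spectral-response function)."""
    C = hrms.shape[-1]
    w = np.full(C, 1.0 / C) if weights is None else weights
    synth_pan = hrms @ w
    return float(np.mean((synth_pan - pan) ** 2))
```

Jointly minimizing both terms pushes the fused image to stay consistent with the LRMS spectrally and with the PAN spatially, which is the trade-off the framework's constraints are meant to balance.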
Abstract

Pansharpening aims to generate high-resolution multispectral (HRMS) images by fusing low-resolution multispectral (LRMS) and high-resolution panchromatic (PAN) images while preserving both spectral and spatial information. Although deep learning (DL)-based pansharpening methods achieve impressive performance, they require high training cost and large datasets, and often degrade when the test distribution differs from training, limiting generalization. Recent zero-shot methods, trained on a single PAN/LRMS pair, offer strong generalization but suffer from limited fusion quality, high computational overhead, and slow convergence. To address these issues, we propose FMG-Pan, a fast and generalizable model-guided instance-wise adaptation framework for real-world pansharpening, achieving both cross-sensor generality and rapid training and inference. The framework leverages a pretrained model to guide a lightweight adaptive network through joint optimization with spectral and physical fidelity constraints. We further design a novel physical fidelity term to enhance spatial detail preservation. Extensive experiments on real-world datasets under both intra- and cross-sensor settings demonstrate state-of-the-art performance. On the WorldView-3 dataset, FMG-Pan completes training and inference for a 512x512x8 image within 3 seconds on an RTX 3090 GPU, significantly faster than existing zero-shot methods, making it suitable for practical deployment.
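The adaptation scheme the abstract describes (a pretrained model's output guiding a lightweight, per-image optimization under both fidelity constraints) might look roughly like the following sketch. The per-band-gain "network", the learning rate, and the loss weighting are all hypothetical stand-ins for the paper's actual lightweight network and training setup:

```python
import numpy as np

def adapt_instance(guide, pan, lrms, lam=1.0, lr=0.5, steps=300, scale=4):
    """Instance-wise adaptation sketch. `guide` is a pretrained model's
    HRMS estimate for this single image; we refine it by fitting a tiny
    'adaptive network' (here just one gain per spectral band, a stand-in
    for the paper's lightweight network) against two constraints:
      spectral: box-downsampled output should match the observed LRMS;
      physical: the band-averaged output should match the observed PAN."""
    H, W, C = guide.shape
    # Downsampling the guide once suffices: a per-band gain commutes
    # with per-band box averaging.
    guide_down = guide.reshape(H // scale, scale, W // scale, scale, C).mean((1, 3))
    g = np.ones(C)  # per-band gains -- the only parameters we adapt
    for _ in range(steps):
        r_spec = guide_down * g - lrms          # spectral-fidelity residual
        r_phys = (guide * g).mean(-1) - pan     # physical-fidelity residual
        # analytic gradient of the summed quadratic losses w.r.t. the gains
        grad = 2 * (r_spec * guide_down).mean((0, 1)) / C \
             + 2 * lam * (r_phys[..., None] * guide).mean((0, 1)) / C
        g -= lr * grad
    return guide * g  # adapted HRMS for this one instance
```

Because only a handful of parameters are optimized per image, this kind of instance-wise refinement converges in seconds rather than the minutes typical of zero-shot methods that train a full network from scratch on each pair, which is the efficiency argument the paper makes.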