Remote Sensing Image Super-Resolution for Imbalanced Textures: A Texture-Aware Diffusion Framework

arXiv cs.CV / 4/16/2026

Key Points

  • The paper identifies a key problem with applying diffusion-based super-resolution directly to remote sensing: textures are globally stochastic but locally clustered, creating highly imbalanced texture distributions that degrade spatial perception.
  • It introduces TexADiff, which first estimates a Relative Texture Density Map (RTDM) to explicitly represent where texture-rich regions are located in the image.
  • TexADiff uses the RTDM in three coordinated roles: spatial conditioning for the diffusion process, loss modulation to emphasize texture-dense areas, and an adapter that adjusts the sampling schedule dynamically.
  • Experiments show TexADiff achieves superior or competitive quantitative super-resolution metrics and produces more faithful high-frequency details while suppressing texture hallucinations.
  • The improved reconstruction quality translates into better performance on downstream remote-sensing tasks, and the authors provide code on GitHub.
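The summary does not spell out how the Relative Texture Density Map is computed or used in the loss, so here is a minimal illustrative sketch of one plausible interpretation: texture energy measured as local gradient magnitude, box-averaged and normalized by the global mean (so values above 1 mark texture-dense regions), then used as a per-pixel weight in a reconstruction loss. The function names and the `1 + lam * rtdm` weighting are assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def relative_texture_density_map(img, window=8, eps=1e-8):
    """Sketch of an RTDM estimator (hypothetical, not the paper's).

    Texture energy = per-pixel gradient magnitude; local density =
    mean energy over a window x window neighborhood; the map is then
    normalized by its global mean, so RTDM > 1 flags texture-dense
    (locally clustered) regions and RTDM < 1 flags smooth ones.
    """
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    gy, gx = np.gradient(img)                     # axis-0 and axis-1 gradients
    energy = np.sqrt(gx ** 2 + gy ** 2)
    # Box filter via sliding windows over an edge-padded image.
    padded = np.pad(energy, window // 2, mode="edge")
    wins = sliding_window_view(padded, (window, window))
    local = wins.mean(axis=(-2, -1))[:h, :w]      # crop back to input size
    return local / (local.mean() + eps)

def texture_weighted_l1(pred, target, rtdm, lam=1.0):
    """Loss modulation sketch: up-weight texture-dense pixels."""
    weight = 1.0 + lam * rtdm
    return float(np.mean(weight * np.abs(pred - target)))
```

A quick sanity check: on an image that is flat on the left and noisy on the right, the map should be well above 1 on the noisy side and near 0 on the flat side, so the weighted loss concentrates on the textured half.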

Abstract

Generative diffusion priors have recently achieved state-of-the-art performance in natural image super-resolution, demonstrating a powerful capability to synthesize photorealistic details. However, their direct application to remote sensing image super-resolution (RSISR) reveals significant shortcomings. Unlike natural images, remote sensing images exhibit a unique texture distribution where ground objects are globally stochastic yet locally clustered, leading to highly imbalanced textures. This imbalance severely hinders the model's spatial perception. To address this, we propose TexADiff, a novel framework that begins by estimating a Relative Texture Density Map (RTDM) to represent the texture distribution. TexADiff then leverages this RTDM in three synergistic ways: as an explicit spatial conditioning to guide the diffusion process, as a loss modulation term to prioritize texture-rich regions, and as a dynamic adapter for the sampling schedule. These modifications are designed to endow the model with explicit texture-aware capabilities. Experiments demonstrate that TexADiff achieves superior or competitive quantitative metrics. Furthermore, qualitative results show that our model generates faithful high-frequency details while effectively suppressing texture hallucinations. This improved reconstruction quality also results in significant gains in downstream task performance. The source code of our method can be found at https://github.com/ZezFuture/TexAdiff.
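The abstract's third use of the RTDM, a "dynamic adapter for the sampling schedule," is also left unspecified. One simple way such an adapter could work is to key the number of denoising steps to the image's overall texture density, spending more sampling compute where fine detail must be synthesized. The mapping below (clamped mean density, linearly interpolated step count, evenly spaced timesteps) is a guessed stand-in, not the mechanism TexADiff actually uses.

```python
import numpy as np

def adaptive_ddim_timesteps(rtdm, t_max=1000, min_steps=20, max_steps=100):
    """Hypothetical density-adaptive sampling schedule.

    Maps the mean RTDM (clamped to [0, 2]) to a step count between
    min_steps and max_steps, then returns evenly spaced integer
    timesteps from t_max - 1 down to 0 for a DDIM-style sampler.
    """
    density = float(np.clip(np.mean(rtdm), 0.0, 2.0)) / 2.0
    n_steps = int(round(min_steps + density * (max_steps - min_steps)))
    return np.linspace(t_max - 1, 0, n_steps).round().astype(int)
```

Under this sketch, a uniformly average-density image (mean RTDM of 1) would get the midpoint step count, while near-flat images would run close to `min_steps`.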