SRGAN-CKAN: Expressive Super-Resolution with Nonlinear Functional Operators under Minimal Resources

arXiv cs.CV / 5/5/2026


Key Points

  • The paper addresses Single-Image Super-Resolution (SISR), an ill-posed inverse problem where large upscaling factors cause high-frequency details to degrade severely.
  • It proposes SRGAN-CKAN, a hybrid framework that embeds Convolutional Kolmogorov–Arnold Networks (CKAN) into adversarial learning, reformulating convolution as nonlinear, patch-based transformations.
  • Instead of linear local mappings, the method uses spline-based functional representations to better model complex local structures and high-frequency textures with limited compute.
  • Experiments report improved perceptual quality while maintaining reconstruction fidelity, achieving a favorable trade-off between distortion-based and perceptual metrics under constrained computational settings.
  • The authors position the approach as a complementary direction to transformer- or diffusion-heavy methods by boosting the expressiveness of local operators rather than relying on globally intensive architectures.
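To make the core idea in the points above concrete, here is an illustrative sketch (not the authors' implementation) of a KAN-style "convolution": instead of multiplying each pixel in a patch by a scalar weight and summing, every kernel position applies its own learnable univariate nonlinear function and the responses are summed. Splines are approximated here by a linear combination of Gaussian radial-basis bumps on a fixed grid, a common surrogate; all names, shapes, and the RBF width are assumptions for illustration.

```python
import numpy as np

def kan_conv2d(img, coefs, grid, k=3):
    """Illustrative KAN-style 2-D 'convolution' (single channel, no padding).

    img   : (H, W) input image
    coefs : (k, k, K) per-kernel-position coefficients over K basis bumps,
            defining a learnable univariate function phi_pq at each position
    grid  : (K,) fixed centers of the Gaussian basis bumps
    """
    H, W = img.shape
    out = np.zeros((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = img[i:i + k, j:j + k]                              # (k, k)
            # Evaluate all basis bumps at every pixel of the patch.
            basis = np.exp(-((patch[..., None] - grid) ** 2) / 0.5)    # (k, k, K)
            # phi_pq(patch[p, q]) = sum_k coefs[p, q, k] * bump_k(patch[p, q]);
            # the output is the sum of these nonlinear responses over the patch,
            # replacing the linear weighted sum of an ordinary convolution.
            out[i, j] = np.einsum('pqk,pqk->', basis, coefs)
    return out

# Example usage with random parameters (training would fit `coefs`).
rng = np.random.default_rng(0)
img = rng.random((8, 8))
grid = np.linspace(0.0, 1.0, 5)
coefs = rng.normal(size=(3, 3, 5))
out = kan_conv2d(img, coefs, grid)
print(out.shape)  # (6, 6), like a 3x3 valid convolution
```

The point of the sketch is the contrast: a standard convolution is linear in each pixel, whereas here each kernel position contributes a flexible nonlinear function of its pixel, which is the extra local expressivity the paper attributes to CKAN.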

Abstract

Single-Image Super-Resolution (SISR) aims to reconstruct a High-Resolution (HR) image from a Low-Resolution (LR) observation, a fundamentally ill-posed problem where high-frequency details are severely degraded at large upscaling factors. Recent advances have been driven by transformer-based architectures and diffusion models, which improve global context modeling and perceptual quality at the cost of increased computational complexity. In contrast, this work focuses on enhancing the expressivity of local operators under minimal resources. We propose SRGAN-CKAN, a hybrid super-resolution framework that integrates Convolutional Kolmogorov–Arnold Networks (CKAN) into an adversarial learning setting, reformulating convolution as a nonlinear patch-based transformation. The proposed operator replaces linear local mappings with spline-based functional representations, allowing expressive modeling of complex local structures and high-frequency textures using minimal hardware resources. Experimental results demonstrate that the proposed approach improves perceptual quality while preserving reconstruction fidelity, achieving a favorable balance between distortion-based and perceptual metrics. These results are obtained under constrained computational settings, highlighting the efficiency of the proposed formulation. Overall, this work introduces a complementary direction to existing approaches by improving the representational power of local transformations, providing an efficient and scalable alternative to globally intensive architectures.
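The replacement of "linear local mappings with spline-based functional representations" can be summarized mathematically. A standard convolution computes, for each output location, a weighted linear sum over a patch, while a Kolmogorov–Arnold-style operator replaces each scalar weight with a learnable univariate function (typically a spline). A generic formulation of this contrast, with indices chosen here for illustration rather than taken from the paper, is:

```latex
% Standard convolution: linear in each patch entry x_{pq}
y_{\text{conv}} = \sum_{p,q} w_{pq}\, x_{pq} + b

% KAN-style convolution: a learnable univariate function per kernel position
y_{\text{KAN}} = \sum_{p,q} \phi_{pq}(x_{pq}),
\qquad
\phi_{pq}(x) = \sum_{k} c_{pq,k}\, B_k(x)
```

where $B_k$ are spline basis functions and $c_{pq,k}$ are learned coefficients, so each kernel position can respond nonlinearly to its input value at roughly the cost of a few extra coefficients per weight.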