Real Image Denoising with Knowledge Distillation for High-Performance Mobile NPUs

arXiv cs.CV / 5/6/2026


Key Points

  • The paper proposes an NPU-aware hardware–algorithm co-design method for real-world image denoising on mobile NPUs, addressing operator incompatibility and memory-access overhead.
  • It uses knowledge distillation from a high-capacity teacher to train a lightweight “student” model (LiteDenoiseNet) optimized for tiled-memory SoC architectures.
  • By restricting the network to NPU-native primitives (e.g., 3x3 convolutions, ReLU, nearest-neighbor upsampling) and applying progressive context expansion up to 1024x1024 crops, it achieves strong benchmark PSNR/SSIM scores at full resolution.
  • Runtime results under a standardized Full HD protocol show 34.0 ms on the MediaTek Dimensity 9500 and 46.1 ms on the Qualcomm Snapdragon 8 Elite, along with an “Inference Inversion” effect: strictly NPU-compatible design makes dedicated NPU execution up to 3.88× faster than the integrated mobile GPU.
  • The 1.96M-parameter student recovers 99.8% of the teacher’s quality via high-alpha knowledge distillation (alpha = 0.9), achieving a 21.2× parameter reduction while narrowing the PSNR gap to just 0.05 dB; the model and its training statistics are released via the NN Dataset repository.
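The high-alpha distillation objective described above can be sketched as a weighted blend of a teacher-mimicking term and a supervised term. This is a minimal NumPy illustration, not the paper's implementation: the function names (`distillation_loss`, `l1`) and the use of an L1 loss are assumptions for clarity.

```python
import numpy as np

def l1(a, b):
    """Mean absolute error between two image arrays."""
    return float(np.mean(np.abs(a - b)))

def distillation_loss(student_out, teacher_out, ground_truth, alpha=0.9):
    """Blend teacher-mimicking and supervised L1 terms.

    With alpha = 0.9 (the 'high-alpha' setting reported in the paper),
    the student is driven mostly toward the teacher's restored output
    and only lightly toward the clean ground truth.
    """
    return alpha * l1(student_out, teacher_out) + (1.0 - alpha) * l1(student_out, ground_truth)

# Toy example on a 4x4 single-channel "image" (values are illustrative)
rng = np.random.default_rng(0)
gt = rng.random((4, 4))        # clean target
teacher = gt + 0.01            # teacher output, nearly clean
student = gt + 0.05            # student output, slightly worse

loss = distillation_loss(student, teacher, gt)  # 0.9*0.04 + 0.1*0.05 = 0.041
```

A perfect teacher match with `alpha = 1.0` drives the loss to zero, which is why a high alpha transfers most of the teacher's behavior to the student.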

Abstract

While deep-learning-based image restoration has achieved unprecedented fidelity, deployment on mobile Neural Processing Units (NPUs) remains bottlenecked by operator incompatibility and memory-access overhead. We propose an NPU-aware hardware-algorithm co-design approach for real-world image denoising on mobile NPUs. Our approach employs a high-capacity teacher to supervise a lightweight student network specifically designed to leverage the tiled-memory architectures of modern mobile SoCs. By prioritizing NPU-native primitives -- standard 3x3 convolutions, ReLU activations, and nearest-neighbor upsampling -- and employing a progressive context expansion strategy (up to 1024x1024 crops), the model achieves 37.66 dB PSNR / 0.9278 SSIM on the validation benchmark and 37.58 dB PSNR / 0.9098 SSIM on the held-out test benchmark at full resolution (2432x3200) in the Mobile AI 2026 challenge. Following the official challenge rules, the inference runtime is measured under a standardized Full HD (1088x1920) protocol, where it runs in 34.0 ms on the MediaTek Dimensity 9500 and 46.1 ms on the Qualcomm Snapdragon 8 Elite NPU. We further reveal an "Inference Inversion" effect, where strict adherence to NPU-compatible operations enables dedicated NPU execution up to 3.88x faster than the integrated mobile GPU. The 1.96M-parameter student recovers 99.8% of the teacher's restoration quality via high-alpha knowledge distillation (alpha = 0.9), achieving a 21.2x parameter reduction while closing the PSNR gap from 1.63 dB to only 0.05 dB. These results establish hardware-aware distillation as an effective strategy for unifying high-fidelity denoising with practical deployment across diverse mobile NPU architectures. The proposed lightweight student model (LiteDenoiseNet) and its training statistics are provided in the NN Dataset, available at https://github.com/ABrain-One/NN-Dataset.
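The NPU-native primitives the abstract restricts itself to (standard 3x3 convolutions, ReLU, nearest-neighbor upsampling) can be illustrated with a toy NumPy sketch. This is not the paper's code: the function names, the 2x upsampling factor, and the naive single-channel convolution loop are illustrative assumptions.

```python
import numpy as np

def nearest_upsample(x, scale=2):
    """Nearest-neighbor upsampling: each pixel is repeated `scale` times
    along both spatial axes -- a single, memory-friendly op on mobile NPUs."""
    return x.repeat(scale, axis=0).repeat(scale, axis=1)

def conv3x3_relu(x, kernel):
    """A standard 3x3 'same' convolution (zero padding, cross-correlation)
    followed by ReLU, written naively for clarity; NPUs fuse this pattern."""
    h, w = x.shape
    padded = np.pad(x, 1)
    out = np.zeros((h, w), dtype=float)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * kernel)
    return np.maximum(out, 0.0)  # ReLU

# Toy usage: upsample a 2x2 feature map, then apply an identity kernel
x = np.array([[1.0, 2.0],
              [3.0, 4.0]])
up = nearest_upsample(x)               # 4x4; each value fills a 2x2 block
identity = np.zeros((3, 3))
identity[1, 1] = 1.0
y = conv3x3_relu(up, identity)         # identity kernel leaves values unchanged
```

Restricting the network to such primitives is what the paper credits for the "Inference Inversion" effect: every layer maps directly onto operations the NPU executes natively, avoiding fallbacks to the GPU or CPU.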