OFA-Diffusion Compression: Compressing Diffusion Model in One-Shot Manner

arXiv cs.CV / 4/15/2026


Key Points

  • The paper proposes OFA-Diffusion Compression, a once-for-all framework that compresses diffusion probabilistic models into multiple subnetworks with different compute budgets from a single one-shot training process.
  • It addresses deployment across heterogeneous devices by avoiding repeated compression training runs that would otherwise be required for different resource constraints.
  • To reduce optimization slowdown caused by an excessively large candidate subnetwork space, the method restricts subnetworks to a predefined set of parameter sizes and allocates channels gradually based on their importance.
  • A reweighting strategy is introduced to balance training/optimization across the different subnetworks so they all achieve satisfactory performance.
  • Experiments indicate the approach achieves lower training overhead while producing compressed diffusion models for multiple target sizes.
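The importance-based channel allocation described above can be sketched in a few lines: rank each layer's output channels by a simple importance score and build nested keep-masks for a predefined set of target widths, so smaller subnetworks reuse the channels of larger ones. The L1-norm importance metric and the `nested_channel_masks` helper below are illustrative assumptions; this summary does not specify the paper's exact scoring rule.

```python
import numpy as np

def nested_channel_masks(weight, keep_ratios):
    """Rank output channels by L1-norm importance and build nested
    boolean keep-masks for a set of target widths.

    Hypothetical helper: the importance metric (L1 norm) is an
    assumption, not necessarily the paper's exact criterion.
    weight: conv kernel of shape (out_channels, in_channels, kH, kW).
    """
    importance = np.abs(weight).reshape(weight.shape[0], -1).sum(axis=1)
    order = np.argsort(-importance)  # most important channels first
    masks = {}
    for r in sorted(keep_ratios):
        k = max(1, int(round(r * weight.shape[0])))
        mask = np.zeros(weight.shape[0], dtype=bool)
        mask[order[:k]] = True  # prefix of the ranking => nested subnets
        masks[r] = mask
    return masks

rng = np.random.default_rng(0)
w = rng.normal(size=(16, 8, 3, 3))
masks = nested_channel_masks(w, [0.25, 0.5, 1.0])
# every channel kept by the 25% subnetwork is also kept by the 50% one
assert np.all(masks[0.5][masks[0.25]])
```

Because each target size keeps a prefix of one shared importance ranking, the candidate space collapses to one subnetwork per size, which is what lets a single training run cover all deployment budgets.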

Abstract

The Diffusion Probabilistic Model (DPM) achieves remarkable performance in image generation, but its growing parameter count and computational overhead hinder deployment in practical applications. To mitigate this, the existing literature focuses on obtaining a smaller model with a fixed architecture through model compression. In practice, however, DPMs usually need to be deployed on various devices with different resource constraints, which requires multiple compression processes and incurs significant overhead from repeated training. To obviate this, we propose a once-for-all (OFA) compression framework for DPMs that yields subnetworks with varying computational costs from a single one-shot training run. Existing OFA frameworks typically involve a massive number of subnetworks with different parameter sizes, and such a huge candidate space slows optimization. We therefore propose to restrict the candidate subnetworks to a fixed set of parameter sizes, where each size corresponds to a specific subnetwork. Specifically, to construct each subnetwork at a given size, we gradually allocate the retained channels by their importance. Furthermore, we propose a reweighting strategy to balance the optimization process across the different subnetworks. Experimental results show that our approach can produce compressed DPMs at various sizes with significantly lower training overhead while achieving satisfactory performance.
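The reweighting strategy mentioned in the abstract can be illustrated with a generic scheme: subnetworks whose recent loss is higher receive a larger share of the training signal, so no single size dominates optimization. The softmax-over-normalized-losses rule and the `subnet_weights` function below are assumptions for illustration; the paper's exact reweighting formula is not given in this summary.

```python
import numpy as np

def subnet_weights(recent_losses, temperature=1.0):
    """Illustrative reweighting: assign each subnetwork a loss weight
    via a softmax over its recent loss, normalized by the mean, so
    lagging subnetworks get more gradient signal.

    Generic balancing scheme, not necessarily the paper's exact rule.
    """
    losses = np.asarray(recent_losses, dtype=float)
    scaled = losses / (losses.mean() + 1e-8)  # scale-invariant
    w = np.exp(scaled / temperature)
    return w / w.sum()

# three subnetworks: the smallest lags behind, so it is upweighted
w = subnet_weights([0.9, 0.5, 0.4])
assert w[0] > w[1] > w[2]
```

The per-subnetwork weights would then multiply each subnetwork's diffusion loss before summing into the shared training objective, keeping all target sizes progressing at comparable rates.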