DMGD: Train-Free Dataset Distillation with Semantic-Distribution Matching in Diffusion Models
arXiv cs.CV / 5/6/2026
Key Points
- The paper introduces DMGD (Dual Matching Guided Diffusion), a diffusion-based dataset distillation framework designed to provide effective guidance without additional training or fine-tuning stages.
- It performs Semantic-Distribution Matching using conditional likelihood optimization to achieve semantic alignment without relying on auxiliary classifiers.
- A dynamic guidance mechanism is proposed to increase the diversity of synthetic datasets while preserving semantic consistency with the target data.
- The method also uses optimal transport (OT) to better match the structure of the target distribution, supported by efficient approximations (Distribution Approximate Matching) and staged computation (Greedy Progressive Matching).
- Experiments on ImageNet-Woof, ImageNet-Nette, and ImageNet-1K show that the training-free approach outperforms state-of-the-art methods that require extra fine-tuning, with average accuracy gains of 2.1%, 5.4%, and 2.4% on the three benchmarks, respectively.
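The optimal-transport matching mentioned above can be made concrete with a small entropy-regularized OT (Sinkhorn) sketch that aligns synthetic features to target features. This is a generic illustration of OT matching, not the paper's actual DMGD implementation; the feature arrays, dimensions, and regularization strength here are illustrative assumptions.

```python
import numpy as np

def sinkhorn(a, b, cost, eps=0.1, n_iters=200):
    """Entropy-regularized optimal transport between weight vectors a and b.

    a: (m,) source weights, b: (n,) target weights, cost: (m, n) cost matrix.
    Returns the (m, n) transport plan.
    """
    K = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):         # alternating scaling updates
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

# Toy example: match 4 synthetic feature vectors to 5 target feature vectors.
rng = np.random.default_rng(0)
syn = rng.normal(size=(4, 2))        # hypothetical synthetic-set features
tgt = rng.normal(size=(5, 2))        # hypothetical target-set features
cost = ((syn[:, None, :] - tgt[None, :, :]) ** 2).sum(-1)  # squared Euclidean
a = np.full(4, 1 / 4)                # uniform weights over synthetic samples
b = np.full(5, 1 / 5)                # uniform weights over target samples

plan = sinkhorn(a, b, cost)
ot_cost = (plan * cost).sum()        # matching objective to minimize
```

In a guided-diffusion setting, the gradient of such an OT cost with respect to the synthetic features could serve as the distribution-matching guidance signal; the paper's approximations (Distribution Approximate Matching, Greedy Progressive Matching) address the expense of computing this at scale.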