A Tale of Two Temperatures: Simple, Efficient, and Diverse Sampling from Diffusion Language Models

arXiv cs.LG / 4/14/2026


Key Points

  • Diffusion LLM researchers propose increasing sample diversity by applying “softened/tempered” versions of existing confidence-based remasking heuristics rather than only optimizing the speed–quality tradeoff.
  • The work provides an idealized formal model of “fork tokens” to analyze how remasking affects expected entropy at decision points where sampling branches.
  • Experiments show the tempered heuristics close the exploration gap (pass@k) between existing confidence-based and autoregressive sampling, and outperform both when controlling for compute cost (pass@NFE).
  • The paper studies how the diversity gains translate into improved behavior during downstream post-training and test-time compute scaling, supporting the claim that efficient and diverse sampling is feasible.
  • Overall, the method is designed to be simple to implement while retaining computational efficiency, aiming to make diffusion language model sampling more robust for varied outputs.
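To make the idea concrete, here is a minimal sketch of what a "tempered" confidence-based remasking step could look like. This is an illustrative assumption, not the paper's exact algorithm: the function name `tempered_select`, the use of a softmax over per-position confidences, and the specific temperature parameterization are all hypothetical. The key contrast it illustrates: the standard heuristic greedily unmasks the top-confidence positions, while the tempered version samples positions stochastically, so repeated generations can branch differently.

```python
import numpy as np

def tempered_select(confidences, k, tau, rng):
    """Pick k masked positions to unmask (illustrative sketch).

    confidences: per-position model confidence for each masked token.
    tau: temperature. Small tau approaches the greedy top-k
         confidence heuristic; larger tau spreads probability mass
         across positions, increasing diversity across samples.
    """
    logits = np.asarray(confidences, dtype=float) / tau
    logits -= logits.max()            # numerical stability before exp
    probs = np.exp(logits)
    probs /= probs.sum()
    # Sample k distinct positions without replacement, weighted by
    # the tempered confidence distribution.
    return rng.choice(len(probs), size=k, replace=False, p=probs)

rng = np.random.default_rng(0)
# Four masked positions; position 1 has the highest confidence but is
# not guaranteed to be unmasked first at moderate temperature.
picked = tempered_select([0.1, 0.9, 0.5, 0.8], k=2, tau=0.5, rng=rng)
```

Because the selection remains a single cheap operation over the model's existing confidence scores, this kind of tempering preserves the computational profile of the greedy heuristic, consistent with the "simple to implement, computationally efficient" framing above.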

Abstract

Much work has been done on designing fast and accurate sampling for diffusion language models (dLLMs). However, these efforts have largely focused on the tradeoff between speed and quality of individual samples; how to additionally ensure diversity across samples remains less well understood. In this work, we show that diversity can be increased by using softened, tempered versions of familiar confidence-based remasking heuristics, retaining their computational benefits and offering simple implementations. We motivate this approach by introducing an idealized formal model of fork tokens and studying the impact of remasking on the expected entropy at the forks. Empirically, the proposed tempered heuristics close the exploration gap (pass@k) between existing confidence-based and autoregressive sampling, hence outperforming both when controlling for cost (pass@NFE). We further study how the increase in diversity translates to downstream post-training and test-time compute scaling. Overall, our findings demonstrate that simple, efficient, and diverse sampling from dLLMs is possible.
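For readers unfamiliar with the pass@k metric used above: it estimates the probability that at least one of k samples solves a task. A standard unbiased estimator (from the Codex evaluation literature) computes it from n total generations of which c are correct; pass@NFE applies the same idea while matching the number of function evaluations rather than the number of samples. The snippet below is that standard estimator, not code from this paper.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator.

    n: total samples generated per task.
    c: number of those samples that are correct.
    k: budget of samples "drawn" for the metric.
    Returns P(at least one of k drawn samples is correct)
    = 1 - C(n - c, k) / C(n, k).
    """
    if n - c < k:
        # Fewer incorrect samples than the draw size: a correct
        # sample is guaranteed to appear among any k draws.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with n = 4 generations of which c = 2 are correct, pass@2 is 1 - C(2,2)/C(4,2) = 5/6: the only failing draw is the single pair consisting of both incorrect samples. Higher diversity across samples raises c on hard tasks, which is why the paper's tempered sampling improves pass@k at a fixed budget.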