Building my own Diffusion Language Model from scratch was easier than I thought [P]

Reddit r/MachineLearning / 4/22/2026


Key Points

  • The author set out to implement a diffusion language model from scratch without relying on AI-generated code, partly due to frequent use of Claude Code recently.
  • They trained a small model for a few hours on a MacBook Air M2 using the tiny Shakespeare dataset, prompting it with “to be,” and obtained partial, imperfect text generation.
  • The resulting model has roughly 7.5M parameters and a vocabulary size of 66, though the author notes they likely did not train long enough to reach high-quality outputs.
  • The post frames the project as a way to demystify core NLP terms and concepts such as tokenizers, encoders/decoders, and discrete diffusion, and encourages others to try similar builds.
  • The author provides the implementation code publicly via a GitHub repository for readers who want to replicate or learn from it.

Since I felt like I'd been relying on Claude Code a lot recently, I wanted to see how hard it is to implement a diffusion language model from scratch without the help of AI-generated code. So I built one while waiting on training runs for my master's thesis.

This is what I got after a few hours of training on my MacBook Air M2. I trained on the tiny Shakespeare dataset from Karpathy and prompted it with "to be, ":

To be, fo hend! First her sense ountier to Jupits, be horse. 

Words of wisdom! The model has around 7.5M parameters and a vocabulary size of 66 (65 characters + [MASK]). I definitely did not train long enough, but I ran out of time for this one.
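
For anyone curious what a "65 characters + [MASK]" vocabulary looks like in practice, here's a minimal sketch of a character-level tokenizer along those lines. The class name and layout are illustrative, not taken from the author's repo; the only idea borrowed from the post is appending a single [MASK] id after the corpus characters.

```python
class CharTokenizer:
    """Toy character-level tokenizer with one extra [MASK] token."""

    def __init__(self, text):
        chars = sorted(set(text))            # unique characters in the corpus
        self.stoi = {c: i for i, c in enumerate(chars)}
        self.itos = {i: c for c, i in self.stoi.items()}
        self.mask_id = len(chars)            # [MASK] gets the last id
        self.vocab_size = len(chars) + 1     # e.g. 65 chars + [MASK] -> 66

    def encode(self, s):
        return [self.stoi[c] for c in s]

    def decode(self, ids):
        # Any id without a character (i.e. mask_id) renders as [MASK]
        return "".join(self.itos.get(i, "[MASK]") for i in ids)


tok = CharTokenizer("to be, or not to be")
print(tok.vocab_size)                        # → 9 (8 unique chars + [MASK])
print(tok.decode(tok.encode("to be")))       # → to be
print(tok.decode([tok.mask_id, tok.stoi["o"]]))  # → [MASK]o
```

On tiny Shakespeare the same construction yields 65 unique characters, hence the vocabulary size of 66 mentioned above.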

Projects like these help me make sense of big scary words like (discrete) diffusion, encoder, decoder, and tokenizer. Maybe this encourages someone :)
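
To make "discrete diffusion" slightly less scary: in the common masking variant, the forward process replaces each token with [MASK] independently with probability t, and the model learns to predict the original tokens at the masked positions. Here's a tiny illustrative sketch of that corruption step (names like `corrupt` and the `MASK_ID` value are assumptions for illustration, not the author's code):

```python
import random

MASK_ID = 65  # hypothetical id of [MASK] in a 66-token vocabulary


def corrupt(tokens, t, rng=random):
    """Mask each token with probability t; return (noisy input, targets).

    Targets keep the original token at masked positions and -100 elsewhere,
    a common convention for positions ignored by the training loss.
    """
    noisy, targets = [], []
    for tok in tokens:
        if rng.random() < t:
            noisy.append(MASK_ID)   # token is hidden from the model
            targets.append(tok)     # ...and becomes a prediction target
        else:
            noisy.append(tok)       # token stays visible
            targets.append(-100)    # no loss at unmasked positions
    return noisy, targets


x = [12, 7, 33, 1, 5]
# At t=1.0 everything is masked — this is also the starting point for
# generation, which iteratively unmasks tokens from an all-[MASK] sequence.
fully_masked, targets = corrupt(x, t=1.0)
print(fully_masked)  # → [65, 65, 65, 65, 65]
print(targets)       # → [12, 7, 33, 1, 5]
```

Training then amounts to sampling a random t per batch, corrupting the sequence, and minimizing cross-entropy on the masked positions only.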

Check out the code here if you're interested: https://github.com/Encrux/simple_dlm

Thanks for reading! Be horse.

submitted by /u/Encrux615