I remember that about a year ago, when DeepSeek R1 came out, it was quickly distilled into Llama 3 8B and Qwen 2.5 (?) 7B. Why don’t we see more distilled models? How expensive is distillation, and how many tokens or prompts does it take?
Reddit r/LocalLLaMA / 5/9/2026