How difficult is distilling?

Reddit r/LocalLLaMA / 5/9/2026

💬 Opinion · Signals & Early Trends · Ideas & Deep Analysis

Key Points

  • The post asks why more distilled models are not widely seen, given that DeepSeek R1 was quickly distilled into smaller models such as Llama 3 8B and Qwen 2.5 7B.
  • It probes how difficult distillation is in practice, i.e. the effort required to produce a smaller student model from a larger teacher (see the sketch after this list).
  • It asks about the cost of distillation, focusing on compute expenses and overall feasibility for practitioners.
  • It also seeks quantitative guidance on resource requirements, such as how many tokens or prompts are needed to achieve useful distillation results.
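
For context on what "distillation" means mechanically: the classic recipe (Hinton et al., 2015) trains the student to match the teacher's temperature-softened output distribution, while the DeepSeek R1 distills used the simpler sequence-level variant, plain supervised fine-tuning on roughly 800k teacher-generated samples per the R1 report. Below is a minimal sketch of the classic logit-matching loss, assuming PyTorch; the tensors are random stand-ins for real model logits, and names like `distillation_loss` are illustrative, not taken from any of the models mentioned in the post.

```python
# Minimal sketch of the classic logit-distillation loss (Hinton et al., 2015).
# Assumes PyTorch; teacher/student logits here are random stand-ins.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Temperature-scaled KL divergence between teacher and student distributions."""
    # Soften both distributions with the temperature before comparing them.
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature**2

# Stand-in logits: a batch of 4 positions over a 32k-token vocabulary.
student_logits = torch.randn(4, 32000, requires_grad=True)
teacher_logits = torch.randn(4, 32000)
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
print(loss.item())
```

Sequence-level distillation drops the KL term entirely: you generate completions with the teacher and fine-tune the student on them with ordinary cross-entropy, which is why the dominant cost tends to be teacher inference rather than any exotic training code.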

I remember a year or so ago when DeepSeek R1 came out and it was pretty quickly distilled into Llama 3 8b and Qwen 2.5 (?) 7b. Why don’t we see more distilled models? How expensive is it? How many tokens or prompts does it take?

submitted by /u/GreedyWorking1499