Devstral-Small-2-24B fine-tuned on Claude 4.6 Opus reasoning traces [GGUF Q4+Q5]

Reddit r/LocalLLaMA / 3/24/2026

💬 Opinion · Signals & Early Trends · Tools & Practical Usage · Models & Research

Key Points

  • A community member reports fine-tuning Devstral-Small-2-24B on 2,322 Claude 4.6 Opus reasoning traces to make the model output explicit chain-of-thought before writing code.
  • The model was released on Hugging Face as GGUF quantized weights (Q4_K_M and Q5_K_M, with Q5_K_M recommended) plus an associated LoRA adapter for self-merging.
  • Training used Unsloth with QLoRA (r=16) on an RTX 3090 24GB, with the end-of-epoch-2 checkpoint (~1200 steps) performing better than completing epoch 3.
  • A key technical hurdle was that Devstral-Small-2-24B is a VLM (Pixtral vision encoder), requiring extraction of its Ministral3 language layers into a standalone text-only model before training text-only behaviors.
  • The training dataset consisted of filtered Opus reasoning traces (limited to <20k characters each) from nohurry/Opus-4.6-Reasoning-3000x-filtered.

I fine-tuned Devstral-Small-2-24B on 2,322 Claude 4.6 Opus <think>...</think>
reasoning traces to give it explicit chain-of-thought before writing code.
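A minimal sketch of what one such training sample might look like, assuming the common convention of a `<think>...</think>` block preceding the final answer; the exact template and field layout are assumptions, not confirmed by the post:

```python
# Hypothetical training-sample formatter: the model learns to emit its
# reasoning inside <think>...</think> before the code answer. The template
# shape here is illustrative only.
def format_sample(prompt: str, reasoning: str, answer: str) -> str:
    return f"{prompt}\n<think>\n{reasoning}\n</think>\n{answer}"

sample = format_sample(
    "Write a function that reverses a string.",
    "The simplest approach is slice notation with a step of -1.",
    "def reverse(s):\n    return s[::-1]",
)
assert sample.index("<think>") < sample.index("</think>")
```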

**Model:** https://huggingface.co/adamjen/Devstral-Small-2-24B-Opus-Reasoning

**Files available:**
- Q4_K_M GGUF (14.3GB)
- Q5_K_M GGUF (16.8GB) ← recommended
- LoRA adapter (370MB) for merging yourself

**Hardware used:** RTX 3090 24GB
**Framework:** Unsloth + QLoRA (r=16)
**Checkpoint:** End of epoch 2 (~1200 steps) — better generalisation than full epoch 3
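The setup above might look roughly like the following Unsloth QLoRA configuration. Only r=16, the 24GB card, and 4-bit training come from the post; the model path, alpha, sequence length, and target modules are typical-default assumptions:

```python
# Sketch of an Unsloth QLoRA setup matching the post's description.
# Everything except r=16 and load_in_4bit is an assumed default.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "path/to/extracted-text-only-devstral",  # the de-VLM'd checkpoint (see below)
    max_seq_length=8192,
    load_in_4bit=True,  # QLoRA: frozen 4-bit base weights + trainable LoRA
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,  # LoRA rank reported in the post
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```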

The main challenge was that Devstral is a VLM (Pixtral vision encoder), which
made direct text-only training on 24GB impossible. I had to extract the Ministral3
language layers into a standalone text-only model first. Full write-up coming on
my blog.
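The core of that extraction step can be sketched as filtering a multimodal checkpoint's state dict by key prefix, assuming the language-model tensors live under something like a `language_model.` prefix next to the vision-tower weights; the key names here are illustrative, not the actual Devstral layout:

```python
# Toy sketch of "extract the language layers from the VLM": keep only
# language-model tensors and strip the prefix so the result loads into a
# standalone text-only model. Prefix and key names are assumptions.
def extract_language_weights(vlm_state_dict: dict,
                             prefix: str = "language_model.") -> dict:
    return {
        key[len(prefix):]: tensor
        for key, tensor in vlm_state_dict.items()
        if key.startswith(prefix)
    }

# Toy checkpoint mixing vision and language keys:
ckpt = {
    "vision_tower.patch_embed.weight": "v0",
    "language_model.model.layers.0.self_attn.q_proj.weight": "t0",
    "language_model.lm_head.weight": "t1",
}
text_only = extract_language_weights(ckpt)
assert "vision_tower.patch_embed.weight" not in text_only
assert text_only["lm_head.weight"] == "t1"
```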

Happy to answer questions about the training process.

Training data: nohurry/Opus-4.6-Reasoning-3000x-filtered — 2,322 samples of Claude 4.6 Opus reasoning traces,
filtered to <20k chars.
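The length filter described above is straightforward; a minimal sketch, assuming each sample stores its trace under a `"text"` field (the actual dataset schema may differ):

```python
# Keep only samples whose reasoning trace is under 20,000 characters.
# The field name "text" is an assumption about the dataset schema.
MAX_CHARS = 20_000

def under_limit(sample: dict) -> bool:
    return len(sample["text"]) < MAX_CHARS

samples = [
    {"text": "short trace"},
    {"text": "x" * 25_000},  # dropped: too long
]
filtered = [s for s in samples if under_limit(s)]
assert len(filtered) == 1
```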

submitted by /u/admajic