Harmonic-9B - Two-stage Qwen3.5-9B fine-tune (Stage 2 still training)

Reddit r/LocalLLaMA / 4/5/2026


Key Points

  • The author uploaded “Harmonic-9B,” a two-stage fine-tune of Qwen3.5-9B designed specifically for agent use, with Stage 1 complete and Stage 2 still actively training.
  • Stage 2 focuses on improving tool-calling behavior by combining structured reasoning with reliable agent actions, while aiming to keep normal chat from feeling stiff or overly verbose.
  • The Stage 2 training dataset is an open-sourced, filtered version of Hermes agent traces, with reported gains including self-correction (6%→63%), verification (26%→96%), and improved thinking depth (+40%).
  • The author reports that the filtered Stage 2 data yields 100% valid JSON/tool calls; GGUF quant downloads are already available, but formal benchmarks are on hold until Stage 2 finishes.
  • Feedback is requested on how Harmonic-9B behaves in agent harnesses such as OpenClaw, LangGraph, and ReAct, with benchmark numbers to follow once real agent evaluations are run.

Hey r/LocalLLaMA,

I just uploaded Harmonic-9B, my latest Qwen3.5-9B fine-tune aimed at agent use.

Current status:

• Stage 1 (heavy reasoning training) is complete

• Stage 2 (light tool-calling / agent fine-tune) is still training right now

The plan is to combine strong structured reasoning with clean, reliable tool use while trying to avoid making normal chat feel stiff or overly verbose.

Filtered dataset for Stage 2: I open-sourced the filtered version of the Hermes agent traces I’m using for the second stage:

https://huggingface.co/datasets/DJLougen/hermes-agent-traces-filtered
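If you want to poke at the traces before training on them, something like this works (the repo id is from the link above; the split name and column layout are whatever the dataset card says, so adjust as needed):

```python
# Minimal sketch: pull the filtered traces for inspection with 🤗 datasets.
# The repo id comes from the link above; split name and columns are assumptions.
from datasets import load_dataset

ds = load_dataset("DJLougen/hermes-agent-traces-filtered", split="train")
print(ds)      # row count and column names
print(ds[0])   # peek at a single trace before using it
```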

Key improvements after filtering:

• Self-correction: 6% → 63%

• Verification steps: 26% → 96%

• Thinking depth: +40%

• Valid JSON/tool calls: 100% (see the validity-check sketch below)
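As a rough illustration of that last number (not the exact filtering script), a validity check of this kind can be done in a few lines, assuming Hermes-style traces where tool calls are JSON objects wrapped in `<tool_call>...</tool_call>` tags; the tag format, field names, and trace schema are assumptions to adapt to your data:

```python
# Sketch of a JSON/tool-call validity filter — not the exact pipeline used here.
# Assumes Hermes-style <tool_call>...</tool_call> blocks containing JSON objects
# with "name" and "arguments" keys; adjust to the real trace format.
import json
import re

TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)

def tool_calls_are_valid(assistant_text: str) -> bool:
    """True only if every tool call in the text parses as JSON with the expected keys."""
    for block in TOOL_CALL_RE.findall(assistant_text):
        try:
            call = json.loads(block)
        except json.JSONDecodeError:
            return False
        if not isinstance(call, dict) or "name" not in call or "arguments" not in call:
            return False
    return True

def keep_trace(turns: list[dict]) -> bool:
    """Drop a trace if any assistant turn contains a malformed tool call."""
    return all(
        tool_calls_are_valid(turn.get("content") or "")
        for turn in turns
        if turn.get("role") == "assistant"
    )
```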

GGUF quants are already available here:

https://huggingface.co/DJLougen/Harmonic-9B-GGUF
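Quickest way to try a quant locally is llama-cpp-python; the filename glob and context size below are placeholders, so check the repo for the actual quant names:

```python
# Minimal sketch: load one of the GGUF quants with llama-cpp-python
# (pip install llama-cpp-python huggingface-hub). Filename and n_ctx are assumptions.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="DJLougen/Harmonic-9B-GGUF",
    filename="*Q4_K_M.gguf",   # assumed quant name; list the repo files to confirm
    n_ctx=8192,
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Plan the steps to rename a git branch."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```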

I haven’t run proper benchmarks yet because Stage 2 is still training. Early checks on the Stage 1 checkpoint looked good for reasoning structure. I’ll share numbers once Stage 2 finishes and I can run real agent evals.

If you give it a spin, I’d appreciate any feedback — especially how it behaves in agent harnesses (OpenClaw, LangGraph, ReAct, etc.).
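For a quick tool-calling smoke test before wiring it into a full harness, you can hit any OpenAI-compatible local server (llama.cpp's llama-server, vLLM, etc.) with a dummy tool schema. The base_url, served model name, and whether your server passes the `tools` field through are all placeholders for your own setup:

```python
# Rough smoke test for tool calling against an OpenAI-compatible local server.
# base_url, model name, and native `tools` support depend on how you serve it.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="harmonic-9b",   # placeholder served-model name
    messages=[{"role": "user", "content": "What's the weather in Oslo?"}],
    tools=tools,
)

msg = resp.choices[0].message
# A well-behaved agent model should answer with a structured tool call here.
print(msg.tool_calls or msg.content)
```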

This is part of my ongoing work on high-signal data curation and staged fine-tuning. More updates coming soon.

submitted by /u/Crampappydime