AI Navigate

Qwen3.5 Best Parameters Collection

Reddit r/LocalLLaMA / 3/20/2026

💬 Opinion · Tools & Practical Usage · Models & Research

Key Points

  • A Reddit discussion seeks stable parameters for Qwen3.5, including quants, inference engines, and practical configurations.
  • The post shares a concrete parameter set for Qwen3.5-35B (A3B) based on Unsloth's recommendations, including temp, top-p, top-k, min-p, presence-penalty, repeat-penalty, and a reasoning-budget with a custom message.
  • It lists the use case (non-coding, general chat), links the quant used, and names the inference engine, llama.cpp v8400.
  • The author reports that the model still thinks too much and is reluctant to use it unless a task requires heavy reasoning.
  • The thread invites others to propose better parameter settings and references the original discussion link.

Qwen3.5 has been out for a few weeks now. I hope the dust has settled a bit and we have stable quants, inference engines, and parameters by now?

Please share what parameters you are using, for what use case, and how well it's working for you (along with the quant and inference engine). This seems to be the best way to discover the best setup.

Here's mine, based on Unsloth's recommendations here and previous threads on this sub.

For A3B-35B:

 --temp 0.7 --top-p 0.8 --top-k 20 --min-p 0.00 --presence-penalty 1.5 --repeat-penalty 1.0 --reasoning-budget 1000 --reasoning-budget-message "... reasoning budget exceeded, need to answer.\n" 
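The `--reasoning-budget` / `--reasoning-budget-message` pair caps how many tokens the model may spend inside its thinking block; once the budget is spent, the custom message is injected and the block is closed so the model is pushed toward answering. A minimal sketch of that mechanism over a simulated token stream (hypothetical function name; `<think>`/`</think>` markers assumed per Qwen's chat format; this is not llama.cpp's actual implementation):

```python
def apply_reasoning_budget(tokens, budget, budget_message):
    """Cap thinking tokens at `budget`; inject `budget_message` and close
    the think block once the budget is exhausted (sketch, not real llama.cpp)."""
    out = []
    in_think = False   # currently inside a <think>...</think> block
    truncated = False  # budget already exhausted for this block
    spent = 0          # thinking tokens emitted so far
    for tok in tokens:
        if tok == "<think>":
            in_think = True
            out.append(tok)
        elif tok == "</think>":
            in_think = False
            if not truncated:
                out.append(tok)  # normal close; truncated blocks were closed already
            truncated = False
        elif in_think:
            if truncated:
                continue  # drop the rest of the over-budget thinking tokens
            if spent >= budget:
                # Budget exceeded: inject the custom message and force-close the block.
                out.extend([budget_message, "</think>"])
                truncated = True
            else:
                spent += 1
                out.append(tok)
        else:
            out.append(tok)
    return out
```

With a budget of 2, a stream like `["<think>", "a", "b", "c", "d", "</think>", "answer"]` comes out as `["<think>", "a", "b", budget_message, "</think>", "answer"]` — which also shows why a tiny budget can backfire: the model is cut off mid-thought rather than persuaded to think less.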

Performance: it still thinks too much, to the point that I find myself shying away from it unless I specifically have a task that requires a lot of thinking.

I'm hoping someone has a better parameter set that solves this problem.

submitted by /u/rm-rf-rm