PSA: Having issues with Qwen3.5 overthinking? Give it a tool, and it can help dramatically.

Reddit r/LocalLLaMA / 4/14/2026

💬 Opinion · Developer Stack & Infrastructure · Tools & Practical Usage

Key Points

  • The post addresses a common issue with Qwen 3.5 “overthinking” and proposes practical configuration changes to reduce it.
  • It recommends verifying sampling parameters, in particular setting presence_penalty to roughly 1.0–1.5 (with some experimentation).
  • The key workaround is enabling tools/function calling: when tools are available, Qwen 3.5 shifts from a long “reasoning trace” to a shorter, more natural response style.
  • The author reports testing via llama-server in Open-WebUI (ensuring “native” function calling is enabled) and notes that other tool-enabled harnesses should already avoid the problem.
  • The TL;DR is to enable tools (even if not actually used) and follow the recommended sampling guidance to mitigate overthinking.

I'm sure everyone has seen the posts from people talking about Qwen 3.5 over-thinking, or maybe you've experienced it yourself. Considering we're like 2 months out from the release and I still see people talk about this issue, I decided it might be a good idea to put this thread out there.

First, the obvious - make sure your sampling parameters are set correctly. This is the first part of the "fix" and relates to the presence_penalty value. Set this to 1.0-1.5. Experiment a little if you're willing. This is something most of you here likely already know, too. So let's get to the "real" fix.
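For anyone setting this via the API rather than a UI, here's a minimal sketch of what that looks like as an OpenAI-compatible request payload for llama-server. The model name, prompt, and temperature are placeholders for your own setup; only the presence_penalty range comes from the advice above.

```python
import json

# Hypothetical chat-completions payload for a local llama-server
# (OpenAI-compatible endpoint). Model name and prompt are placeholders.
payload = {
    "model": "qwen3.5",  # placeholder; use whatever name your server exposes
    "messages": [
        {"role": "user", "content": "Summarise the trade-offs of B-trees."}
    ],
    "presence_penalty": 1.2,  # somewhere in the suggested 1.0-1.5 range
    "temperature": 0.7,       # placeholder; not part of the fix itself
}

# POST this to your server's /v1/chat/completions endpoint with any
# HTTP client; printed as JSON here so it drops straight into curl.
print(json.dumps(payload, indent=2))
```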

When Qwen 3.5 has no tools available, it engages in a Gemini 3/Gemma 4-like reasoning trace. This is the nice, bullet list style as seen here.

This is relevant because when you enable tools for 3.5, the reasoning style changes completely: it instead produces a short, more natural Claude-like trace as shown here. If you've used Claude, you probably immediately recognise this style. For context, this is with the model running via llama-server inside Open-WebUI. All I did was enable the built-in tools it comes with. (Note if using OWI: make sure you enable "native" function calling.) This isn't only applicable to OWI, though. If you're using a harness that already has tools, like OpenCode or Hermes Agent, you shouldn't have any overthinking problems in the first place.

But yeah, that's essentially all there is to it. So, if you're running the model with no tools, I'd strongly recommend adding some. Apparently even just telling it that it has fake tools works too, but I haven't tried this myself.
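If your client talks to the API directly, adding a tool just means including a tools array in the request, even one the conversation never calls. A minimal sketch, assuming an OpenAI-compatible endpoint; the tool name and schema below are made up purely for illustration:

```python
import json

# Hypothetical "dummy" tool declaration: its mere presence is what
# changes the reasoning style, per the post. Name/schema are invented.
dummy_tool = {
    "type": "function",
    "function": {
        "name": "get_current_time",  # placeholder; any plausible tool works
        "description": "Return the current local time.",
        "parameters": {"type": "object", "properties": {}},
    },
}

payload = {
    "model": "qwen3.5",  # placeholder model name
    "messages": [
        {"role": "user", "content": "Explain what a mutex is, briefly."}
    ],
    "tools": [dummy_tool],    # the model sees tools are available
    "presence_penalty": 1.2,  # sampling fix from earlier in the post
}

print(json.dumps(payload, indent=2))
```

The model may still emit a tool call you'd have to handle (or ignore), which is presumably why the author suggests experimenting rather than treating this as a guaranteed fix.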

I hope this helps anybody who has been dealing with this. :)

TL;DR: Enable a tool even if you aren't using it, and make sure you've got your sampling params set according to Unsloth's guide.

submitted by /u/ayylmaonade