Should you shut off thinking when you are coding on say Qwen3.6 35B

Reddit r/LocalLLaMA / 4/18/2026

💬 OpinionIdeas & Deep AnalysisTools & Practical Usage

Key Points

  • The post discusses whether enabling “thinking” during coding with a Qwen3.6 35B model meaningfully improves outcomes or instead just slows the system down.
  • It argues that “thinking” can resemble a task list/workflow (like what tools such as Claude Code or Codex do) and suggests it may be best when supported by an AI harness rather than relying only on the model.
  • The author notes they want to experiment with disabling thinking, but they cannot find a way to turn it off in LM Studio on their Mac for this specific model.
  • Overall, the piece is framed as a personal/community exploration of coding-agent behavior and how different UIs/settings affect latency and workflow.
  • It implicitly raises the practical question of how to control inference behavior (thinking vs. direct responses) in local LLM tooling for software development use cases.

Some people say that the thinking slows the system down for no real reason.

Thinking to me seems like a “to do” list kind of what Claude Code or Codex does. Maybe thinking is better with the AI in a harness that creates this to do list and doesn’t rely solely on the model.

And if I want to play with this, i can’t find a way to shut of thinking on LM Studio for this model on my Mac.

submitted by /u/KarezzaReporter
[link] [comments]