Qwen 3.6 27B in Claude Code says it will do something then stops and prompts for user reply (not failing a tool call)

Reddit r/LocalLLaMA / 4/27/2026

💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage

Key Points

  • A user running Qwen/Qwen3.6-27B-FP8 on vLLM reports that, in Claude Code, the model often announces it will do something and then stops to wait for user input.
  • The user observes cases where the system produces no error and no failed tool-call signal, suggesting the tool execution is not explicitly failing, even though the expected follow-through doesn’t happen.
  • Sometimes the model repeats the behavior multiple times and even claims it detected a user intent like “continue,” yet still waits for another user reply.
  • The poster is asking whether this behavior stems from a model limitation, a mismatch between Claude Code prompting/tooling and the model (tool-call parsing), or a configuration issue in vLLM.
  • They note the same issue is less common in OpenCode and ask for diagnosis or guidance on making Claude Code work reliably with this model setup.

I'm running Qwen/Qwen3.6-27B-FP8 via vLLM using this command:

```
vllm serve Qwen/Qwen3.6-27B-FP8 --tensor-parallel-size 4 \
  --gpu-memory-utilization 0.95 --max-num-seqs 8 \
  --enable-auto-tool-choice --tool-call-parser qwen3_xml \
  --enable-prefix-caching --attention-backend flashinfer
```

It works pretty well in Claude Code, except fairly often it will announce it's about to do something, then just stop and wait for a user response. E.g.:

```
Let me continue with the remaining edits.

✻ Brewed for 48s
```

(waiting for user input)

No error message and no failed tool call as far as I can tell; it just fails to follow through. Sometimes it does this several times in a row, even commenting "The user replied 'continue' - they want me to continue. Let me continue with the remaining edits." (and then the prompt is waiting for me to reply again).
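One way to narrow down whether this is the model or Claude Code's tool-call handling is to bypass the client and inspect the raw response from vLLM's OpenAI-compatible `/v1/chat/completions` endpoint: if the model stops with a plain-text turn (no `tool_calls` attached), the model itself isn't emitting a call; if `tool_calls` is populated but Claude Code still prompts you, the breakdown is on the client/parser side. A minimal sketch of classifying one response choice (the helper name and sample payload are illustrative, not from the post):

```python
def classify_turn(choice: dict) -> str:
    """Classify one /v1/chat/completions choice: did the model
    actually request a tool call, or stop with plain text?"""
    msg = choice.get("message", {})
    if msg.get("tool_calls"):
        return "tool_call"   # parser extracted a structured call
    if choice.get("finish_reason") == "length":
        return "truncated"   # hit max_tokens mid-generation
    return "text_stop"       # announced intent but emitted no call

# Example: a response shaped like the failure described above --
# the model says it will continue but attaches no tool calls.
sample = {
    "finish_reason": "stop",
    "message": {
        "role": "assistant",
        "content": "Let me continue with the remaining edits.",
        "tool_calls": [],
    },
}
print(classify_turn(sample))  # -> text_stop
```

A `text_stop` here would point at the model (or the chat template / `--tool-call-parser qwen3_xml` combination) rather than Claude Code.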

Is this just a deficiency in the model's thinking, an incompatibility between Claude Code's prompts and the model, or an error in the configuration?

I haven't seen this happen in OpenCode, but there are reasons I prefer CC for some tasks.

Thanks.

submitted by /u/jettoblack