Qwen 3.6: worse adherence?

Reddit r/LocalLLaMA / 4/17/2026

💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage · Models & Research

Key Points

  • A user reports that switching from Qwen 3.5 to the Qwen 3.6 FP8 variant in a vLLM + Open WebUI RAG setup led to worse instruction adherence and different tool behavior.
  • The model appears significantly more “talkative” with tools, with tool-related reasoning tokens increasing from dozens to several hundred (roughly 2–3x).
  • The user observes the model following specific instructions less reliably and seemingly weighting or honoring the system prompt less than before.
  • Even when prompted to produce exhaustive answers, the final responses are noticeably shorter, suggesting changes in output formatting or generation dynamics.
  • The user suspects the chat template or how vLLM handles the newer weights may be responsible and asks whether others are seeing similar regressions after only swapping models.

Just swapped Qwen 3.5 for the 3.6 variant (FP8, RTX 6000 Pro) using the same recommended generation settings. My stack is vLLM (v0.19.0) + Open WebUI (v0.8.12) in a RAG setup where the model has access to several document retrieval tools.
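For context, a setup like the one described would typically be launched along these lines (the exact repo name, FP8 suffix, and tool-parser choice are assumptions, not details given in the post; `--enable-auto-tool-choice` and `--tool-call-parser` are the standard vLLM flags for exposing tool calling to a client like Open WebUI):

```shell
# Hypothetical serve command approximating the described stack.
# Model repo name and parser are guesses; adjust to your checkpoint.
vllm serve Qwen/Qwen3.6-35B-A3B-FP8 \
  --enable-auto-tool-choice \
  --tool-call-parser hermes
```

If only the positional model argument changed between runs, then any behavioral shift comes from the weights, the bundled chat template, or how vLLM interprets the new checkpoint's config.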

After some initial testing (single-turn; I haven't tried disabling interleaved reasoning yet), I've noticed some significant shifts:

- 3.6 is far more "talkative" with tools. Reasoning tokens have jumped from a few dozen to several hundred (a 2x-3x increase).

- ​It struggles to follow specific instructions compared to 3.5.

- It seems to give the system prompt much less weight, sometimes ignoring it entirely.

- Despite being prompted for exhaustive answers, the final responses are significantly shorter.

I suspect a potential issue with the chat template or with how vLLM handles the new weights, even though the architecture is the same. Is anyone else seeing similar problems?
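One way to test the chat-template hypothesis without touching anything else: diff the templates shipped with the two checkpoints, and if they differ, re-serve the new weights with the old template pinned. A sketch, assuming the repo names from the EDIT below and that both checkpoints store the template in `tokenizer_config.json` (requires `huggingface-cli` and `jq`):

```shell
# Fetch each checkpoint's tokenizer config and diff the chat templates.
# Repo names follow the post's EDIT; adjust if your local paths differ.
huggingface-cli download Qwen/Qwen3.5-35B-A3B tokenizer_config.json --local-dir old
huggingface-cli download Qwen/Qwen3.6-35B-A3B tokenizer_config.json --local-dir new
diff <(jq -r .chat_template old/tokenizer_config.json) \
     <(jq -r .chat_template new/tokenizer_config.json)

# If they differ, pin the 3.5 template when serving 3.6 to isolate the cause:
jq -r .chat_template old/tokenizer_config.json > qwen35_template.jinja
vllm serve Qwen/Qwen3.6-35B-A3B --chat-template qwen35_template.jinja
```

If behavior reverts with the old template, the regression is in the prompt formatting rather than the weights themselves.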

EDIT:

- I swapped Qwen3.5-35B-A3B and Qwen3.6-35B-A3B, nothing else.

- Prompts that worked before no longer work as well.

- The extra reasoning is significant WITH TOOLS.

submitted by /u/tkon3