Just swapped Qwen 3.5 for the 3.6 variant (FP8, RTX 6000 Pro) using the same recommended generation settings. My stack is vLLM (v0.19.0) + Open WebUI (v0.8.12) in a RAG setup where the model has access to several document retrieval tools.
After some initial testing (single-turn only; I haven't tried disabling interleaved reasoning yet), I’ve noticed some significant shifts:
- 3.6 is far more "talkative" with tools: reasoning tokens per tool call have jumped from a few dozen to several hundred, a 2x-3x increase in overall output.
- It struggles to follow specific instructions compared to 3.5.
- It seems to give the system prompt much less weight, sometimes ignoring it entirely.
- Despite being prompted for exhaustive answers, the final responses are significantly shorter.
I suspect a potential issue with the chat template or how vLLM handles the new weights, even though the architecture is the same. Anyone else seeing similar problems?
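One way to check the chat-template theory is to render the same tool-call conversation with each checkpoint's tokenizer (via `tokenizer.apply_chat_template(messages, tools=tools, tokenize=False)`) and diff the two rendered prompts. A minimal sketch of the diffing step with stdlib `difflib`; the two prompt strings below are illustrative stand-ins, not the actual templates of either model:

```python
import difflib

# Stand-in rendered prompts: in practice, obtain each string from
# tokenizer.apply_chat_template(...) for the 3.5 and 3.6 checkpoints.
prompt_35 = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nSearch the docs for X.<|im_end|>\n"
)
prompt_36 = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nSearch the docs for X.<|im_end|>\n"
    "<|im_start|>assistant\n<think>\n"  # hypothetical extra turn opener
)

# Unified diff makes template drift (extra tags, changed tool stanzas,
# forced reasoning openers) immediately visible.
diff = list(difflib.unified_diff(
    prompt_35.splitlines(keepends=True),
    prompt_36.splitlines(keepends=True),
    fromfile="qwen3.5", tofile="qwen3.6",
))
print("".join(diff))
```

If the diff shows structural changes around the tool definitions or an injected reasoning opener, that would explain the behavior shift independently of the weights.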
EDIT:
- I swapped Qwen3.5-35B-A3B for Qwen3.6-35B-A3B, nothing else.
- Prompts that worked well before no longer work as well.
- The extra reasoning is significant WITH TOOLS.
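To rule the template out entirely, vLLM lets you pin a chat template explicitly instead of using the one bundled with the new checkpoint, e.g. by exporting the 3.5 template and serving 3.6 with it. The flags below are real vLLM options; the model repo name and template path are assumptions based on this post:

```shell
# Serve the 3.6 weights but force the (exported) 3.5 chat template.
# --chat-template overrides the template shipped in tokenizer_config.json.
vllm serve Qwen/Qwen3.6-35B-A3B \
  --quantization fp8 \
  --chat-template ./qwen3.5-template.jinja \
  --enable-auto-tool-choice \
  --tool-call-parser hermes
```

If behavior reverts with the old template, the weights are fine and the regression is in the template; if not, it points back at the checkpoint itself.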

