The definitive Qwen 3.5 Jinja template

Reddit r/LocalLLaMA / 4/12/2026

💬 OpinionDeveloper Stack & InfrastructureTools & Practical UsageModels & Research

共有:

Key Points

A deep-dive author presents a “definitive” Jinja chat template for Qwen 3.5, aiming to resolve lingering tool-calling and XML formatting bugs seen in the official template.
The template is designed to match Qwen’s native `<think>` XML schema (avoiding older forced/pseudo-comment syntaxes) and to map modern `developer` role strings expected by current API clients.
It adds safety handling such as caching empty tool parameters and targeted fixes for LM Studio-specific backend quirks (including iterator parsing and issues when tool-call text appears inside thoughts).
As a workaround for an infinite loop tool bug, it can scrub `<|think_off|>` tags from system or user prompts to hard-disable “thinking” for that turn.
The template is shared via a Hugging Face link and is suggested to be potentially compatible with upcoming Qwen 3.6 models, pending user feedback.

I’ve been doing a pretty thorough deep dive into the Qwen 3.5 templating logic to properly fix the lingering tool calling bugs. People here have done some really brilliant groundwork, templates from folks like @pneuny and @ellary were absolute lifesavers early on. But I realised that a lot of them rely on forced prompt injections, or accidentally hallucinate the xml formatting (qwen is actually trained on pure <think> tags natively, not the /* syntax some older templates fallback to).

So after many hours of resarching and testing all the known problems with the official qwen template, I carefully wrote the best possible template. It perfectly respects the native xml schema, dynamically maps the newer 'developer' role strings from modern api clients, and safely caches empty tool parameters.

Just as a side note for anyone specifically using LM Studio: the backend throws an error over python |items dict iterators, and the regex parser completely borks if the model just ponders about a tool call inside its thoughts. I’ve integrated targeted fixes for this into the jinja too. If you write <|think_off|> anywhere inside your prompt (both system or user), the template invisibly scrubs the tag and hard-disables thinking for that turn, completely bypassing the infinite loop tool bug.

Im hoping the architecture here is solid enough that it should still be valid for the soon to be released Qwen 3.6 models. Let me know if you run into any weird behaviour.

You can get the template from here:

https://huggingface.co/froggeric/Qwen3.5-35B-A3B-Uncensored-FernflowerAI-MLX-8bit/blob/main/chat_template.jinja

submitted by /u/ex-arman68
[link] [comments]