I am running unsloth/gemma-4-26B-A4B-it-GGUF/gemma-4-26B-A4B-it-UD-Q4_K_XL.gguf with llama-server (with reasoning enabled).
Is it possible to disable reasoning for some requests only? If yes, how?
I want to keep reasoning on by default, but for some use cases (e.g. a chatbot) I want the model to respond as fast as possible.




