I have an RX 7900 XTX running Qwen3.6 27B Q4_K_XL. I get around 400 t/s prompt processing and generation in the 30s t/s. Everything below 64k context works incredibly well, and it spits out good-quality code.
But when I tried to push it further on fairly complex DevOps work, it failed at tool calling at around 90k context.
I use opencode as my harness, and here is the llama.cpp command I ran:
llama-server -ctv q8_0 -ctk q8_0 -c 128000 --temp 0.6 --top-p 0.95 --top-k 20 --repeat-penalty 1.0 --fit on
What's your experience?



