qwen 3.6 27B looping problem

Reddit r/LocalLLaMA / 5/5/2026

💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage

Key Points

  • A Reddit user reports that Qwen 3.6 27B (a Q8 quant served with llama-server, replacing a Gemma 31B Q5 setup) handles coding, documentation, and test runs, but begins looping once the context exceeds roughly 100k tokens.
  • The user tried multiple ways to interrupt or restart the model (e.g., telling it to start over), yet the looping persisted.
  • They share a specific llama-server invocation with very large context settings (e.g., -c 200000) and various runtime parameters (keep, batch, checkpointing, ngram speculation), suggesting the issue may be triggered by long-context inference.
  • The post asks the community for solutions or mitigations to prevent long-context looping in Qwen 3.6 27B; one possible request-level mitigation is sketched just after this list.
  • The reported behavior contrasts with the user's previous Gemma 31B Q5 setup, which they imply did not loop under similar usage.
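
A hedged, minimal sketch (not from the original post): a common first thing to try against looping is tightening repetition-control sampling per request. Recent llama.cpp builds accept repetition-penalty and DRY-sampler fields on llama-server's /completion endpoint; the field names below follow the llama.cpp server docs and should be verified against the build in use, and the prompt and values are illustrative only.

# Hedged sketch: per-request repetition controls on llama-server's /completion endpoint
curl -s http://localhost:8080/completion -H "Content-Type: application/json" -d '{
  "prompt": "Continue the refactoring plan:",
  "n_predict": 512,
  "temperature": 0.7,
  "repeat_penalty": 1.1,
  "repeat_last_n": 256,
  "dry_multiplier": 0.8,
  "dry_base": 1.75,
  "dry_allowed_length": 2
}'

The dry_* fields enable the DRY sampler, which penalizes re-emitting recently generated token sequences and is aimed specifically at suppressing loops.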
qwen 3.6 27B looping problem

Whenever I write here that I use Gemma 31B, I get replies that Qwen 27B is better. I switched on the pi from Gemma 31B Q5 to Qwen 27B Q8, and I can generally code, document, and run tests, but somewhere after exceeding 100k context Qwen keeps getting into loops. Do you have any solution for this?

https://preview.redd.it/o4e1vxkc29zg1.png?width=2575&format=png&auto=webp&s=c6f93e53127b5c8ba798f1c7b503a06172425a0a

https://preview.redd.it/8qriwlrd29zg1.png?width=2747&format=png&auto=webp&s=082cf04774aa7ae77044ff04d5962a2f0606f73a

https://preview.redd.it/xz9lsdde29zg1.png?width=2447&format=png&auto=webp&s=81e4d88a1a0347fc9f6ef743ef612db47557c7b5

I tried to break it and tell it to start over, try again, etc., but it keeps looping.

My current command is:

CUDA_VISIBLE_DEVICES=0,1,2 llama-server -c 200000 -m /mnt/models2/Qwen/3.6/Qwen3.6-27B-UD-Q8_K_XL.gguf --host 0.0.0.0 --jinja -fa on --keep 4096 -b 8192 --spec-type ngram-mod --parallel 1 --ctx-checkpoints 24 --checkpoint-every-n-tokens 8192 --cache-ram 65536
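
A hedged variant of that command (not from the post) adds repetition-penalty and DRY-sampling defaults at launch; the flag names are those of recent llama.cpp builds and are worth checking against llama-server --help, and the values are illustrative.

# Same command as above, plus repetition-control defaults (illustrative values)
CUDA_VISIBLE_DEVICES=0,1,2 llama-server -c 200000 \
  -m /mnt/models2/Qwen/3.6/Qwen3.6-27B-UD-Q8_K_XL.gguf \
  --host 0.0.0.0 --jinja -fa on --keep 4096 -b 8192 \
  --spec-type ngram-mod --parallel 1 \
  --ctx-checkpoints 24 --checkpoint-every-n-tokens 8192 --cache-ram 65536 \
  --repeat-penalty 1.1 --repeat-last-n 256 \
  --dry-multiplier 0.8 --dry-base 1.75 --dry-allowed-length 2

These act as server-side sampling defaults that individual requests can still override.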

submitted by /u/jacek2023