https://github.com/ggml-org/llama.cpp/pull/20334
It should already be in the latest release.
I'm seeing a performance boost on my AMD RX 7800 XT setup (Fedora Linux).
For Qwen 3.5 27B, token generation went from ~28 t/s to ~36 t/s, roughly a 29% speedup.
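For anyone wanting to measure the same thing on their own hardware, here is a minimal sketch using the llama-cpp-python bindings. The model filename, prompt, and generation settings are placeholders, not from the original post; adjust them for your setup.

```python
# Minimal token-generation throughput check using llama-cpp-python.
# Times one completion and divides generated tokens by elapsed seconds.
import time
from llama_cpp import Llama

llm = Llama(
    model_path="qwen3.5-27b-q4_k_m.gguf",  # hypothetical filename; point at your GGUF
    n_gpu_layers=-1,   # offload all layers to the GPU (e.g. RX 7800 XT)
    verbose=False,
)

start = time.perf_counter()
out = llm("Explain speculative decoding in one paragraph.", max_tokens=256)
elapsed = time.perf_counter() - start

n_tokens = out["usage"]["completion_tokens"]
print(f"{n_tokens} tokens in {elapsed:.2f}s -> {n_tokens / elapsed:.1f} t/s")
```

Run it once on a build from before the PR and once after, with the same prompt and settings, to get a before/after comparison like the one above.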
Reddit r/LocalLLaMA / 3/13/2026