GATED_DELTA_NET for Vulkan merged in llama.cpp

Reddit r/LocalLLaMA / 3/13/2026

Key Points

  • GATED_DELTA_NET for Vulkan has been merged into llama.cpp and is available in the latest release via PR 20334.
  • On the poster's AMD RX 7800 XT system running Fedora Linux, the change raised Qwen 3.5 27B token generation speed from about 28 t/s to about 36 t/s.
  • The improvement comes from the gated delta-net for Vulkan implementation integrated into llama.cpp.
  • Vulkan users running models that rely on this operation, such as Qwen 3.5, may see higher token generation throughput; the only figures reported so far come from a single setup.

https://github.com/ggml-org/llama.cpp/pull/20334
It should already be in the latest release.

There is a performance boost on my AMD RX 7800 XT setup (Fedora Linux).
For Qwen 3.5 27B, token generation was ~28 t/s.
It is now ~36 t/s.
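
For a rough sense of scale, the two figures above work out to a gain of a bit under 30%. Here is a minimal Python sketch of that arithmetic. It treats the approximate ~28 and ~36 t/s readings as exact, and the helper name `speedup_percent` is purely illustrative, not part of llama.cpp.

```python
def speedup_percent(before_tps: float, after_tps: float) -> float:
    """Relative throughput gain, in percent, between two tokens/sec readings."""
    return (after_tps - before_tps) / before_tps * 100.0


# Approximate figures from the post (assumed exact for this sketch).
before_tps = 28.0
after_tps = 36.0

gain = speedup_percent(before_tps, after_tps)

# Average time per generated token, in milliseconds.
ms_before = 1000.0 / before_tps
ms_after = 1000.0 / after_tps

print(f"throughput gain: {gain:.1f}%")
print(f"time per token: {ms_before:.1f} ms -> {ms_after:.1f} ms")
```

Since both readings are approximate, the result is only a ballpark figure for this one GPU and model.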

submitted by /u/FancyImagination880