llama.cpp fixes to run Bonsai 1-bit models on CPU (incl AVX512) and AMD GPUs

Reddit r/LocalLLaMA / 4/3/2026

💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage

Key Points

  • A fork of llama.cpp is reported to fix issues in PrismAI's fork that prevented CPU execution, including for Bonsai 1-bit models.
  • The update includes support/enhancements for CPU performance using AVX512 instructions.
  • It also provides guidance for running the same class of models on AMD GPUs using ROCm.
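For readers unfamiliar with the format: "1-bit" models like Bonsai are typically BitNet-style ternary models, where each weight is one of {-1, 0, +1} (about 1.58 bits). The sketch below is a minimal, illustrative Python version of the general idea (packing ternary weights into 2-bit codes and doing a multiply-free dot product); it is not llama.cpp's actual kernel, and the function names are hypothetical.

```python
# Illustrative sketch, NOT llama.cpp's real quantization code:
# pack ternary ("1.58-bit") weights {-1, 0, +1} into 2-bit codes,
# four per byte, then compute a dot product with no multiplies.

def pack_ternary(weights):
    """Map -1 -> 0b00, 0 -> 0b01, +1 -> 0b10; pack four codes per byte."""
    codes = [w + 1 for w in weights]          # -1,0,+1 -> 0,1,2
    out = bytearray()
    for i in range(0, len(codes), 4):
        b = 0
        for j, c in enumerate(codes[i:i + 4]):
            b |= c << (2 * j)                 # little-endian within the byte
        out.append(b)
    return bytes(out)

def unpack_ternary(packed, n):
    """Inverse of pack_ternary for the first n weights."""
    ws = []
    for b in packed:
        for j in range(4):
            if len(ws) == n:
                return ws
            ws.append(((b >> (2 * j)) & 0b11) - 1)
    return ws

def ternary_dot(packed, n, activations):
    """Dot product of packed ternary weights with activations:
    only additions and subtractions are needed, which is why
    1-bit/ternary models are attractive on plain CPUs."""
    total = 0.0
    for w, a in zip(unpack_ternary(packed, n), activations):
        if w == 1:
            total += a
        elif w == -1:
            total -= a
    return total

w = [-1, 0, 1, 1, -1, 0, 0, 1]
p = pack_ternary(w)                 # 8 weights fit in 2 bytes
assert unpack_ternary(p, len(w)) == w
print(ternary_dot(p, len(w), [1.0] * 8))  # -> 1.0 (sum of the weights)
```

Real kernels (and the AVX512 path mentioned above) do the same trick with wide SIMD registers instead of a Python loop, which is where the CPU speedups come from.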

PrismAI's fork of llama.cpp is broken if you try to run it on CPU. This fork fixes those issues and also includes instructions for running on AMD GPUs via ROCm.

https://github.com/philtomson/llama.cpp/tree/prism
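The post itself doesn't include build commands. As a hedged sketch only, the steps below follow upstream llama.cpp's CMake conventions (`GGML_NATIVE`, `GGML_HIP`, `AMDGPU_TARGETS`); the fork's prism branch may differ, so check its README before relying on these exact flags, and substitute your own GPU architecture for the example `gfx1100` target.

```shell
# Hypothetical build sketch based on upstream llama.cpp CMake options;
# verify against the fork's own README.
git clone --branch prism https://github.com/philtomson/llama.cpp
cd llama.cpp

# CPU build: AVX512 is normally picked up automatically on a native
# build when the host CPU supports it.
cmake -B build -DGGML_NATIVE=ON
cmake --build build --config Release -j

# AMD GPU build via ROCm/HIP (gfx1100 is an example target; set it
# to match your GPU, and adjust the ROCm install path as needed).
cmake -B build-rocm -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1100 \
      -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang \
      -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++
cmake --build build-rocm --config Release -j
```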

submitted by /u/UncleOxidant