AI Navigate

You can run LLMs on your AMD NPU on Linux!

Reddit r/LocalLLaMA / 3/12/2026

📰 NewsDeveloper Stack & InfrastructureTools & Practical Usage

Key Points

  • Users with Ryzen AI 300/400-series PCs running Linux can now run large language models (LLMs) directly on the AMD Neural Processing Unit (NPU) for high speed, low power, and quiet on-device inference.
  • The solution goes beyond small demos, enabling real local inference workloads using AMD NPU hardware acceleration.
  • The software stack includes a Linux 7.0+ kernel NPU driver, AMD IRON compiler for XDNA NPUs, the FastFlowLM (FLM) runtime optimized for AMD NPUs, and the Lemonade Server for lightweight local model serving.
  • Interested users can access detailed guides and GitHub repositories for the Lemonade Server and FastFlowLM projects and are encouraged to join community discussion on Discord.
  • This development offers a practical path to leveraging AMD NPUs for AI inference workloads locally on Linux machines, opening opportunities for developers and businesses to build AI applications efficiently on AMD hardware.
You can run LLMs on your AMD NPU on Linux!

If you have a Ryzen™ AI 300/400-series PC and run Linux, we have good news!

You can now run LLMs directly on the AMD NPU in Linux at high speed, very low power, and quietly on-device.

Not just small demos, but real local inference.

Get Started

🍋 Lemonade Server

Lightweight Local server for running models on the AMD NPU.

Guide: https://lemonade-server.ai/flm_npu_linux.html
GitHub: https://github.com/lemonade-sdk/lemonade

⚡ FastFlowLM (FLM)

Lightweight runtime optimized for AMD NPUs.

GitHub:
https://github.com/FastFlowLM/FastFlowLM

This stack brings together:

  • Upstream NPU driver in the Linux 7.0+ kernel (with backports for 6.xx kernels)
  • AMD IRON compiler for XDNA NPUs
  • FLM runtime
  • Lemonade Server 🍋

We'd love for you to try it and let us know what you build with it on 🍋Discord: https://discord.gg/5xXzkMu8Zk

submitted by /u/BandEnvironmental834
[link] [comments]