Anybody using LMStudio on an AMD Strix 395 AI Max (128GB unified memory)? I keep on getting errors and it always loads to RAM.

Reddit r/LocalLLaMA / 3/22/2026

💬 Opinion · Tools & Practical Usage

Key Points

  • A user reports LMStudio on an AMD Strix 395 AI Max with 128GB unified RAM loads models entirely into RAM rather than GPU, causing failures.
  • They describe symptoms such as a 70GB Qwen3 model loading into RAM, then failing to transfer to the GPU, and chat input failing when they try to use the model.
  • They say they are running the latest LMStudio and its bundled llama.cpp, and have set GPU offload to max layers and tried both a fixed 96GB and an auto VRAM allocation in the BIOS, without success.
  • They are asking for guidance or a tutorial on resolving the issue.

Hey all,

I have a Framework AI Max+ AMD 395 Strix system, the one with 128GB of unified RAM, a huge chunk of which can be dedicated to the GPU.

I'm trying to use LMStudio but I can't get it to work at all, and I suspect it's user error. My issue is two-fold. First, all models appear to load into RAM. For example, a 70GB Qwen3 model will load into RAM, then try to load to the GPU and fail. If I type anything into the chat, it fails. I can't get it to stop loading the model into RAM, despite selecting the GPU runtime for llama.cpp.
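
One way to isolate whether GPU offload works at all outside LMStudio is to load the same GGUF through llama-cpp-python and force every layer onto the GPU. This is a minimal sketch, not anything from the original post: the model path is a placeholder, and it assumes a Vulkan- or ROCm-enabled build of llama-cpp-python is installed.

```python
# Offload sanity check, assuming a Vulkan- or ROCm-enabled
# llama-cpp-python build; the model path below is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="/path/to/qwen3.gguf",  # placeholder path
    n_gpu_layers=-1,  # offload every layer; 0 would force CPU-only
    verbose=True,     # logs which backend/device the layers land on
)

print(llm("Say hello.", max_tokens=16)["choices"][0]["text"])
```

If this also falls back to CPU, the problem is likely in the llama.cpp backend or driver rather than in LMStudio's settings.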

I'm on the latest LMStudio, with the latest llama.cpp main branch that ships with it. I've also set GPU layers to max for the model. I've tried setting 96GB of VRAM in the BIOS, and I've also tried leaving it on auto.
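
A quick way to check whether the BIOS carve-out actually took effect, assuming a Linux install with the amdgpu driver (the card index is a guess and may differ per system), is to read the VRAM and GTT totals the driver reports:

```python
# Check what the amdgpu driver reports for dedicated VRAM vs. GTT
# (system RAM visible to the GPU). Linux-only sketch; card0 may
# need to be adjusted on systems with multiple DRM devices.
from pathlib import Path

dev = Path("/sys/class/drm/card0/device")
for name in ("mem_info_vram_total", "mem_info_gtt_total"):
    f = dev / name
    if f.exists():
        print(f"{name}: {int(f.read_text()) / 2**30:.1f} GiB")
    else:
        print(f"{name}: not exposed on this kernel/driver")
```

If the VRAM total still shows a small number, the BIOS allocation didn't stick, and llama.cpp has little dedicated memory to offload into.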

Nothing works.

Is there something I am missing here or a tutorial or something you could point me to?

Thanks!

submitted by /u/StartupTim