Hey everyone, I just got into local LLMs about a week ago. I tried Ollama and LMStudio on my Core Ultra 9 288V, but they kept failing or giving me "hard stops" on the MoE models, so I figured I’d just try building the environment myself. I couldn’t get OpenVINO to play nice with the NPU for these larger models yet, so I just compiled a custom Vulkan bridge for the GPU instead. It seems to be working? Performance Stats:
I also tried the 31B-it-i1-Q4_K_M.gguf version. It's a bit heavier but still totally usable:
Is this a normal result for integrated graphics? At first I only got it working on the CPU, which was faster but unsustainable; now that the Vulkan bridge is built, it's balanced. I'm using CachyOS if that makes a difference. Just wanted to see if I’m missing something or if Intel Lunar Lake is actually this cracked for local MoE.
Is it normal for Gemma 4 26B/31B to run this fast on an Intel laptop? (288V / CachyOS)
Reddit r/LocalLLaMA / 4/12/2026
💬 Opinion · Signals & Early Trends · Tools & Practical Usage
Key Points
- A new local LLM user reports getting Gemma 4 MoE models (26B/31B GGUF) to run unusually fast on an Intel Core Ultra 9 288V laptop under CachyOS.
- They initially struggled with Ollama/LM Studio and “hard stops,” and couldn’t get OpenVINO to integrate well with the NPU for these larger models.
- To make it work, they compiled a custom Vulkan GPU bridge, after which GPU usage reached about 95–100%, with modest CPU usage and RAM use around 20–24 GB.
- Reported throughput is roughly 7–12 tokens/sec at 16k context for the 26B model, and 4–8k context for the 31B variant, while also noting no swap used so far.
- The poster asks whether this performance level is typical for integrated graphics and whether Intel Lunar Lake-class hardware is particularly strong for local MoE models.
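
For anyone wanting to compare their own runs against the figures above, throughput is just generated tokens divided by wall-clock generation time. The snippet below is a minimal sketch, not the poster's method: the function names and the sample run (240 tokens in 24 seconds) are hypothetical, and only the 7–12 tokens/sec range comes from the post.

```python
def tokens_per_second(generated_tokens: int, elapsed_seconds: float) -> float:
    """Return generation throughput in tokens per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return generated_tokens / elapsed_seconds


def within_reported_range(rate: float, low: float = 7.0, high: float = 12.0) -> bool:
    """Check a rate against the 7-12 tokens/sec range reported for the 26B model."""
    return low <= rate <= high


# Hypothetical run: 240 tokens generated in 24 seconds (not from the post).
rate = tokens_per_second(240, 24.0)
print(f"{rate:.1f} tokens/sec, within reported range: {within_reported_range(rate)}")
```

Time only the generation phase if you can, since prompt processing at long context can dominate total wall-clock time and make the number look worse than it is.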