Gemma 4 - lazy model or am I crazy? (bit of a rant)

Reddit r/LocalLLaMA / 4/13/2026

💬 Opinion · Signals & Early Trends · Tools & Practical Usage · Models & Research

Key Points

  • A Reddit user tests the “Gemma 4” 26B MoE model locally (via llama.cpp and an Unsloth quantization) and finds it repeatedly answers from internal knowledge instead of using web-search tools.
  • Even when explicitly instructed to “search extensively,” “dig deep,” and to use provided search/fetch tools and skills, the model performs at most a single web search and then stops after quickly scanning snippets.
  • The user contrasts this behavior with Qwen 3.5, which they say more readily follows prompts to conduct multi-step web research (“whole quest” digging up sources).
  • They ask the community to confirm whether this is expected behavior or a local configuration issue, requesting specific quantization and runtime settings that would make Gemma 4 “capitulate” and search more.
  • Overall, the post is a hands-on complaint/diagnosis of tool-usage and agent-like behavior in a specific model configuration rather than a new product release.

Like it says in the title. Specifically, the 26b MoE.

I’ve wanted to like this model, so much. Thought it might replace Qwen 3.5 27b. Keep coming back to it and trying it every time there’s an update, hoping it will have improved.

I’m running Unsloth’s UD_Q4_K_XL on llama.cpp, on the latest commits from main. I know about --jinja. I know about the interleaved thinking template. I’m not running a low-quant KV cache. This is far from the first model I’ve run.

Every time, my tests show the same thing - it is a very lazy model when it comes to using skills or searching the web. If you ask it a question, it will by default answer from its own knowledge without a single web search. If you explicitly ask it for a web search, it will lower itself to performing a _single_ web search, quickly scan the snippets from the search and then internally decide “with the snippets and my own internal knowledge I have enough information to answer, I don’t need to search more”.

This happens even if you:

- have given it tools for search and fetch, with the search tool including a description “don’t answer from these snippets, use fetch” and the fetch tool saying “use this to fetch pages obtained from the search tool”.

- have explicitly told it “search extensively”, “dig deep”, “don’t be lazy” etc.

- have put in context a pushy skill called “searching-the-web” with explicit instructions to do all the above.

- have put in context a pushy skill instruction saying “you must use skills if you think they have even a small chance of being applicable”.

- have explicitly told it “reference the searching-the-web skill”.
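For context, a setup like the one described above is usually expressed as OpenAI-style tool definitions sent to llama.cpp’s server. This is only an illustrative sketch of that shape — the tool names, descriptions, and parameter schemas are assumptions mirroring the post’s wording, not the poster’s actual config:

```python
# Illustrative OpenAI-style tool definitions, of the kind llama-server
# (started with --jinja) accepts in the "tools" field of a
# /v1/chat/completions request. Names/schemas are assumptions, not the
# poster's exact setup.
tools = [
    {
        "type": "function",
        "function": {
            "name": "search",
            "description": (
                "Search the web and return result snippets. "
                "Don't answer from these snippets; use fetch on promising results."
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Search query"},
                },
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "fetch",
            "description": "Use this to fetch pages obtained from the search tool.",
            "parameters": {
                "type": "object",
                "properties": {
                    "url": {"type": "string", "description": "URL from a search result"},
                },
                "required": ["url"],
            },
        },
    },
]
```

The complaint, in other words, isn’t about malformed tool schemas — even with the anti-laziness hints baked into the descriptions themselves, the model stops after one `search` call and rarely touches `fetch`.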

Qwen 3.5, you barely have to ask and it will go on a whole quest to dig things up for you. Gemma 4, you scream at it till you’re blue in the face and it can barely be arsed to perform a single search. My only conclusion is that it just _really does not want to search the web_ (for AI values of “want” of course).

If I’m crazy, tell me. If you have it working great and digging deep on the web without having to twist its proverbial arm, tell me. And please be so kind as to tell me what quant / settings you’re running to make it capitulate on this point.

submitted by /u/Pyrenaeda