It solved an issue with a script that pulls real-time data from NVIDIA SMI; Gemini 3.1 actually failed to fix it even in a fresh session, lol. It's kind of mind-blowing that in 2026 we already have stable local models with 200k+ context! I tested it by feeding it as many Reddit posts, random documentation files, and raw files from the llama.cpp repo as possible to push the context usage up and see how it affects my VRAM. Even during this testing, Gemma kept its mind intact: at 245,283 / 262,144 (94%) context, if I ask it what a specific user said, it matches the quote perfectly and answers within 2–5 seconds. From previous tests, I found I had to decrease the temperature and bump the repeat penalty to 1.17/1.18 so it doesn't fall into a loop of self-questioning. Above 100k context it used to start looping through its own thoughts and arguing with itself; instead of providing a final answer, it would just go on forever. These settings helped a lot! I'm using the latest llama.cpp (which gets updates almost every hour) and the latest Unsloth GGUF from 2–6 hours ago, so make sure to redownload! Model: gemma-4-26B-A4B-it-UD-IQ4_NL.gguf, Unsloth (unsloth bis). What else can I test? Honestly, I ran out of ideas to crash it! It just gulps down whatever I throw at it.
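For reference, the long-context settings the poster describes map onto a llama.cpp server invocation roughly like the one below. This is a sketch under assumptions: the post gives the repeat penalty (1.17) and model filename but never states the exact lowered temperature, so the 0.6 here is a placeholder, not the poster's value.

```shell
# Hypothetical reconstruction of the reported setup (not the poster's exact command).
# --repeat-penalty 1.17 and the 262,144-token context come from the post;
# the temperature value is an assumed placeholder.
llama-server \
  -m gemma-4-26B-A4B-it-UD-IQ4_NL.gguf \
  --ctx-size 262144 \
  --temp 0.6 \
  --repeat-penalty 1.17
```

Lowering temperature and raising the repeat penalty both discourage the model from re-sampling its own recent phrasing, which is why the poster found it stopped the self-questioning loops past 100k context.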
Gemma 4 26B A4B is still fully capable at 245,283/262,144 (94%) context!
Reddit r/LocalLLaMA / 4/11/2026
💬 Opinion · Signals & Early Trends · Tools & Practical Usage · Models & Research
Key Points
- A Reddit user reports that Gemma 4 26B A4B (GGUF) kept matching and answering questions about specific users' statements accurately even with roughly 94% of its 262,144-token context filled (245,283/262,144).
- To suppress self-referential loops at very long context (endless self-questioning and internal debate), lowering the temperature and adjusting the repeat penalty to around 1.17/1.18 reportedly helped.
- As a concrete practical win, the model resolved a problem with a script that pulls real-time data from NVIDIA SMI, which Gemini 3.1 had failed to fix.
- The experiments used the latest llama.cpp build (which updates very frequently) and the latest Unsloth GGUF, so the poster advises re-downloading both the model and the build.
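The NVIDIA SMI script mentioned above is not shown in the post, so as a hedged illustration, here is a minimal sketch of the kind of real-time VRAM polling such a script would do. The `nvidia-smi --query-gpu` CSV interface is real; the function names are hypothetical.

```python
import subprocess

def parse_vram_mib(csv_output: str) -> list[int]:
    """Parse nvidia-smi CSV output (noheader,nounits): one integer MiB value per GPU line."""
    return [int(line) for line in csv_output.splitlines() if line.strip()]

def query_vram_mib() -> list[int]:
    """Poll nvidia-smi for current per-GPU VRAM usage in MiB."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_vram_mib(out)
```

Calling `query_vram_mib()` in a loop is a simple way to watch VRAM climb as the context fills, which matches the poster's stated goal of seeing how context usage affects memory.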
Related Articles
- Black Hat USA (AI Business)
- Black Hat Asia (AI Business)
- Why Cursor Keeps Generating Wildcard CORS -- And How to Fix It (Dev.to)
- Model Context Protocol (MCP): The USB-C Standard for AI Agents — Opportunities for Decentralized AI (Dev.to)
- What if browsers were designed for AI, not humans? (My first open source project — feedback welcome) (Dev.to)