Comparing Qwen3.5 27B vs Gemma 4 31B for agentic stuff
Reddit r/LocalLLaMA / 4/14/2026
💬 Opinion · Signals & Early Trends · Tools & Practical Usage · Models & Research

From the post (which lists the models compared and the main flags used for both): "I know they may not be the best and I still need more experiments (thank you u/Sadman782). I find these tests fun and interesting. Please let me know if you need more tests."
Key Points
- The post compares two local LLM variants for “agentic” tasks: Qwen3.5-27B-UD-Q5_K_XL and gemma-4-31B-it-UD-Q5_K_XL, using similar runtime flags and settings.
- Both models are tested with reasoning enabled, a long-context configuration, flash attention, GPU-layer offloading, and an image-token limit, plus a multimodal projector for image handling (a hedged launch-command sketch follows this list).
- Qwen3.5 is reported to take more steps and perform extra checks (including environment-variable checks), and to sometimes switch scripting languages mid-task (e.g., writing a Python script instead of Bash), which reportedly improves the quality of the final result.
- Gemma 4 is described as more direct, often finding relevant URLs more effectively, but it sometimes fails to complete the final goal; in one example, the Telegram message it sent was truncated.
- The author stresses that these are preliminary, fun experiments and invites readers to request further tests before concluding which model is better for agentic workflows.
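
The post does not reproduce the exact launch commands, but the settings in the second bullet map onto a llama.cpp-style `llama-server` invocation (the UD-quant file names and the multimodal projector suggest llama.cpp, though the backend is not stated). The sketch below is a minimal guess under those assumptions: the file names, context size, and layer count are illustrative, and the reasoning toggle and image-token limit are omitted because their exact flags vary by build and model.

```bash
# Illustrative llama-server launch; values are assumptions, not the author's.
# --mmproj loads the multimodal projector for image input, --ctx-size sets the
# long context, -ngl offloads layers to the GPU, and --flash-attn enables
# flash attention (newer builds take on/off/auto; older builds take the bare flag).
llama-server \
  -m Qwen3.5-27B-UD-Q5_K_XL.gguf \
  --mmproj mmproj-F16.gguf \
  --ctx-size 32768 \
  --flash-attn on \
  -ngl 99 \
  --jinja

# For the comparison run, swap in gemma-4-31B-it-UD-Q5_K_XL.gguf (and its own
# projector file) while keeping every other flag identical.
```

`--jinja` enables the model's chat template, which tool calling and reasoning parsing generally require; keeping every other flag identical between the two runs is what makes the head-to-head comparison meaningful.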


