Gemma 4: first LLM to hit 100% on my multilingual tool-calling tests

Reddit r/LocalLLaMA / 4/3/2026

💬 Opinion · Tools & Practical Usage · Models & Research

Key Points

  • A Reddit user reports that Gemma 4 is the first LLM they have tested to achieve a 100% success rate in tool calling across English, German, and Japanese in their self-hosted setup.
  • The user’s voice assistant, built with n8n and backend tools (e.g., websearch and MQTT integrations), switches the context, prompt, and tool descriptions to the matching language based on which wake word is detected.
  • They run multi-language MoE models on a dual-3090 + 3080 system with 68GB VRAM, prioritizing lower latency for real-time assistant behavior.
  • Prior models and variants they tried (including several MoE sizes and other model families) reportedly did not reach the same consistency in tool calling for all three languages.
  • The user highlights Gemma 4 (26B, 26BA4B) as the specific configuration that matched their target reliability in their testing methodology.

I have been self-hosting LLMs since before Llama 3 was a thing, and Gemma 4 is the first model that actually has a 100% success rate in my tool-calling tests.

My main use for LLMs is a custom-built voice assistant powered by n8n, with custom tools like websearch, custom MQTT tools, etc. in the backend. The big thing is that my household is multilingual: we use English, German, and Japanese. Based on the wake word used, the context, prompt, and tool descriptions switch to that language.
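The wake-word routing described above can be sketched as a simple lookup: the wake word picks a language, and that language selects a matching system prompt and tool descriptions. This is only an illustrative sketch, not the author's actual n8n workflow; all names (`WAKE_WORDS`, `LANG_CONFIG`, `build_request`) and the example prompts are assumptions.

```python
# Hypothetical sketch of wake-word -> language routing for a multilingual
# voice assistant. Names and prompts are illustrative assumptions.

WAKE_WORDS = {
    "computer": "en",
    "rechner": "de",
    "konpyuta": "ja",
}

# Per-language system prompt and tool descriptions, so the model sees the
# entire context in the language it is expected to respond in.
LANG_CONFIG = {
    "en": {
        "system": "You are a helpful voice assistant. Answer in English.",
        "tools": [
            {"name": "websearch", "description": "Search the web for current information."},
            {"name": "mqtt_publish", "description": "Publish a command to a smart-home MQTT topic."},
        ],
    },
    "de": {
        "system": "Du bist ein hilfreicher Sprachassistent. Antworte auf Deutsch.",
        "tools": [
            {"name": "websearch", "description": "Durchsuche das Web nach aktuellen Informationen."},
            {"name": "mqtt_publish", "description": "Sende einen Befehl an ein Smart-Home-MQTT-Topic."},
        ],
    },
    "ja": {
        "system": "あなたは役に立つ音声アシスタントです。日本語で答えてください。",
        "tools": [
            {"name": "websearch", "description": "最新情報をウェブで検索します。"},
            {"name": "mqtt_publish", "description": "スマートホームのMQTTトピックにコマンドを送信します。"},
        ],
    },
}

def build_request(wake_word: str, utterance: str) -> dict:
    """Assemble a chat-completion-style payload for the detected language."""
    lang = WAKE_WORDS[wake_word.lower()]
    cfg = LANG_CONFIG[lang]
    return {
        "messages": [
            {"role": "system", "content": cfg["system"]},
            {"role": "user", "content": utterance},
        ],
        "tools": cfg["tools"],
    }
```

Keeping the tool descriptions themselves in the target language (rather than only the prompt) is the part that exercises multilingual tool calling end to end.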

My setup has 68 GB of VRAM (two 3090s + a 20GB 3080), and I mainly use MoE models to minimize latency. I have previously used everything from the 30B MoEs, Qwen Next, and GPT-OSS to GLM Air, and so far the only model with a 100% success rate across all three languages in tool calling is Gemma 4 26BA4B.
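A per-language tool-calling success rate like the one reported here can be measured with a small harness: run a fixed set of utterances per language and check whether the model picks the expected tool. This is a minimal sketch under stated assumptions, not the author's test suite; `call_model` is a hypothetical callable returning the name of the tool the model chose, and the test cases are illustrative.

```python
# Minimal sketch of a multilingual tool-calling success-rate test.
# call_model(lang, utterance) is a hypothetical stand-in for the real
# model invocation; it returns the chosen tool name (or None).

from collections import defaultdict

TEST_CASES = [
    # (language, user utterance, expected tool)
    ("en", "What's the weather in Berlin right now?", "websearch"),
    ("de", "Schalte das Wohnzimmerlicht ein.", "mqtt_publish"),
    ("ja", "リビングの電気をつけて。", "mqtt_publish"),
]

def success_rates(call_model) -> dict:
    """Fraction of test cases where the model called the expected tool, per language."""
    hits, totals = defaultdict(int), defaultdict(int)
    for lang, utterance, expected_tool in TEST_CASES:
        totals[lang] += 1
        if call_model(lang, utterance) == expected_tool:
            hits[lang] += 1
    return {lang: hits[lang] / totals[lang] for lang in totals}

def passes(call_model) -> bool:
    """A model passes only if every language scores 100%, mirroring the post's bar."""
    return all(rate == 1.0 for rate in success_rates(call_model).values())
```

Requiring 100% in every language (rather than a pooled average) is what makes the bar strict: one wrong tool choice in any language fails the whole model.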

submitted by /u/MaruluVR