Gemma 4 1B, 13B, and 27B spotted

Reddit r/LocalLLaMA / 4/3/2026

💬 Opinion · Signals & Early Trends · Models & Research

Key Points

  • Gemma 4 is a multimodal model with pretrained and instruction-tuned variants released in three sizes: 1B, 13B, and 27B parameters.
  • The model’s core architecture is largely consistent with prior Gemma versions, but it adds a dedicated vision processor that encodes input images into a fixed token budget (see the sketch after this list).
  • For vision understanding, Gemma 4 introduces a spatial 2D RoPE mechanism that encodes positional information across both the height and width dimensions of the image patch grid.
  • Original Gemma 4 checkpoints for all variants are available via a Hugging Face collection link.
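
The fixed token budget is what lets arbitrarily sized images fit into a constant slice of the context window. Below is a minimal sketch of the general resize-patchify-pool idea, not Gemma 4's actual pipeline: the 896-px input size, 16-px patches, and 256-token budget are illustrative placeholders, and in a real model the patch features would come from a vision encoder rather than raw pixels.

```python
import torch
import torch.nn.functional as F

def image_to_fixed_tokens(image: torch.Tensor, budget_side: int = 16) -> torch.Tensor:
    """(3, H, W) image -> (budget_side**2, dim) tokens, whatever H and W are."""
    # Resize every input to one fixed resolution (896 px is an assumption).
    x = F.interpolate(image[None], size=(896, 896), mode="bilinear", align_corners=False)
    # Cut into non-overlapping 16x16 patches; a real model would instead take
    # features from a vision encoder here (raw pixels stand in for clarity).
    patches = F.unfold(x, kernel_size=16, stride=16)    # (1, 3*16*16, 56*56)
    grid = patches.reshape(1, -1, 56, 56)               # patch features on a 56x56 grid
    # Pool the 56x56 grid down to budget_side x budget_side = 256 tokens.
    pooled = F.adaptive_avg_pool2d(grid, budget_side)   # (1, 768, 16, 16)
    return pooled.flatten(2).transpose(1, 2)[0]         # (256, 768): fixed budget

tok = image_to_fixed_tokens(torch.randn(3, 1024, 640))  # any aspect ratio in
print(tok.shape)                                        # torch.Size([256, 768])
```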

[Gemma 4](INSET_PAPER_LINK) is a multimodal model with pretrained and instruction-tuned variants, available in 1B, 13B, and 27B parameter sizes. The architecture is largely the same as previous Gemma versions; the key differences are a vision processor that encodes images into a fixed token budget and a spatial 2D RoPE that encodes vision-specific positional information across the height and width axes.
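
Standard RoPE encodes a single 1D position per token, which throws away the grid structure of image patches. A spatial 2D variant can instead rotate one half of each head dimension by a patch's row index and the other half by its column index. The sketch below illustrates that idea, assuming a row-major patch grid and a half/half split of the head dimension; the function names and split scheme are assumptions, not Gemma 4's actual implementation.

```python
import torch

def rope_1d(x: torch.Tensor, pos: torch.Tensor, base: float = 10_000.0) -> torch.Tensor:
    """Standard RoPE over the last dim of x, using integer positions `pos`."""
    d = x.shape[-1]
    inv_freq = base ** (-torch.arange(0, d, 2, dtype=torch.float32) / d)
    angles = pos[:, None].float() * inv_freq[None, :]   # (tokens, d/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]                 # even/odd channel pairs
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin                # 2D rotation per pair
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

def spatial_2d_rope(x: torch.Tensor, h: int, w: int) -> torch.Tensor:
    """Rotate the first half of the head dim by row index and the second half
    by column index, so attention becomes aware of both image axes."""
    assert x.shape[0] == h * w, "expects one token per patch, row-major order"
    rows = torch.arange(h).repeat_interleave(w)         # y position of each patch
    cols = torch.arange(w).repeat(h)                    # x position of each patch
    half = x.shape[-1] // 2
    return torch.cat([rope_1d(x[..., :half], rows),
                      rope_1d(x[..., half:], cols)], dim=-1)

# Example: a 16x16 patch grid with 64-dim heads -> 256 vision tokens.
q = torch.randn(16 * 16, 64)
print(spatial_2d_rope(q, h=16, w=16).shape)             # torch.Size([256, 64])
```

Because each rotation only mixes channels within its own half, the relative-position property of 1D RoPE carries over independently per axis: the attention score between two patches depends on their row offset and column offset, not their absolute positions.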

You can find all the original Gemma 4 checkpoints under the [Gemma 4](https://huggingface.co/collections/google/gemma-4-release-67c6c6f89c4f76621268bb6d) release.
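
If the checkpoints follow the repo layout of earlier Gemma releases, loading should look roughly like the sketch below using the `transformers` image-text-to-text pipeline. The repo id `google/gemma-4-27b-it` is inferred from the collection's naming pattern and is an assumption; check the collection page for the actual names.

```python
# Requires: pip install transformers accelerate
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",            # multimodal chat-style task
    model="google/gemma-4-27b-it",   # assumed instruction-tuned repo id
    device_map="auto",               # shard across available devices
)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/cat.png"},
            {"type": "text", "text": "Describe this image in one sentence."},
        ],
    }
]

out = pipe(text=messages, max_new_tokens=64, return_full_text=False)
print(out[0]["generated_text"])
```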

submitted by /u/TKGaming_11