Gemma (language model)

Gemma is a series of source-available large language models developed by Google DeepMind. It is based on similar technologies as Gemini. The first version was released in February 2024, followed by Gemma 2 in June 2024, Gemma 3 in March 2025, and the free and open-source Gemma 4 in April 2026. Variants of Gemma have also been developed, such as the vision-language model PaliGemma and MedGemma, a model for medical applications.

History

In February 2024, Google debuted Gemma, a collection of source-available LLMs that serve as a lightweight version of Gemini. The initial release came in two sizes: neural networks with two and seven billion parameters, respectively. Multiple publications viewed this as a response to competitors such as Meta releasing source code for their AI models, and a shift from Google's longstanding practice of keeping its AI source code private.

Gemma 2 was released on June 27, 2024, and Gemma 3 was released on March 12, 2025. On April 2, 2026, Google released Gemma 4 under the free and open-source Apache 2.0 license.

Overview

Based on similar technologies as the Gemini series of models, Gemma is described by Google as helping support its mission of "making AI helpful for everyone." Google offers official Gemma variants optimized for specific use cases, such as MedGemma for medical analysis.

Since its release, Gemma models have been downloaded over 150 million times, with 70,000 variants available on Hugging Face.

Gemma 3 was offered in 1, 4, 12, and 27 billion parameter sizes with support for over 140 languages. As multimodal models, they support both text and image input. Google also offers Gemma 3n, smaller models optimized for execution on consumer devices like phones, laptops, and tablets.

The latest generation of models is Gemma 4, released on April 2, 2026. It is available in four sizes: Effective 2B (E2B), Effective 4B (E4B), 26B Mixture of Experts (MoE), and 31B Dense. Gemma 4 supports multimodal input, including images and video across all models, with native audio input on the E2B and E4B models. Gemma 4's 31B Dense variant reached third place on Arena's text leaderboard, and the 26B variant reached sixth place.

Quantized versions fine-tuned using quantization-aware training (QAT) are also available.
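Quantization-aware training works by simulating the quantizer in the forward pass so the weights adapt to rounding error during fine-tuning. Below is a minimal sketch of symmetric per-tensor fake quantization; the scheme and values are illustrative assumptions, not Google's actual QAT recipe:

```python
import numpy as np

def fake_quantize(w: np.ndarray, num_bits: int = 8) -> np.ndarray:
    """Round-trip weights through an int-N grid while keeping floats.

    Symmetric per-tensor scheme for illustration only; real QAT
    pipelines typically also quantize activations and rely on a
    straight-through estimator so gradients flow past the rounding op.
    """
    qmax = 2 ** (num_bits - 1) - 1        # e.g. 127 for int8
    scale = np.max(np.abs(w)) / qmax      # map the largest |weight| onto qmax
    if scale == 0.0:
        return w
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale                      # dequantized values on the int grid

w = np.array([0.02, -1.27, 0.5, 0.9999])
w_q = fake_quantize(w)                    # each value snaps to the int8 grid
```

Training against these round-tripped weights means the final model loses little accuracy when its weights are later stored as true integers.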

Variants

Google develops official variants of Gemma models designed for specific purposes, such as medical analysis or programming. These include PaliGemma, PaliGemma 2, MedGemma, and RecurrentGemma.

Note: open-weight models can have their context length rescaled at inference time. For Gemma 1, Gemma 2, PaliGemma, and PaliGemma 2, KV-cache size grows linearly with the context window. Gemma 3 has an improved growth curve because it separates local and global attention, so only the global-attention layers scale with the full context. With RecurrentGemma, memory use is unchanged after 2,048 tokens.
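The difference in growth curves can be sketched numerically. The layer counts, local-to-global ratio, window size, and head dimensions below are illustrative assumptions for the sketch, not official Gemma configurations:

```python
def kv_cache_elems(context_len, n_layers, n_kv_heads, head_dim, window=None):
    """Elements held in the KV cache for one sequence: two tensors
    (K and V) per layer, each of shape (tokens, kv_heads, head_dim).
    A sliding-window (local-attention) layer caches at most `window`
    tokens, so its cost stops growing once the window is full."""
    tokens = context_len if window is None else min(context_len, window)
    return 2 * n_layers * tokens * n_kv_heads * head_dim

def mixed_stack_cache(context_len, n_local, n_global, window,
                      n_kv_heads, head_dim):
    """Gemma-3-style stack: bounded local layers plus a few global
    layers that still scale linearly with the full context."""
    return (kv_cache_elems(context_len, n_local, n_kv_heads, head_dim, window)
            + kv_cache_elems(context_len, n_global, n_kv_heads, head_dim))

# Illustrative configuration: 30 layers, 25 local / 5 global,
# a 1,024-token window, and 8 KV heads of dimension 128.
dense = kv_cache_elems(131_072, 30, 8, 128)            # all layers global
mixed = mixed_stack_cache(131_072, 25, 5, 1_024, 8, 128)
# At 128K context the mixed stack caches under a fifth of the dense
# stack's entries, since 25 of its 30 layers are bounded by the window.
```

A recurrent layer, by contrast, carries a fixed-size state, which is why RecurrentGemma's memory stays flat once its window is exhausted.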

See also

* List of large language models
* Lists of open-source artificial intelligence software
