Gemini (language model)
{{Infobox software
| developer = Google DeepMind
| released = (beta version) (official rollout)
| latest release version = 3.1 Pro, 3 Deep Think, 3 Flash, 3.1 Flash Lite
}}

Gemini is a family of multimodal large language models developed by Google DeepMind. It was announced on December 6, 2023, and powers the chatbot of the same name.
## History

### Development

Google announced Gemini, a large language model (LLM) developed by its subsidiary Google DeepMind, during the Google I/O keynote on May 10, 2023. It was positioned as a more powerful successor to PaLM 2, which was also unveiled at the event; Google CEO Sundar Pichai stated that Gemini was still in its early developmental stages.
### Updates

In January 2024, Google partnered with Samsung to integrate Gemini Nano and Gemini Pro into its Galaxy S24 smartphone lineup.
On December 11, 2024, Google announced the new Gemini 2.0 Flash Experimental model. Its features included a Multimodal Live API for real-time audio and video interactions, native image generation and controllable text-to-speech (with watermarking), and integrated Google Search. It also introduced improved agentic capabilities, a new Google Gen AI SDK, and "Jules", an experimental AI coding agent for GitHub.
On January 30, 2025, Google released Gemini 2.0 Flash as the new default model, with Gemini 1.5 Flash still available for usage. This was followed by the release of Gemini 2.0 Pro on February 5, 2025. Additionally, Google released Gemini 2.0 Flash Thinking Experimental, which generates a summary of the language model's thinking process when responding to prompts.
On March 12, 2025, Google also announced Gemini Robotics, a vision-language-action model based on the Gemini 2.0 family of models. The next day, Google announced that Gemini in Android Studio would be able to understand simple UI mockups and transform them into working Jetpack Compose code.
Gemini 2.5 Pro Experimental was released on March 25, 2025. Gemini 2.5 Pro was introduced as the most advanced Gemini model, featuring better coding capabilities and other improvements. Both 2.5 Pro and Flash support native audio output.
On June 17, 2025, Google announced general availability for 2.5 Pro and Flash. The same day, it introduced Gemini 2.5 Flash-Lite, a model optimized for speed and cost efficiency.
On November 18, 2025, Google announced the release of 3 Pro and 3 Deep Think, which replaced 2.5 Pro and Flash. The launch prompted OpenAI to accelerate the release of its competing model, GPT-5.2, which followed on December 11.
On December 4, 2025, Google announced that 3 Deep Think would start rolling out to Ultra subscribers.
On December 17, 2025, Google announced the release of 3 Flash, replacing 2.5 Flash.
On January 12, 2026, Apple announced plans to use the Gemini AI model in the upcoming version of Siri.
On February 19, 2026, Google released Gemini 3.1 Pro.
On February 26, 2026, Nano Banana 2 was released. It is an updated version built on the Gemini 3.1 Flash Image platform.
On March 3, 2026, Google released Gemini 3.1 Flash Lite to developers via the Google API.
## Model versions

The following table lists the main model versions of Gemini, describing the significant changes included with each version:
## Nano Banana

[Image: AI-generated abstract art created with Nano Banana 2 in Gemini]

Nano Banana (officially Gemini 2.5 Flash Image), Nano Banana Pro (officially Gemini 3 Pro Image), and Nano Banana 2 (officially Gemini 3.1 Flash Image) are image generation and editing models.
"Nano Banana" was the codename used for the model while it was undergoing secret public testing on Arena. It first appeared publicly as an anonymous model on the crowd-sourced AI evaluation platform Arena on August 12, 2025. It was released publicly on August 26, 2025 through the Gemini app and related Google AI services. The nickname "Nano Banana" originated from nicknames given to Naina Raisinghani, Product Manager at Google DeepMind. Google later confirmed its identity as Gemini 2.5 Flash Image in an official announcement upon public release. On November 20, 2025, DeepMind released Nano Banana Pro (Gemini 3 Pro Image) with improved text rendering and world knowledge.
Upon release, Nano Banana became a viral Internet sensation on social media, particularly for its photorealistic "3D figurine" images. Following its release, Nano Banana was made available in the Gemini app, Google AI Studio, and through Vertex AI. According to Google, it helped attract over 10 million new users to the Gemini app and facilitated more than 200 million image edits within weeks of launch.
The model lets users change hairstyles and backdrops and blend photos using natural-language prompts. Subject consistency allows the same person or item to be recognized across revisions of an image; multi-image fusion joins photographs into one seamless output; and world knowledge allows context-aware changes. It also applies SynthID watermarking, an invisible digital signature embedded in outputs to identify AI-generated content.

Nano Banana became associated with a viral trend in which users turned their selfies into toy-like 3D figurines, which spread quickly on platforms such as Instagram and X (formerly Twitter). After the model was integrated into X, users could tag Nano Banana directly in posts to generate images from prompts, further boosting its popularity. A review in Tom's Guide praised its ability to handle creative and lively image edits, while a review in PC Gamer noted that the model lacked some basic editing tools, such as cropping, and that it sometimes failed to apply requested changes, reverting to the original image instead.
On February 26, 2026, Nano Banana 2 was rolled out and integrated into the Gemini chatbot, Search AI Mode, and Lens. It is a faster version built on Gemini 3.1 Flash Image, with better instruction following and text rendering.
## Technical specifications

As Gemini is multimodal, each context window can contain multiple forms of input. The different modes can be interleaved and do not have to be presented in a fixed order, allowing for a multimodal conversation. For example, the user might open the conversation with a mix of text, pictures, video, and audio, presented in any order, and Gemini might reply with the same free ordering. Input images may be of different resolutions, while video is input as a sequence of frames. Audio is sampled at 16 kHz and then converted into a sequence of tokens by the Universal Speech Model. Gemini's dataset is multimodal and multilingual, consisting of "web documents, books, and code, and includ[ing] image, audio, and video data".
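The free interleaving described above can be illustrated with a short sketch. The `Part` type, file names, and payload fields below are hypothetical placeholders for illustration only, not the request schema of any actual Gemini SDK; the sketch shows only that modalities may appear in any order within a single turn.

```python
from dataclasses import dataclass
from typing import Literal

# Hypothetical representation of one piece of a multimodal turn.
@dataclass
class Part:
    modality: Literal["text", "image", "video", "audio"]
    payload: str  # placeholder for raw bytes or a file reference

# A single user turn may interleave modalities in any order.
user_turn = [
    Part("text", "What is happening in this clip?"),
    Part("video", "clip.mp4"),               # video is fed as a sequence of frames
    Part("audio", "narration_16khz.wav"),    # audio is sampled at 16 kHz, then tokenized
    Part("text", "Compare it with this photo:"),
    Part("image", "photo.png"),
]

# No fixed ordering is required: text, video, audio, and image interleave freely.
modalities = [p.modality for p in user_turn]
print(modalities)  # ['text', 'video', 'audio', 'text', 'image']
```

A real request would carry encoded media rather than file names, but the ordering freedom is the point: the model consumes the parts as one token sequence in the order given.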
Gemini 2.5 Pro Experimental debuted at the top position on the LMArena leaderboard, a benchmark measuring human preference.