Just under a year after the launch of Gemini 1.5, Google's DeepMind division returned on Wednesday to unveil its next-generation AI model, Gemini 2.0. The new model adds native image and audio generation and, according to the company's blog post, will help power new AI agents in service of Google's goal of building a universal assistant.
Starting Wednesday, Gemini 2.0 is available across all subscription tiers, including the free one. As Google's new flagship AI model, it will gradually roll out to AI features across the company's ecosystem. Like OpenAI's earlier o1 release, though, the initial version of Gemini 2.0 is labeled an "experimental preview," with a more robust version to follow in the coming months.
"In a way," Google DeepMind CEO Demis Hassabis told The Verge, "it matches the capabilities of the current Pro model. You can view it as an entire tier ahead, with equal cost efficiency, performance, and speed. We're thrilled with the result."
Additionally, Google is introducing a streamlined version of this model, called Gemini 2.0 Flash, aimed specifically at developers.
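For developers, access to Flash runs through the Google AI Python SDK. Below is a minimal sketch of a single-turn call; the model ID "gemini-2.0-flash-exp" and the placeholder API key are assumptions based on Google's public documentation at launch, not details from this announcement.

```python
# A minimal sketch of calling Gemini 2.0 Flash via the Google AI Python SDK
# (pip install google-generativeai). The model ID "gemini-2.0-flash-exp" is
# an assumption based on Google's launch-era documentation.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # key obtained from Google AI Studio

# Instantiate the experimental Flash model and send a single-turn prompt.
model = genai.GenerativeModel("gemini-2.0-flash-exp")
response = model.generate_content("Summarize the key features of Gemini 2.0.")
print(response.text)
```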
This latest release marks a significant step in Google's agentic ambitions, in which smaller, specialized models carry out tasks autonomously on a user's behalf. Gemini 2.0 is expected to greatly accelerate Google's Project Astra, which combines Gemini Live's conversational abilities with real-time video and image analysis, letting users get information about their surroundings, eventually through smart glasses.
On the same day, Google also presented Project Mariner, a Chrome extension that can operate a web browser much as a person would, handling keystrokes and mouse clicks on the user's behalf. Google also introduced Jules, an AI coding assistant that helps developers find and fix problems in their code, and a new feature called "Deep Research," which compiles comprehensive reports from online searches.
Deep Research, which is similar in function to Perplexity AI and ChatGPT Search, is available now to Gemini Advanced subscribers in English. The system begins by drafting a multi-step research plan, which it presents to the user for approval before executing.
After receiving the go-ahead, the research agent digs into the specified topic and explores related avenues. Once its investigation is complete, it delivers a report of its findings, complete with key insights and citation links. Users can select the feature from the model drop-down menu at the top of the Gemini homepage.
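To make the described plan-approve-execute flow concrete, here is an illustrative sketch of how such an agent loop could be structured. None of these functions correspond to a public Google API; draft_plan and the ResearchStep type are hypothetical stand-ins for the behavior described above.

```python
# Illustrative sketch of a Deep Research-style plan-approve-execute loop.
# draft_plan and ResearchStep are hypothetical; they are not part of any
# public Google API.
from dataclasses import dataclass, field

@dataclass
class ResearchStep:
    description: str
    findings: list[str] = field(default_factory=list)

def draft_plan(topic: str) -> list[ResearchStep]:
    """Stand-in for the model drafting a multi-step research plan."""
    return [
        ResearchStep(f"Search the web for background on {topic}"),
        ResearchStep(f"Explore subtopics related to {topic}"),
        ResearchStep(f"Collect sources and citations for {topic}"),
    ]

def deep_research(topic: str) -> str | None:
    plan = draft_plan(topic)
    # The plan is shown to the user for approval before any searching begins.
    print("Proposed plan:")
    for i, step in enumerate(plan, 1):
        print(f"  {i}. {step.description}")
    if input("Approve plan? [y/n] ").strip().lower() != "y":
        return None
    # After approval, the agent executes each step, gathering findings.
    for step in plan:
        step.findings.append(f"(results of: {step.description})")
    # Finally, it compiles a report with key insights and citation links.
    body = "\n".join(f for step in plan for f in step.findings)
    return f"Report on {topic}:\n{body}"
```

The key design point the announcement emphasizes is the approval gate: the agent does no browsing until the user signs off on the plan, which keeps a potentially long-running autonomous search under user control.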