In a surprising move last Wednesday, Google unveiled Gemini, its most advanced artificial intelligence model to date. The release came amid widespread speculation that the company would postpone the launch until the following year. Google says that Gemini outperforms OpenAI’s GPT-3.5 model and is a strong competitor to the latest GPT-4.
What is Google Gemini?
Gemini is Google’s latest and most powerful AI model, capable of understanding not just text but also images, video, and audio. As a multimodal model, Gemini excels at complex tasks in mathematics, physics, and other fields, and can understand and generate high-quality code in various programming languages.
Google’s official blog post states that Gemini was designed as a multimodal model from the start, unlike current AI models that typically handle only one type of user prompt, such as images or text exclusively. Gemini can work with multiple types of input, including text, images, audio, video, and code in different programming languages. The goal behind developing Gemini is to create an AI that can accurately solve problems, offer advice, and answer questions across various fields, from everyday matters to scientific domains.
How Did Google Develop Gemini?
Google describes the Gemini model as flexible, able to run everywhere from Google data centers to smartphones. To achieve this scalability, the company has introduced it in three versions with varying capabilities: Nano, Pro, and Ultra.
Gemini Nano Model: The Gemini Nano model was designed for use in smartphones, with the Google Pixel 8 phones being the first to feature it. It is engineered for tasks that require quick AI processing on the phone itself, without connecting to external servers, such as suggesting replies in chat applications or summarizing text.
The Gemini Nano model runs on Google’s latest Tensor G3 chip. It supports many of the features Google launched on Pixel phones last October, such as ‘Summarize in Recorder’, which summarizes recorded audio clips in the Recorder app, and smart replies in Google’s Gboard keyboard app. Smart replies will initially be available in WhatsApp, with plans to extend them to more messaging applications in 2024.
Notably, because the Gemini Nano model runs on the neural processing unit within the Tensor G3 chip, Pixel users’ data is processed locally on their devices and no information is left on Google’s servers, which protects their privacy. Local processing also keeps AI features fast and lets them work without an internet connection.
In 2024, Google Assistant will incorporate the advanced capabilities of the Bard chatbot, though only on Google Pixel phones.
Gemini Pro Model: Google developed the Gemini Pro model to run in its data centers, where it powers the latest version of its Bard chatbot. It is designed to support advanced capabilities in text analysis and generation, coding, and planning, and to handle various forms of input, such as text, images, video, and audio, at the same time.
Google’s official blog stated that the Gemini Pro model would initially help Bard process text-based requests quickly. The model will be rolled out to Bard in two phases:
- The first phase begins with a specially tuned version of Gemini Pro in English, available in 170 countries, with further updates to cover more countries and support more languages in the near future.
- The second phase will start early next year with the launch of Bard Advanced, the most sophisticated version of Bard, which will initially rely on the Gemini Ultra model, the most advanced of the three Gemini versions.
The Gemini Pro model outperformed the GPT-3.5 model in six of the eight benchmarks Google ran before unveiling its new model. These include MMLU, a leading benchmark for measuring how well large language models handle a wide range of language-understanding tasks, and GSM8K, which tests a model’s ability to solve grade-school math word problems.