With the launch of its Gemini large language model (LLM), Google has put itself back in contention with OpenAI’s GPT-4 and Meta’s open-source LLaMa-2 in the battle to inspire developer interest.

However, the launch also involved presenting a heavily edited, ‘faked’ demo of the most powerful version of Gemini as if it were a live, real-time interaction with a human displaying advanced machine reasoning abilities. In reality, it was nothing of the kind, as Google has since acknowledged.

Meanwhile, Gemini has taken multimodality, the range of data types through which it can interact with users and process information, to a new level by adding sound and video to text, images, and code.
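For a sense of what that multimodality looks like to developers, the sketch below assumes the google-generativeai Python SDK, a placeholder API key, and the ‘gemini-pro’ and ‘gemini-pro-vision’ model names Google exposed at launch; it sends a text-only prompt and then a combined image-and-text prompt. It is illustrative only, and audio and video inputs were not yet available through this public API.

```python
# Illustrative sketch: prompting Gemini Pro with text, then with an image plus text.
# Assumes the google-generativeai SDK (pip install google-generativeai) and an API key.
import google.generativeai as genai
import PIL.Image

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

# Text-only prompt against the Gemini Pro model that also powers Bard.
text_model = genai.GenerativeModel("gemini-pro")
print(text_model.generate_content("Summarise what a multimodal LLM is.").text)

# Multimodal prompt: a local image plus a text question, against the vision variant.
vision_model = genai.GenerativeModel("gemini-pro-vision")
image = PIL.Image.open("chart.png")  # any local image file
print(vision_model.generate_content([image, "What does this chart show?"]).text)
```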

Gemini: Twins become triplets  

Gemini comes in three versions: Pro, Nano, and Ultra. Ultra, the source of the now-notorious ‘fake’ demo, is still in beta, while Pro is already incorporated into Bard, Google’s conversational AI. Nano is the only technology of its kind that integrates directly into mobile devices, starting with Google’s own Pixel 8.

Nano is of particular interest, as it brings the prospect of new AI capabilities to the Android world. It is safe to assume that Samsung, Qualcomm, and MediaTek smartphone silicon will be optimized to run Nano.

Gemini will progressively bring new AI capabilities to Google’s four billion-strong user base spread across Google Search, Gmail, Chrome, and Android. However, Ultra’s target base of data centers and large enterprise apps may have to wait until 2025 to assess its relative capabilities, most notably against OpenAI’s GPT-5 and Meta’s LLaMa 3, both of which are likely to be in play by then.

Technology on hold

This is because, despite all the noise, the technology is at a holding stage while work intensifies to amalgamate other AI techniques with LLMs to yield a step change. According to Google DeepMind’s Demis Hassabis, only limited progress can be made by building ever-larger LLMs operating on a statistical, pattern recognition basis.

Achieving something like artificial general intelligence (AGI) will take the kind of self-teaching, deep reinforcement learning techniques that enabled DeepMind’s legendary breakthroughs with AlphaGo and AlphaFold, along with reasoning and planning and possibly knowledge representation capabilities, amalgamated with transformer LLM technology.

Over at OpenAI, after last month’s pantomime over the Altman firing and reinstatement, and the rumours of a radical breakthrough with the Q* algorithm, the signals are that it will be 2025 before GPT-5 is ready to ship. It has been noted that Noam Brown, who made breakthroughs in ‘AI self-play’ and reasoning in games such as poker (at Carnegie Mellon) and Diplomacy (at Meta), is now heading up research at OpenAI on how to make these methods truly general. The battle is on between Hassabis and Altman for dominance of a new AI-infested world economy, which could eventually make Gemini seem like an anti-climax.