Gemini Ultra

Updated Jun 27, 2026

Gemini Ultra (branded Ultra 1.0) was the largest and most capable model in the Gemini 1.0 family, the first generation of natively multimodal large language models from Google DeepMind. It was announced on December 6, 2023. Google reported Ultra as the first AI model to score above the human-expert baseline on the MMLU knowledge benchmark, reaching 90.0% versus a human-expert score of 89.8%.^[1]^[2]^[8] Demis Hassabis, the CEO of Google DeepMind, called it "the most capable and general model we've ever built," and Google positioned it as a direct rival to OpenAI's GPT-4.^[1] Unlike the smaller Gemini tiers, which shipped almost immediately, Ultra did not reach the public until February 8, 2024, when it became the engine behind the paid Gemini Advanced product, sold through a Google One AI Premium plan at $19.99 per month.^[3]^[7]

What is Gemini Ultra?

Gemini Ultra was the flagship tier of the three-model Gemini 1.0 lineup that Google introduced on December 6, 2023, alongside the general-purpose Gemini Pro and the on-device Gemini Nano. It was the model behind Google's headline claim that Gemini could outperform a human-expert baseline on MMLU, a widely used knowledge benchmark. Sundar Pichai, the CEO of Google, framed the release as foundational for the company, saying "this new era of models represents one of the biggest science and engineering efforts we've undertaken as a company."^[1] Google designed Gemini "to be natively multimodal, pre-trained from the start on different modalities," meaning text, code, images, audio, and video were learned together rather than having separate modalities bolted on afterward.^[1]^[2]

Background

Google introduced the Gemini family at a virtual press briefing led by chief executive Sundar Pichai and DeepMind chief executive Demis Hassabis. The launch was the company's most prominent answer to ChatGPT, which had reportedly prompted a "code red" inside Google a year earlier. Gemini 1.0 was offered in three sizes tuned for different jobs: Ultra for the most demanding tasks, Pro as a general-purpose model, and Nano for running on devices such as the Pixel 8 Pro. Google described all three as "natively multimodal."^[1]^[2]^[4]

At announcement, Pro was already powering Google's Bard chatbot and was being rolled out to developers. Ultra, by contrast, was held back. Google said it was still undergoing "trust and safety checks, including red-teaming by trusted external parties," and would arrive in a forthcoming product first referred to as "Bard Advanced."^[1]^[2] That gap between announcement and release became one of the first points of friction in the model's reception, because reporters were not given hands-on access or live demonstrations at the briefing.^[5]

What could Gemini Ultra do?

Google presented Ultra as its state-of-the-art model across text, image, audio, and video understanding, with particular strength in tasks that require multi-step reasoning, coding, and the interpretation of complex visual or scientific material. Because the model was trained natively across modalities, Google emphasized examples in which it reasoned jointly over images and text, extracted information from charts and documents, and worked through problems in mathematics and physics.^[1]^[2] On code generation, the technical report credited Ultra with strong results, including a 74.4% score on the HumanEval Python benchmark, and Google framed coding as one of Ultra's standout abilities.^[2]^[6]

The model was the basis for Gemini Advanced, where it was branded "Ultra 1.0." Google said this configuration was far more capable than Pro at highly complex tasks such as following nuanced instructions, collaborative coding, and detailed reasoning, and that it would gain features over time, including the ability to handle longer interactions and larger amounts of context.^[3]^[7]

How did Gemini Ultra perform on benchmarks?

Google's launch materials leaned heavily on benchmark comparisons against GPT-4, reporting that Ultra's performance exceeded the prior state of the art on 30 of the 32 widely used academic benchmarks.^[1]^[2] The most-publicized figure was a 90.0% score on Massive Multitask Language Understanding (MMLU), a 57-subject test of knowledge and problem-solving. Google called Ultra "the first model to outperform human experts" on MMLU, citing a human-expert baseline of 89.8%.^[1]^[2]^[8]

That claim carried an important caveat that drew contemporary criticism. The 90.0% result (reported as 90.04% in the technical report) was obtained using a method Google called uncertainty-routed chain-of-thought with 32 samples, abbreviated CoT@32, in which the model generates many reasoning chains and selects an answer based on its confidence. On the same CoT@32 method, GPT-4 scored 87.29%. But when both models were compared with the more conventional 5-shot prompting setup, Ultra scored 83.7% and GPT-4 scored 86.4%, meaning GPT-4 actually came out ahead on the apples-to-apples comparison.^[6]^[9] Critics, including commentators on Hacker News and academics quoted in trade coverage, argued that comparing Ultra's best-case prompting strategy against GPT-4's standard one made the headline table misleading.^[9]^[10] Reviewers also noted that across many benchmarks the margins were narrow, and that on at least one commonsense-reasoning test (HellaSwag) Ultra trailed GPT-4 by a wide margin.^[5]
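
The routing idea behind CoT@32 can be illustrated with a minimal sketch. It assumes the commonly described form of the method: take the majority answer across the sampled reasoning chains when agreement is high enough, and otherwise fall back to a single greedy answer. The function name, the 0.6 threshold, and the sample data are hypothetical and are not taken from Google's implementation.

```python
from collections import Counter
from typing import Sequence


def uncertainty_routed_answer(
    sampled_answers: Sequence[str],
    greedy_answer: str,
    threshold: float = 0.6,  # hypothetical value, chosen only for illustration
) -> str:
    """Return the majority answer if enough samples agree, else the greedy answer."""
    if not sampled_answers:
        return greedy_answer
    top_answer, votes = Counter(sampled_answers).most_common(1)[0]
    agreement = votes / len(sampled_answers)
    return top_answer if agreement >= threshold else greedy_answer


# Final answers from 32 sampled reasoning chains for one question (made-up data).
confident = ["B"] * 24 + ["C"] * 8                         # 75% agree on "B"
uncertain = ["A"] * 10 + ["B"] * 9 + ["C"] * 7 + ["D"] * 6  # no clear consensus

print(uncertainty_routed_answer(confident, greedy_answer="C"))  # B
print(uncertainty_routed_answer(uncertain, greedy_answer="D"))  # D
```

Because the method draws on 32 samples per question, it is not directly comparable to a single 5-shot pass, which is the mismatch critics highlighted.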

The table below summarizes figures from Google's December 2023 technical report. Where two methods are shown for MMLU, both models are compared under each method.

Benchmark	Gemini Ultra	GPT-4	Notes
MMLU (CoT@32, uncertainty-routed)	90.04%	87.29%	Basis for the "first to beat human experts" claim; human-expert baseline 89.8%
MMLU (5-shot)	83.7%	86.4%	Standard prompting; GPT-4 scores higher
MMMU (multimodal)	59.4%	56.8%	GPT-4 figure is GPT-4V; routing method boosts Ultra to 62.4%
GSM8K (grade-school math)	94.4%	92.0%	Math word problems
MATH (competition math)	53.2%	52.9%
HumanEval (code)	74.4%	67.0%	Python code generation

Sources: Gemini technical report and Google's announcement.^[2]^[6]^[8]
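
To make the size of the gaps concrete, the short script below computes Ultra's lead over GPT-4 in percentage points for each row of the table above. It is only an arithmetic illustration; the scores are copied from the table, and the variable and function names are made up for this example.

```python
# Scores (percent) copied from the table above: (benchmark, Gemini Ultra, GPT-4).
ROWS = [
    ("MMLU (CoT@32, uncertainty-routed)", 90.04, 87.29),
    ("MMLU (5-shot)", 83.7, 86.4),
    ("MMMU (multimodal)", 59.4, 56.8),
    ("GSM8K", 94.4, 92.0),
    ("MATH", 53.2, 52.9),
    ("HumanEval", 74.4, 67.0),
]


def margin(ultra: float, gpt4: float) -> float:
    """Ultra's lead over GPT-4 in percentage points (negative means GPT-4 leads)."""
    return round(ultra - gpt4, 2)


margins = {name: margin(ultra, gpt4) for name, ultra, gpt4 in ROWS}

for name, points in margins.items():
    leader = "Ultra" if points > 0 else "GPT-4"
    print(f"{name:<36} {points:+.2f} pts ({leader} ahead)")
```

On these figures the gaps range from 0.3 points on MATH to 7.4 points on HumanEval, and the 5-shot MMLU row is the only one where GPT-4 leads, by 2.7 points.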

How much did Gemini Ultra cost, and how was it sold?

Ultra reached the public on February 8, 2024. On the same day, Google renamed Bard to Gemini across its services and launched Gemini Advanced, "a new experience that gives you access to Ultra 1.0." Access was sold through a new Google One AI Premium plan priced at $19.99 per month in the United States, which also bundled 2 TB of storage and the rest of Google One's features, and which opened with a two-month free trial.^[3]^[7]^[11] The price effectively matched OpenAI's $20-per-month ChatGPT Plus subscription.^[12]

Gemini Advanced launched in more than 150 countries and territories, initially in English only, with Japanese and Korean named as the next languages on the roadmap. Google also introduced a dedicated Gemini app for Android, which could replace Google Assistant, and brought Gemini to the Google app on iOS in the following weeks.^[3]^[7]^[11] At the 2025 Google I/O conference, the Google One AI Premium plan and the Gemini Advanced brand were folded into a new tier called Google AI Pro, while the $19.99 monthly price point in the US remained.^[13]

How was Gemini Ultra received?

Press coverage of Ultra was attentive but skeptical. TechCrunch wrote that Ultra "scores only marginally better than GPT-4" across many benchmarks and pointed to the narrow gaps: 94.4% versus 92.0% on grade-school math (GSM8K), 82.4% versus 80.9% on one reading-comprehension test, and just 0.6 percentage points on an image-understanding task, while noting Ultra fell "a fair bit behind" GPT-4 on commonsense reasoning.^[5] The MMLU methodology mismatch became the central technical complaint, with observers stressing that the headline 90% figure required the multi-sample CoT@32 setup rather than a like-for-like evaluation.^[9]^[10]

Once Gemini Advanced was in users' hands, the reaction was mixed. A number of testers reported that the product did not feel decisively better than GPT-4 in everyday use, and some pointed to overcautious refusals on sensitive topics.^[12]^[14] The mismatch between benchmark claims and lived experience became a recurring theme in commentary about the model.

What replaced Gemini Ultra?

No successor model named Ultra was ever released. When Google moved to its next generation, Gemini 1.5, it shipped Gemini 1.5 Pro and a smaller Gemini 1.5 Flash but never a 1.5 Ultra, and the company did not commit to whether an Ultra-class 1.5 model would arrive. The same pattern held through the 2.x generations, which centered on Pro and Flash variants alongside dedicated reasoning configurations. In effect, the Ultra tier as a distinct model line ended with version 1.0, with its role absorbed by improved Pro models and, later, by reasoning-focused systems.^[15]^[16]

The "Ultra" name did resurface in 2025, but as a subscription tier rather than a model: Google AI Ultra, a high-end plan introduced at I/O 2025, bundled access to Google's most advanced models and features at a far higher price than the original Gemini Advanced subscription. The underlying models in that plan were later Gemini versions, not a revived Gemini Ultra model.^[13]^[16] The original Gemini 1.0 models, including Ultra, were eventually retired as Google's API shifted to newer generations.^[16]

References

1. Google, "Introducing Gemini: our largest and most capable AI model," The Keyword (Google blog), December 6, 2023. https://blog.google/technology/ai/google-gemini-ai/
2. Google DeepMind, "Introducing Gemini, our largest and most capable AI model" (announcement summary), December 6, 2023. https://blog.google/technology/ai/google-gemini-ai/
3. Sissie Hsiao, "Bard becomes Gemini: Try Ultra 1.0 and a new mobile app today," The Keyword (Google blog), February 8, 2024. https://blog.google/products/gemini/bard-gemini-advanced-app/
4. "Gemini (language model)," Wikipedia. https://en.wikipedia.org/wiki/Gemini_(language_model)
5. Kyle Wiggers, "Google's Gemini isn't the generative AI model we expected," TechCrunch, December 6, 2023. https://techcrunch.com/2023/12/06/googles-gemini-isnt-the-generative-ai-model-we-expected/
6. Gemini Team, Google, "Gemini: A Family of Highly Capable Multimodal Models," arXiv:2312.11805, December 2023. https://arxiv.org/abs/2312.11805
7. Frederic Lardinois, "Google goes all in on Gemini and launches $20 paid tier for Gemini Ultra," TechCrunch, February 8, 2024. https://techcrunch.com/2024/02/08/google-goes-all-in-on-gemini-and-launches-20-paid-tier-for-gemini-ultra/
8. "Gemini AI model parameters and performance benchmarks," ML Journey. https://mljourney.com/gemini-ai-model-parameters-and-performance-benchmarks/
9. "The table is highly misleading. It uses different methodologies all over the place," Hacker News discussion, December 2023. https://news.ycombinator.com/item?id=38546442
10. GlobalData, "Google Gemini AI evokes debates among influencers on evaluation criteria and GPT-4 comparison," December 2023. https://www.globaldata.com/media/business-fundamentals/google-gemini-ai-evokes-debates-among-influencers-on-evaluation-criteria-and-gpt-4-comparison-finds-globaldata/
11. Abner Li, "Google One AI Premium is $19.99/mo with Gemini Advanced & Gemini for Workspace," 9to5Google, February 8, 2024. https://9to5google.com/2024/02/08/google-one-ai-premium/
12. Alberto Romero, "How Good Is Google Gemini Advanced?," The Algorithmic Bridge, February 2024. https://www.thealgorithmicbridge.com/p/how-good-is-google-gemini-advanced
13. "What Gemini features you get with Google AI Pro and AI Ultra," 9to5Google, January 16, 2026. https://9to5google.com/2026/01/16/google-ai-pro-ultra-features/
14. Kyle Wiggers, "We tested Google's Gemini chatbot, here's how it performed," TechCrunch, February 15, 2024. https://techcrunch.com/2024/02/15/we-tested-googles-gemini-chatbot-heres-how-it-performed/
15. Google Developers Blog, "Gemini 1.5 Pro and 1.5 Flash now available," 2024. https://developers.googleblog.com/en/gemini-15-pro-and-15-flash-now-available/
16. Google, "Models," Gemini API documentation. https://ai.google.dev/gemini-api/docs/models
