Gemini 2.0 Flash

Google DeepMind Large Language Models Multimodal AI

9 min read

Updated Jun 24, 2026

Suggest edit History Talk

RawGraph

Last edited

Jun 24, 2026

Fact-checked

In review queue

Sources

15 citations

Revision

v2 · 1,770 words

Fact-checks are independent of edits: a reviewer re-verifies the article against its sources and stamps the date. How we verify

Gemini 2.0 Flash is a fast, low-cost multimodal large language model built by Google DeepMind as the flagship workhorse of the Gemini 2.0 generation, designed for the agentic era with native tool use, a 1 million token context window, and the ability to generate images and audio directly. Google introduced it on December 11, 2024 and stated that the model "even outperforms 1.5 Pro on key benchmarks, at twice the speed" while costing far less to run ^[1]. It launched first as an experimental preview (gemini-2.0-flash-exp) and reached general availability on February 5, 2025, priced at $0.10 per million input tokens and $0.40 per million output tokens ^[1]^[3]^[7].

Gemini 2.0 Flash is distinguished from earlier Gemini models by native multimodal output: a single model accepts text, images, audio, and video as input and can return text, generated images, and synthesized speech without routing to separate systems. It powers Google's agentic prototypes, including Project Astra and Project Mariner, and was succeeded by the reasoning-native Gemini 2.5 line in March 2025.

Background

Gemini 2.0 Flash succeeded Gemini 1.5 Flash, the fast, cost-efficient tier of the previous generation. Google framed the new model as keeping the speed of 1.5 Flash while pushing quality upward, and it stated that 2.0 Flash "even outperforms 1.5 Pro on key benchmarks, at twice the speed" ^[1]. The launch was led by Google DeepMind CEO Demis Hassabis and CTO Koray Kavukcuoglu, who positioned the 2.0 family around three themes: multimodality, agentic capabilities, and speed ^[1]^[2]. Announcing the release, Google described the goal in terms of agents, writing that "with new advances in multimodality, like native image and audio output, and native tool use, it will enable us to build new AI agents that bring us closer to our vision of a universal assistant" ^[1].

The December 11 release was deliberately staged. Developers could begin testing an experimental build (gemini-2.0-flash-exp) immediately, while a chat-optimized version appeared in the consumer Gemini product the same day ^[1]^[4]. Broader general availability, additional sizes, and the more advanced output modalities were promised for early 2025 ^[1].

What can Gemini 2.0 Flash do?

Gemini 2.0 Flash accepts text, images, audio, and video as input. Its defining feature relative to earlier Gemini models is native multimodal output: the model can produce text, generate images, and synthesize speech from a single model rather than routing requests to a separate image or text-to-speech system ^[1]^[2].

Native multimodal output

At announcement, Google described two new output modes. The first was native image generation, in which the model interleaves generated images with text and supports multi-turn conversational editing, where a user can refine an image across successive prompts ^[1]^[5]. The second was steerable, multilingual text-to-speech audio, with Google citing a set of high-quality voices spanning multiple languages ^[1]^[2]. All generated images and audio carry an invisible SynthID watermark for provenance ^[1].

These output features were restricted to early-access partners at launch rather than being broadly available. Google opened native image generation to general developer experimentation through Google AI Studio and the Gemini API on March 12, 2025 ^[5]^[6]. The image, audio, and live-streaming capabilities were described as reaching wider availability "in the coming months" after the February general-availability milestone, which initially shipped with text output ^[3]^[7].

Native tool use

Gemini 2.0 Flash can call tools on its own as part of generating a response. Supported tools include Google Search, code execution, and developer-defined functions via function calling ^[2]^[3]. Google noted that the model can issue multiple searches in parallel to gather and combine information, which it presented as a building block for agent-style applications ^[2]. Alongside the model, Google released a Multimodal Live API that handles real-time audio and video streaming input, with support for natural conversational patterns such as interruptions and voice-activity detection ^[2].

How large is the context window?

Gemini 2.0 Flash provides a context window of 1 million tokens for input ^[3]^[7]. That matched the entry tier of the Gemini 1.5 line and was carried over to the budget-oriented Gemini 2.0 Flash-Lite. The larger Gemini 2.0 Pro Experimental, released the same day as the Flash general-availability launch, doubled this to 2 million tokens ^[3]^[7].

How does Gemini 2.0 Flash perform on benchmarks?

Google reported that Gemini 2.0 Flash improved on Gemini 1.5 Flash across a range of academic benchmarks and beat the larger Gemini 1.5 Pro on several of them while running roughly twice as fast ^[1]. The figures below are drawn from Google's published comparison for the experimental release. Where a directly comparable 1.5 Pro score was published, it is shown for reference ^[8]^[9].

Benchmark	Capability	Gemini 2.0 Flash	Gemini 1.5 Pro
MMLU-Pro	General knowledge and reasoning	76.4%	75.8%
Natural2Code	Code generation	92.9%	85.4%
MATH	Mathematics	89.7%	not published in set
MMMU	Multimodal (image) reasoning	70.7%	not published in set
FACTS Grounding	Factual grounding	83.6%	not published in set
CoVoST2	Audio translation	71.5%	not published in set

Independent reporting on the experimental model corroborated the headline MMLU-Pro, Natural2Code, MATH, MMMU, and FACTS Grounding figures ^[8]^[9]. Google did not publish a full 1.5 Pro column for every benchmark in the launch comparison, so several reference cells are left blank rather than estimated.

A separate reasoning-focused variant, Gemini 2.0 Flash Thinking Experimental, was released on December 19, 2024. Built on the 2.0 Flash base, it exposes a summary of its step-by-step reasoning before answering and was widely compared to OpenAI's o1 family of reasoning models ^[10]. Jeff Dean, Google DeepMind's chief scientist, described it as "an experimental model that explicitly shows its thoughts" and "trained to use thoughts to strengthen its reasoning" ^[14].

When was Gemini 2.0 Flash released, and how much does it cost?

During the experimental phase, developers reached Gemini 2.0 Flash through the Gemini API in Google AI Studio and in Vertex AI ^[1]^[2]. On the consumer side, a chat-optimized version became selectable in the model drop-down of the Gemini app on desktop and mobile web from December 11, 2024, for both standard and Advanced users, with the mobile app following shortly after ^[1]^[4]. According to Wikipedia's account of the rollout, the model became the default in the Gemini app on January 30, 2025 ^[11].

At general availability on February 5, 2025, Google described 2.0 Flash as "our highly efficient workhorse model for developers with low latency and enhanced performance" and said it shipped with higher rate limits, stronger performance, and simplified pricing across Google AI Studio, Vertex AI, and the Gemini app ^[3]^[7]. The general-availability pricing was set at $0.10 per million input tokens and $0.40 per million output tokens, replacing the earlier split between short and long context tiers ^[7]^[12]. The table below summarizes the launch pricing for the 2.0 Flash tiers.

Model	Status on Feb 5, 2025	Input ($/M tokens)	Output ($/M tokens)	Context window
Gemini 2.0 Flash	Generally available	$0.10	$0.40	1M tokens
Gemini 2.0 Flash-Lite	Public preview	$0.075	$0.30	1M tokens
Gemini 2.0 Pro Experimental	Experimental	not priced at launch	not priced at launch	2M tokens

What powers Project Astra and Project Mariner?

Google presented Gemini 2.0 Flash as the engine behind several research prototypes shown at the December 2024 launch. Project Astra is a research effort toward a universal AI assistant that uses the model's real-time multimodal understanding through a phone's camera and microphone ^[1]. Project Mariner is an experimental agent that operates inside the Chrome browser to complete tasks on a user's behalf; Google reported that it reached a state-of-the-art result of 83.5% on the WebVoyager benchmark, which evaluates agents on end-to-end real-world web tasks ^[1]. A coding agent prototype, Jules, was also introduced for developers ^[1].

Variants and successors

The Gemini 2.0 family grew over the following weeks. On February 5, 2025, Google released Gemini 2.0 Flash-Lite into public preview as its most cost-efficient model, offering better quality than 1.5 Flash at the same speed and cost, with the same 1 million token context window ^[3]^[7]. Flash-Lite then reached general availability for production use in Google AI Studio and Vertex AI on February 25, 2025 ^[15]. The February 5 announcement also brought Gemini 2.0 Pro Experimental, aimed at coding and complex prompts, with a 2 million token context window and tool calling for Google Search and code execution ^[3]^[7].

The next generation arrived with Gemini 2.5. Google released Gemini 2.5 Pro Experimental on March 25, 2025, a model that built reasoning into the architecture while retaining native multimodality, and the line later became the recommended migration target for 2.0 users ^[11]. Google subsequently scheduled the gemini-2.0-flash and gemini-2.0-flash-lite identifiers for shutdown on June 1, 2026, directing developers toward Gemini 2.5 Pro and the 2.5 Flash tiers ^[13].

References

Google. "Introducing Gemini 2.0: our new AI model for the agentic era." The Keyword (Google blog), December 11, 2024. blog.google ↩
Google for Developers. "The next chapter of the Gemini era for developers." Google Developers Blog, December 11, 2024. developers.googleblog.com ↩
Google. "Gemini 2.0 is now available to everyone." The Keyword (Google blog), February 5, 2025. blog.google ↩
Abner Li. "Gemini 2.0 Flash announced with agentic focus, coming to app." 9to5Google, December 11, 2024. 9to5google.com ↩
Google for Developers. "Experiment with Gemini 2.0 Flash native image generation." Google Developers Blog, March 12, 2025. developers.googleblog.com ↩
Abner Li. "You can now test Gemini 2.0 Flash's native image output." 9to5Google, March 12, 2025. 9to5google.com ↩
Google for Developers. "Gemini 2.0: Flash, Flash-Lite and Pro." Google Developers Blog, February 5, 2025. developers.googleblog.com ↩
DeepLearning.AI. "Google Introduces Gemini 2.0 Flash, a Faster, More Capable AI Model." The Batch. deeplearning.ai ↩
Analytics Vidhya. "Gemini 2.0 by Google is Here: Faster and Smarter than Ever Before." December 2024. analyticsvidhya.com ↩
Kyle Wiggers. "Google releases its own 'reasoning' AI model." TechCrunch, December 19, 2024. techcrunch.com ↩
Wikipedia contributors. "Gemini (language model)." Wikipedia. en.wikipedia.org) ↩
Helicone. "Gemini 2.0 Flash Explained: Building More Reliable Applications." helicone.ai ↩
Google. "Gemini API deprecations." Gemini API documentation. ai.google.dev ↩
Jeff Dean. "Introducing Gemini 2.0 Flash Thinking, an experimental model that explicitly shows its thoughts." X (formerly Twitter), December 19, 2024. x.com ↩
Google for Developers. "Start building with Gemini 2.0 Flash and Flash-Lite." Google Developers Blog, February 25, 2025. developers.googleblog.com ↩

Improve this article

Add missing citations, update stale details, or suggest a clearer explanation. Every suggestion is reviewed for sourcing before it goes live.

1 revision by 1 contributors · full history

Suggest edit

What links here

AI Model Release Timeline (2022-2026)AI co-scientist (Google)ERQA Gemini 1.5 Flash Gemini 1.5 Pro Gemini 2.0 Flash Thinking Gemini Diffusion Gemini Live Project Astra

Background

What can Gemini 2.0 Flash do?

Native multimodal output

Native tool use

How large is the context window?

How does Gemini 2.0 Flash perform on benchmarks?

When was Gemini 2.0 Flash released, and how much does it cost?

What powers Project Astra and Project Mariner?

Variants and successors

References

Improve this article

Related Articles

Gemini 3

Gemma 3

Gemini 2.5 Flash

Gemini 3.5 Flash

Gemini 3.1 Pro

Gemini 1.5 Pro

What links here

Related Articles

Gemini 3

Gemma 3

Gemini 2.5 Flash

Gemini 3.5 Flash

Gemini 3.1 Pro

Gemini 1.5 Pro

What links here