Gemini 2.0 Flash Thinking

Updated Jun 3, 2026

Gemini 2.0 Flash Thinking is an experimental reasoning model released by Google as part of the Gemini 2.0 family. First made available on December 19, 2024, it was Google's first model trained to generate an explicit "thinking" process before producing a final answer, a technique aimed at improving accuracy on multi-step problems in mathematics, science, and coding ^[1]^[2]. Built on top of Gemini 2.0 Flash, it was offered as a free experimental option in Google AI Studio and through the Gemini API, and it was widely viewed as Google's answer to OpenAI's o1 family of reasoning models ^[1]^[3]. The model later served as a precursor to the thinking capabilities that Google DeepMind built natively into the Gemini 2.5 generation ^[4].

Background

By late 2024, several AI labs were exploring "inference-time" or "test-time" scaling, in which a model spends additional computation reasoning through a problem before answering rather than producing a response in a single pass. OpenAI's o1, first previewed in September 2024, was the most prominent example. Google introduced Gemini 2.0 Flash Thinking on December 19, 2024, the same month it unveiled the broader Gemini 2.0 line, positioning it as its first entrant in this class of reasoning models ^[1]^[3]. Jeff Dean, Google's chief scientist, described it as a model "trained to use thoughts to strengthen its reasoning" ^[2].

The initial release carried the model identifier gemini-2.0-flash-thinking-exp-1219 and was clearly labeled experimental, with Google cautioning that it was an early version ^[2]^[5]. Because it was derived from Gemini 2.0 Flash, it inherited that model's multimodal input support, including the ability to accept images alongside text ^[5].

How it works (visible reasoning)

The defining feature of Gemini 2.0 Flash Thinking is that, given a prompt, it pauses to work through the problem, considers intermediate steps and alternatives, and "explains" its reasoning before summarizing what it judges to be the most accurate answer ^[1]^[2]. Rather than only returning a final response, it exposes the chain of intermediate reasoning, which Google argued made the process more transparent and helped the model catch its own mistakes ^[2]^[6].

This reasoning was surfaced to users in two ways. In Google AI Studio, selecting the model enabled a dedicated "Thoughts" panel that displayed how the model broke down and worked through a task ^[6]. Through the Gemini API, the thinking content was returned as the first element of the response content, ahead of the final answer, so developers could inspect it to understand how the model arrived at its conclusion ^[6].
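The API behavior described above can be illustrated with a minimal sketch. It assumes a response already decoded into a Python dict with the candidates, content, parts nesting, and it assumes, following the description above, that the first text part is the reasoning and any later parts are the answer. The function name split_thoughts and the example payload are hypothetical and hand-written, not real model output.

```python
from typing import Any


def split_thoughts(response: dict[str, Any]) -> tuple[str, str]:
    """Return (thoughts, answer) from a response-shaped dict.

    Assumption: the first text part is the model's reasoning and the
    remaining parts form the final answer.
    """
    parts = response["candidates"][0]["content"]["parts"]
    texts = [part.get("text", "") for part in parts]
    if len(texts) < 2:
        # A single part is treated as the answer with no visible thoughts.
        return "", "".join(texts)
    return texts[0], "".join(texts[1:])


# Hand-written illustration of the assumed shape, not real model output.
example_response = {
    "candidates": [
        {
            "content": {
                "role": "model",
                "parts": [
                    {"text": "Count the letters one at a time: s-t-r-a-w-b-e-r-r-y."},
                    {"text": "There are three R letters in 'strawberry'."},
                ],
            }
        }
    ]
}

thoughts, answer = split_thoughts(example_response)
print("Thoughts:", thoughts)
print("Answer:", answer)
```

Keeping the split in one small function means a caller can log or hide the reasoning without touching the code that consumes the answer.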

Because it generated additional reasoning tokens before answering, the model produced longer outputs and took more time to respond than the base Flash model, with latencies ranging from seconds to minutes depending on the prompt ^[1]. It was also still error-prone in its early form. Contemporary coverage noted it could fail simple tasks, for instance miscounting the number of "R" letters in the word "strawberry," and could occasionally emit slightly malformed output ^[1]^[5].

Versions and updates

Google shipped two main experimental versions. The original December release had a 32,000-token context window; the January update expanded it to one million tokens and added native code execution.

Version	Model ID	Released	Context window	Notable additions
Initial	gemini-2.0-flash-thinking-exp-1219	Dec 19, 2024	32,000 tokens	First public reasoning model from Google ^[1]^[5]^[7]
Updated	gemini-2.0-flash-thinking-exp-01-21	Jan 21, 2025	1,000,000 tokens	Native code execution; stronger benchmarks; fewer contradictions between reasoning and answer ^[7]^[8]^[9]

The January 21, 2025 update increased the context window from 32,000 tokens to one million tokens, allowing the model to ingest large inputs such as an entire codebase or a collection of research papers ^[7]^[8]. It added native code execution as a tool, letting the model write and run code during its reasoning, and Google reported reduced instances of the model contradicting itself between its intermediate reasoning and its final answer ^[7]^[9]. The experimental Flash Thinking models remained free to test in Google AI Studio and via the API throughout this period ^[3]^[8].
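To make the size of that change concrete, here is a small sketch that uses only the context window figures given above. The helper fits_context and the 250,000-token codebase are hypothetical, and the check treats the window as a limit on prompt tokens only, which is a simplifying assumption.

```python
# Context window sizes as stated in this article, in tokens.
CONTEXT_WINDOW_TOKENS = {
    "gemini-2.0-flash-thinking-exp-1219": 32_000,
    "gemini-2.0-flash-thinking-exp-01-21": 1_000_000,
}


def fits_context(model_id: str, prompt_tokens: int) -> bool:
    """Return True if a prompt of this size fits the model's window.

    Assumption: only prompt tokens count against the limit.
    """
    return prompt_tokens <= CONTEXT_WINDOW_TOKENS[model_id]


# Ratio between the January and December windows.
growth = (
    CONTEXT_WINDOW_TOKENS["gemini-2.0-flash-thinking-exp-01-21"]
    / CONTEXT_WINDOW_TOKENS["gemini-2.0-flash-thinking-exp-1219"]
)
print(growth)  # 31.25

# Hypothetical codebase size, chosen only for illustration.
codebase_tokens = 250_000
for model_id in CONTEXT_WINDOW_TOKENS:
    print(model_id, fits_context(model_id, codebase_tokens))
```

Under these assumptions the January window is a little over 31 times larger, and the hypothetical codebase fits only the January version.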

Performance

Google reported gains across math, science, and multimodal benchmarks between the December and January versions. The figures below reflect Google's reported results for the two experimental releases as cited in contemporary coverage.

Benchmark	Domain	Exp-1219 (Dec 2024)	Exp-01-21 (Jan 2025)
AIME 2024	Mathematics	about 70%	73.3% ^[8]^[9]
GPQA Diamond	Science	about 66%	74.2% ^[8]^[9]
MMMU	Multimodal reasoning	not reported here	75.4% ^[7]^[9]

Press coverage framed the model's appeal partly on cost and capacity. VentureBeat noted that the free, one-million-token model contrasted sharply with paid premium reasoning offerings, processing far more context than some competing reasoning products while keeping response times relatively fast ^[3]. The benchmark numbers above came from Google itself and should be read as self-reported.

Availability

Gemini 2.0 Flash Thinking was released as an experimental model rather than a generally available production one. It could be selected from the model dropdown in Google AI Studio and called through the Gemini API using identifiers such as gemini-2.0-flash-thinking-exp or the dated variants gemini-2.0-flash-thinking-exp-1219 and gemini-2.0-flash-thinking-exp-01-21 ^[6]^[8]. The experimental tier was free but rate limited, with the December version constrained to a small number of requests per minute and per day ^[10].
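As a rough illustration of how one of these identifiers would be addressed, the sketch below assembles a Gemini API generateContent request without sending it. The v1beta path, the x-goog-api-key header, and the helper name build_generate_request are assumptions made for this example rather than details taken from the article, and because the experimental models were later retired, such a request would not be expected to succeed today.

```python
import json

# Assumed REST base path for the Gemini API; treat as illustrative.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def build_generate_request(model_id: str, prompt: str, api_key: str) -> dict:
    """Assemble the URL, headers, and JSON body for a text-only request.

    Nothing is sent over the network; the caller decides how to use it.
    """
    return {
        "url": f"{BASE_URL}/models/{model_id}:generateContent",
        "headers": {
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,
        },
        "body": json.dumps({"contents": [{"parts": [{"text": prompt}]}]}),
    }


request = build_generate_request(
    model_id="gemini-2.0-flash-thinking-exp-01-21",
    prompt="How many R letters are in 'strawberry'?",
    api_key="YOUR_API_KEY",  # placeholder, not a real key
)
print(request["url"])
```

Because the model identifier is just a path segment, switching between the dated variants is a one-string change.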

Being experimental, it carried no stability guarantees. Google later folded thinking into its mainline lineup, and the standalone Flash Thinking experimental models were retired as the Gemini 2.5 generation and its successors took over reasoning duties ^[4]^[11].

Relationship to Gemini 2.5

Gemini 2.0 Flash Thinking was the direct ancestor of the reasoning approach that defined the next generation. When Google announced Gemini 2.5 Pro on March 25, 2025, it explicitly referred back to the earlier model, stating, "we recently introduced our first thinking model, Gemini 2.0 Flash Thinking" ^[4]. The key shift was that thinking moved from a separate experimental variant to a capability built into the models themselves. Google said it was "building these thinking capabilities directly into all of our models," and described Gemini 2.5 as a family of "thinking models" that reason through problems before responding by default ^[4].

Where Gemini 2.0 Flash Thinking was strongest in mathematics and coding, the Gemini 2.5 series generalized native thinking across domains and combined it with multimodal input and long context windows, while the underlying training recipe evolved from the original experiment ^[4]^[12]. The Gemini API documentation for thinking later centered on the Gemini 2.5 and 3 series as the production thinking models, with the 2.0 experimental version no longer the recommended path ^[11]. In this sense Gemini 2.0 Flash Thinking functioned as a proof of concept that validated visible, test-time reasoning for Google before the company made it a standard feature of its models.

Related Articles

Gemini 2.5 Deep Think

AlphaGeometry

AlphaProof

BIG-Bench Extra Hard

OpenAI o1

OpenAI o3
