GPT Image 2

Updated Jun 28, 2026

GPT Image 2 (API model gpt-image-2, marketed inside the product as ChatGPT Images 2.0) is an image-generation model released by OpenAI on April 21, 2026. It is OpenAI's first image model that "thinks" before it draws, running a reasoning pass to plan layout, spatial relationships, and text placement before any pixels are generated. It can also search the web mid-generation, produce up to eight images from a single prompt, and double-check its own output. ^[1]^[2]^[3] The model succeeds GPT Image 1 (March 2025) and the interim GPT Image 1.5 (December 2025), and within hours of release it took the top spot across every category of the independent Image Arena leaderboard by the largest margin the leaderboard had ever recorded. ^[4]^[5] The production API snapshot is identified as gpt-image-2-2026-04-21. ^[2]^[3]

The launch arrived in the middle of a competitive sprint. OpenAI had declared an internal "code red" in early December 2025 after Google's Gemini 3 and the viral Nano Banana image models cut into ChatGPT's lead, and CEO Sam Altman had named image generation a priority. ^[6] GPT Image 1.5 was the first counterpunch; GPT Image 2 was the heavier one, and it landed against Google's Nano Banana 2 as the model to beat. ^[4]^[7]

Key facts

Attribute	Detail
Developer	OpenAI
Model name	`gpt-image-2` (API snapshot `gpt-image-2-2026-04-21`; product brand: ChatGPT Images 2.0)
Released	April 21, 2026; broad ChatGPT and Codex rollout April 22 ^[1]^[2]
Predecessor	GPT Image 1.5 (`gpt-image-1.5`, Dec 2025); earlier GPT Image 1
Type	Native multimodal image generation with a reasoning step
Max resolution	Up to 2K in ChatGPT; higher resolutions (reported up to 4K) offered in beta via the API ^[2]^[3]
Aspect ratios	Several presets from ultra-wide 3:1 to ultra-tall 1:3 ^[1]^[3]
Modes	Instant (fast) and Thinking (reasoning, web search) ^[3]
Availability	All ChatGPT plans, Codex, and the API ^[1]^[2]
Knowledge cutoff	December 2025 ^[2]

What is GPT Image 2?

GPT Image 2 is a native multimodal generator: the same model that understands the text prompt also renders the picture, rather than handing the request off to a separate system. That design carries over from GPT Image 1 and 1.5. What changed is the addition of a reasoning step. Before generating, the model interprets what the user actually wants, considers spatial relationships, plans where text should sit, and works through the visual logic of the scene. ^[1]^[8] According to MacRumors, the model "has an improved sense of composition and visual taste, which OpenAI says will result in images that feel less AI-generated." ^[9]

OpenAI declined to confirm the underlying architecture. It would not say whether gpt-image-2 is diffusion-based, autoregressive, or a hybrid, describing it instead as a new generalist model and, informally, as a "GPT for images." Several outlets noted the shift away from the earlier two-stage inference pipeline toward a single-pass generator, but the company has not published a technical paper or system card with architecture details as of this writing. ^[2]^[8]

How does GPT Image 2 differ from gpt-image-1 and DALL-E?

The lineage runs from a separate diffusion service (DALL-E) to a native pipeline and now to a reasoning pipeline. The original DALL-E models, including DALL-E 3, worked as a separate diffusion service that ChatGPT called when a user asked for a picture. GPT Image 1 collapsed that separation by building generation directly into the GPT-4o model, so images and text shared one context. GPT Image 1.5 moved the same idea onto the GPT-5 model and made edits up to four times faster. GPT Image 2 keeps the native design and adds the thinking step plus web search. ^[3]^[9]

Model	Released	Key change
DALL-E 3	2023	Separate diffusion service invoked by ChatGPT
GPT Image 1	Mar/Apr 2025	Generation built natively into GPT-4o
GPT Image 1.5	Dec 2025	GPT-5 backbone, much faster editing
GPT Image 2	Apr 2026	Reasoning before generation, web search, up to 2K

The most visible practical gap is text. DALL-E 3 routinely misspelled words in images. A widely shared comparison used a Mexican restaurant menu: the older model produced items like "enchuita" and "churiros," while GPT Image 2 rendered the menu cleanly. ^[2] The other big difference is agency. Where earlier models took a prompt and produced an image in one shot, GPT Image 2 can pause to research, batch several candidates, and verify the result against the prompt before returning it. ^[3]^[8]

What can GPT Image 2 do?

Reasoning is the headline feature. In Thinking mode the model spends more time planning, which OpenAI says yields better instruction-following, more accurate object placement, and steadier composition than its predecessors. The same mode unlocks web search, so the model can pull real-time information into a generation, for example current facts for an infographic. ^[1]^[3]^[8]

Text rendering is the most cited concrete improvement. GPT Image 2 reads the prompt, plans the layout, and renders legible type across non-Latin scripts including Japanese, Korean, Chinese, Hindi, and Bengali, without the warped letters and invented glyphs that affected earlier generators. Coverage citing OpenAI's figures reported roughly 99 percent character-level accuracy across Latin, CJK, Hindi, and Bengali scripts. ^[1]^[2]^[7] Reviewers reported that it handles dense, text-heavy outputs such as full infographics, slides, maps, and multi-panel comics with surprising accuracy. ^[10] The model still has limits: very long blocks of text can break down after a few hundred characters, and fine details like exact logo geometry remain unreliable. ^[3]

On resolution and layout, ChatGPT outputs go up to 2K across several aspect ratios from 3:1 to 1:3, with the API offering higher resolutions in beta. ^[2]^[3] The model is built for editing as well as generation, handling small text, icons, UI elements, and subtle stylistic instructions, and it can produce up to eight images from one prompt. ^[1]^[3]
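The limits described above (aspect ratios between 3:1 and 1:3, and up to eight images per prompt) can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: `check_request` is a hypothetical helper, not part of any OpenAI SDK, and it tests only the outer bounds, whereas the service is reported to accept specific presets that are not listed here.

```python
from fractions import Fraction

# Limits taken from the description above; the exact presets are not listed here.
MAX_IMAGES_PER_PROMPT = 8      # "up to eight images from one prompt"
WIDEST = Fraction(3, 1)        # ultra-wide 3:1
TALLEST = Fraction(1, 3)       # ultra-tall 1:3


def check_request(width, height, n=1):
    """Return a list of problems with a planned request (empty if none).

    Hypothetical helper: width and height are pixel counts (integers),
    n is the number of images requested from a single prompt.
    """
    if width <= 0 or height <= 0:
        return ["width and height must be positive"]

    problems = []
    ratio = Fraction(width, height)  # reduced automatically, e.g. 4096/1024 -> 4/1
    if not (TALLEST <= ratio <= WIDEST):
        problems.append(
            f"aspect ratio {ratio.numerator}:{ratio.denominator} is outside 1:3 to 3:1"
        )
    if not (1 <= n <= MAX_IMAGES_PER_PROMPT):
        problems.append(f"n={n} is outside 1 to {MAX_IMAGES_PER_PROMPT}")
    return problems
```

For example, `check_request(3072, 1024, n=8)` returns an empty list, while `check_request(4096, 1024, n=9)` reports both the 4:1 ratio and the batch size.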

Where is GPT Image 2 available, and how much does it cost?

ChatGPT Images 2.0 rolled out to all ChatGPT and Codex users beginning April 22, 2026, with API access opening at the same time. ^[1]^[2] Instant mode is free for everyone. Thinking mode, with its extended reasoning and web search, is reserved for paid subscribers on the Plus, Pro, Business, and Enterprise plans. ^[2]^[3]

In the API, gpt-image-2 uses token-based pricing that scales with quality and resolution. Reported figures put image input at about $8 per million tokens, cached image input at about $2 per million tokens, and image output in the region of $30 per million tokens. A standard 1024 by 1024 image works out to roughly $0.21 at high quality, about $0.05 at medium quality, and about $0.006 at low quality. ^[3] The Batch API halves the token rates in exchange for up to 24-hour latency, and reused text inputs that hit the cache are billed at a lower rate. ^[3] At larger sizes the per-image cost can fall below the equivalent on GPT Image 1.5, while at the standard 1024 by 1024 high-quality setting it runs somewhat higher. ^[3]
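The reported per-token rates can be turned into a rough per-request estimate. The sketch below is illustrative only and not an official pricing calculator: the rates are the reported figures from this section, the function names are hypothetical, and the number of tokens an image actually consumes depends on quality and resolution settings that are not specified here.

```python
# Reported rates in USD per one million tokens (approximate figures from above).
RATE_IMAGE_INPUT = 8.00
RATE_CACHED_IMAGE_INPUT = 2.00
RATE_IMAGE_OUTPUT = 30.00


def estimate_cost_usd(output_tokens, input_tokens=0, cached_input_tokens=0, batch=False):
    """Rough cost of one request from its token counts (hypothetical helper)."""
    cost = (
        input_tokens * RATE_IMAGE_INPUT
        + cached_input_tokens * RATE_CACHED_IMAGE_INPUT
        + output_tokens * RATE_IMAGE_OUTPUT
    ) / 1_000_000
    # The Batch API is reported to halve the token rates.
    return cost / 2 if batch else cost


def implied_output_tokens(price_usd):
    """Output tokens implied by a per-image price at the reported output rate."""
    return price_usd / RATE_IMAGE_OUTPUT * 1_000_000
```

At the reported output rate, the $0.21 high-quality figure corresponds to about 7,000 output tokens (`implied_output_tokens(0.21)`), and the same request through the Batch API would come to about $0.105 (`estimate_cost_usd(7000, batch=True)`).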

How does GPT Image 2 rank on benchmarks?

The reception was driven by the Image Arena leaderboard run by Arena (formerly LMArena), which ranks models by blind human preference votes. Within about 12 hours of release, GPT Image 2 took first place in all three Arena categories: Text-to-Image, Single-Image Edit, and Multi-Image Edit. In Text-to-Image it scored 1512, against roughly 1271 for the runner-up, Nano Banana 2. The lead was reported as about 242 Elo points (the rounded scores quoted here differ by 241), which Arena called the largest gap it had ever seen. ^[4]^[5]^[7] Coverage citing Arena's data put the model's blind-comparison win rate near 93 percent. ^[7]
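To put the gap in perspective, a rating difference can be converted into an expected head-to-head win rate. The sketch below uses the standard Elo logistic formula with a 400-point scale. That Arena's scores follow exactly this scale is an assumption, and the function name is illustrative.

```python
def expected_win_rate(rating_a, rating_b, scale=400.0):
    """Expected probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Text-to-Image scores as reported above (the runner-up figure is approximate).
p = expected_win_rate(1512, 1271)
print(f"{p:.2f}")  # prints 0.80
```

Under that assumption, the gap implies that GPT Image 2 would be preferred over the runner-up in roughly four of five blind comparisons. This is a head-to-head figure against one opponent, so it may not be directly comparable to the reported blind-comparison win rate, which could be measured across a different set of opponents.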

The picture on Artificial Analysis, which runs its own image arena, is more measured. Later snapshots there placed a gpt-image-2 variant well inside the top tier but not as a runaway leader, a reminder that different leaderboards use different prompts, voters, and quality settings. ^[11] For context, GPT Image 1.5 had also reached the top of the Arena in December 2025 yet never caught on with consumers, so a high ranking did not by itself guarantee adoption. ^[7]

How was GPT Image 2 received?

Press coverage was broadly positive and focused on two things: the thinking step and text. TechCrunch led with how good the model is at generating text, the weakness that had defined AI image tools. ^[2] The New Stack and The Decoder framed the launch around the idea that OpenAI now reasons before it draws, putting gpt-image-2 in the same conceptual lane as Google's Nano Banana Pro, which had popularized clean in-image text months earlier. ^[8]^[12] VentureBeat described near-flawless multilingual text, infographics, slides, maps, and even manga. ^[10]

The competitive subtext was hard to miss. Reporting tied GPT Image 2 directly to the December "code red" and the race with Google, casting it as the product that finally let OpenAI reclaim attention in image generation after Nano Banana's viral run. ^[6]^[7] OpenAI followed the launch by retiring its older DALL-E 2 and DALL-E 3 models on May 12, 2026, closing the book on the diffusion-service era that GPT Image 2 had superseded. ^[7]^[13]

References

1. OpenAI. "Introducing ChatGPT Images 2.0." https://openai.com/index/introducing-chatgpt-images-2-0/
2. Maxwell Zeff. "ChatGPT's new Images 2.0 model is surprisingly good at generating text." TechCrunch, April 21, 2026. https://techcrunch.com/2026/04/21/chatgpts-new-images-2-0-model-is-surprisingly-good-at-generating-text/
3. Michael Nuñez. "ChatGPT Images 2.0: Full Developer Breakdown (2026)." Build Fast with AI. https://www.buildfastwithai.com/blogs/chatgpt-images-2-0-gpt-image-2-2026
4. Arena (@arena). "GPT-Image-2 by @OpenAI has claimed the #1 spot across all Image Arena leaderboards." X, April 21, 2026. https://x.com/arena/status/2046670703311884548
5. Arena. "Text-to-Image Leaderboard." https://arena.ai/leaderboard/text-to-image
6. Maxwell Zeff. "OpenAI continues on its 'code red' warpath with new image generation model." TechCrunch, December 16, 2025. https://techcrunch.com/2025/12/16/openai-continues-on-its-code-red-warpath-with-new-image-generation-model/
7. "GPT Image 2 Dominates Rankings and Leads Google in Comeback, 5 Months After Altman's 'Red Alert'." 36Kr, 2026. https://eu.36kr.com/en/p/3784950967376900
8. "With the launch of ChatGPT Images 2.0, OpenAI now 'thinks' before it draws." The New Stack, April 2026. https://thenewstack.io/chatgpt-images-20-openai/
9. "OpenAI Launches ChatGPT Images 2.0 With Thinking Capabilities and Better Text Rendering." MacRumors, April 22, 2026. https://www.macrumors.com/2026/04/22/openai-chatgpt-images-2-0/
10. "OpenAI's ChatGPT Images 2.0 is here and it does multilingual text, full infographics, slides, maps, even manga." VentureBeat, April 2026. https://venturebeat.com/technology/openais-chatgpt-images-2-0-is-here-and-it-does-multilingual-text-full-infographics-slides-maps-even-manga-seemingly-flawlessly
11. Artificial Analysis. "GPT Image 2 (high): Model Elo, Generation Time & Price Analysis." https://artificialanalysis.ai/image/models/gpt-image-2
12. "OpenAI's ChatGPT Images 2.0 thinks before it generates, adding reasoning and web search to image creation." The Decoder, April 2026. https://the-decoder.com/openais-chatgpt-images-2-0-thinks-before-it-generates-adding-reasoning-and-web-search-to-image-creation/
13. "DALL-E Is Dead: OpenAI Retires Its Image Models on May 12." Genra.ai, 2026. https://genra.ai/blog/dall-e-retired-may-2026-what-replaces-it
