Nano Banana
Last reviewed
May 16, 2026
Sources
17 citations
Review status
Source-backed
Revision
v1 · 3,648 words
Nano Banana is the popular nickname for Google's native image generation and editing model family, developed by Google DeepMind and integrated into the Gemini ecosystem. The name first appeared in August 2025, when an anonymous model labeled nano-banana started winning blind comparisons on the LMArena leaderboard for image editing. Google revealed the model's identity on August 26, 2025, when it launched Gemini 2.5 Flash Image, the production version of the system that had been generating buzz in the community for roughly two weeks. The nickname stuck, and Google eventually adopted it as the official branding for the consumer product line.
On November 20, 2025, Google released an upgraded successor branded as Nano Banana Pro, built on the larger Gemini 3 Pro reasoning model. The Pro variant added higher-resolution output (up to 4K), improved multilingual text rendering, support for blending up to 14 reference images into a single composition, and tighter character consistency across up to five subjects. Both models include invisible SynthID watermarks on every output, and the Pro model adds a visible Gemini sparkle watermark on free and mid-tier outputs.
The Nano Banana family is positioned as Google's direct competitor to GPT Image 1, FLUX.2, and Ideogram 3.0 in the native multimodal image generation space, and it sits alongside Google's standalone text-to-image model Imagen 4 in the company's overall image stack. While Imagen specializes in pure generation tasks, Nano Banana is built into the Gemini conversational interface and is optimized for iterative editing, character consistency, and image-grounded reasoning.
| Attribute | Nano Banana (original) | Nano Banana Pro |
|---|---|---|
| Official model name | Gemini 2.5 Flash Image | Gemini 3 Pro Image |
| Developer | Google DeepMind | Google DeepMind |
| Public release | August 26, 2025 | November 20, 2025 |
| LMArena debut | August 12, 2025 (anonymous) | Not applicable |
| Underlying foundation | Gemini 2.5 Flash | Gemini 3 Pro |
| Maximum resolution | Standard (about 1K) | 1K, 2K, or 4K |
| Character consistency | Yes | Up to 5 characters |
| Multi-image input | Yes | Up to 14 input images per composition |
| Watermarking | SynthID (invisible) | SynthID plus visible Gemini sparkle on free and Pro tiers |
| Available aspect ratios | 1:1, 16:9, 9:16, others | 16:9, 4:3, 5:3, 1.85:1, 2.39:1, 2.75:1, 4:1, 9:16, 1:1 |
| API price per image (standard) | $0.039 | $0.134 (lower resolutions) |
| API price per image (4K) | Not applicable | $0.24 |
| API access | Gemini API, Google AI Studio, Vertex AI | Gemini API, Google AI Studio, Vertex AI |
Google had been building image generation capability into the Gemini product line throughout 2024 and early 2025, but the work was split across separate components. The Gemini app could call out to Imagen for text-to-image requests, and Gemini's vision tower could describe or analyze input images, but the two paths were not unified. That meant editing a generated image often required regenerating the whole picture from a new prompt, and the model could not reliably keep a character looking the same across multiple turns of a conversation.
In parallel, OpenAI shipped GPT Image 1 inside ChatGPT on March 25, 2025. The launch triggered the Studio Ghibli style transfer wave that swept social media for several weeks and pushed OpenAI's infrastructure to its limits. GPT Image 1's selling point was native multimodality: the same transformer that handled text also generated and edited images, which meant edits preserved context far better than pipeline-based systems. The bar for image generation shifted almost overnight, and Google's response had to clear it.
On or around August 12, 2025, a new entrant called nano-banana showed up in LMArena's Image Edit Arena. LMArena runs blind A/B tests where users vote on which of two anonymized models produced a better result for a given prompt. The new model started winning at an unusual rate, particularly on edits that required preserving the identity of a subject, like changing the background behind a person without altering their face or pose.
Within two weeks, the anonymous model attracted more than 5 million total votes on LMArena, a record at the time. Traffic to the arena reportedly increased tenfold during that window, and the site's CTO said monthly active users crossed 3 million. Speculation about the model's origin spread quickly. Theories pointed to Google, Black Forest Labs, ByteDance, and several other groups, but no one confirmed anything until Google itself broke the silence.
The nickname has an unusual origin. According to people involved with the project, the codename traces back to nicknames given internally to Naina Raisinghani, a product manager at Google DeepMind who worked on the model. The team kept the codename for the public preview, and when the buzz outgrew the testing context, Google leaned into the branding rather than trying to replace it.
On August 26, 2025, Google publicly identified nano-banana as Gemini 2.5 Flash Image. The model launched the same day across the Gemini app, the Gemini API, Google AI Studio, and Vertex AI. LMArena posted the reveal on X with a banana emoji and confirmed that the anonymous model had been Gemini-2.5-Flash-Image-Preview by Google DeepMind.
Gemini 2.5 Flash Image was priced at $30 per million output tokens, with each generated image counting as 1,290 tokens. That worked out to roughly $0.039 per image at the API level, which was cheap enough for high-volume use cases like e-commerce catalog generation, social media variants, and product mockups. Developers could call the model through the Gemini API directly or through Vertex AI for enterprise integrations, and OpenRouter and fal.ai added it to their hosting catalogs within days.
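The per-image figure follows directly from the token pricing. A minimal sketch of that arithmetic, using only the numbers quoted above; the function and constant names are illustrative and are not part of any Google SDK:

```python
# Illustrative arithmetic only; names are hypothetical, not a Google API.
PRICE_PER_MILLION_OUTPUT_TOKENS_USD = 30.0  # output-token price quoted above
TOKENS_PER_IMAGE = 1290                     # each generated image counts as 1,290 tokens


def cost_per_image(tokens_per_image: int, price_per_million_usd: float) -> float:
    """Return the API cost in USD for one generated image."""
    return tokens_per_image * price_per_million_usd / 1_000_000


per_image = cost_per_image(TOKENS_PER_IMAGE, PRICE_PER_MILLION_OUTPUT_TOKENS_USD)
print(f"${per_image:.4f} per image")                    # $0.0387 per image
print(f"${per_image * 10_000:.2f} per 10,000 images")   # $387.00 per 10,000 images
```

Rounded to three decimal places, the result is the $0.039 figure cited for the original model.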
The technical pitch for Gemini 2.5 Flash Image had four core features. It supported character consistency across turns, so the same person or object could appear in a new pose, outfit, or setting without losing their identity. It blended multiple input images into a single composition through prompts. It accepted targeted edit instructions in natural language, like "remove the trash can on the left" or "change the lighting to golden hour." And it inherited Gemini's world knowledge, which let it reason about what was actually in an image when deciding how to modify it.
Google announced Gemini 3 Pro on November 18, 2025, as the next generation of its flagship reasoning model. Two days later, on November 20, the company released Nano Banana Pro, an image generation and editing system built directly on Gemini 3 Pro rather than on the lighter Flash model. The Pro version inherited the same naming convention as the original but routed image requests through a model with substantially more capacity for reasoning, planning, and language understanding.
The headline upgrade was resolution. Nano Banana Pro produces output at 1K, 2K, or 4K, where the original was limited to roughly 1K. The model also supports a wider catalog of aspect ratios, including 16:9, 4:3, 5:3, 1.85:1, 2.39:1, 2.75:1, 4:1, 9:16, and 1:1, which covers most common formats for film, advertising, social media, and print.
Character consistency expanded from a single subject focus to maintaining up to five distinct people across a scene, with each able to appear from different angles and distances while remaining recognizable. Composition capacity grew to 14 input images per workflow, which allows users to assemble complex scenes by referencing many separate source pictures. Google demonstrated this on launch with infographics that combined typography, photographic reference, and diagrammatic layout into a single output.
Text rendering improved significantly. Google claimed that Nano Banana Pro hit the lowest error rate of any model it benchmarked for single-line text rendering, mostly under 10 percent across multiple languages. The model can render long captions, paragraphs, calligraphy, and varied typography directly inside generated images, which makes it usable for posters, social media graphics, and translation-localized creative assets without a separate text overlay pass.
The most important architectural change was that Nano Banana Pro inherits Gemini 3 Pro's reasoning capacity and its access to Google Search. That means a prompt like "create a chart of the top ten countries by population in 2025" can pull current data through Search, structure it, and render the chart with correct numbers and country names. Earlier image models could produce something that looked like a chart but contained hallucinated values. Nano Banana Pro can actually ground the content in retrievable facts.
This also extends to infographics and explainer diagrams, where the model can reason about the subject before composing the layout. Google highlighted this as a contrast with diffusion-based competitors that treat the entire image as a single denoising problem rather than as a structured composition.
| Feature | Nano Banana (original) | Nano Banana Pro |
|---|---|---|
| Text-to-image | Yes | Yes |
| Image-to-image editing | Yes | Yes |
| Natural-language local edits | Yes | Yes, with finer control |
| Character consistency | Single subject | Up to 5 subjects |
| Multi-image fusion | Yes | Up to 14 input images |
| Camera angle and lighting controls | Limited | Wide-angle, panoramic, close-up, depth of field, day-to-night |
| Color grading | Basic | Full color palette manipulation |
| Native text rendering | Limited | Long captions, multiple languages, varied typography |
| Search-grounded generation | No | Yes, via Gemini 3 Pro and Google Search |
| Maximum resolution | About 1K | 4K |
| Aspect ratios | 1:1, 16:9, 9:16, several others | 9 ratios including cinematic formats |
| Watermark | SynthID (invisible) | SynthID plus visible Gemini sparkle on free and Pro tiers (removed on Ultra) |
Both models accept text prompts and reference images as input. Editing is conversational, so a user can ask for one change, then another, then another, and the model carries the context forward without losing the subject. Targeted edits include object removal, pose change, background swap, lighting shift, color grade, and style transfer. The Pro model's lighting controls extend to specific directional sources, day-to-night transitions, and depth of field adjustments that approximate real camera settings.
The documentation acknowledges some limitations. Masked editing, major lighting changes, and complex multi-image blends can occasionally produce artifacts, particularly when many constraints stack on a single prompt. The model tends to perform best when given a clear primary subject and one or two reference inputs rather than 14.
| Surface | Original Nano Banana | Nano Banana Pro |
|---|---|---|
| Gemini app (free) | Yes, with daily quota | Limited free quota |
| Google AI Plus, Pro, Ultra | Yes, higher quotas | Yes, higher quotas |
| Search AI Mode | Yes (Create Images) | Yes ("Create Images Pro" with Thinking 3 Pro) |
| NotebookLM | Yes | Yes, globally for all users |
| Workspace (Slides, Vids) | Limited | Yes, with "Help me visualize" and "Beautify this slide" |
| Flow (filmmaking) | Limited | All paid plans |
| Mixboard | No | Yes |
| Google Ads | Yes | Yes, globally |
| Gemini API | $0.039 per image | $0.134 to $0.24 per image |
| Vertex AI | Yes | Yes, in paid preview |
| Google AI Studio | Free testing | Free testing |
| Antigravity, Firebase, Stitch | Yes | Yes |
For the API, Nano Banana Pro pricing operates on three components: text input tokens at $2 per million for prompts under 200K context, thinking output tokens at $12 per million, and image generation tokens that work out to roughly $0.134 per image at standard resolution and $0.24 per image at 4K. Batch API processing offers a 50 percent discount, which brings the standard-resolution cost down to roughly $0.067 per image. The original Nano Banana stays available alongside the Pro model for fast and inexpensive editing use cases where the higher resolution and reasoning capacity are not needed.
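As a rough illustration of how these figures combine, the sketch below estimates the image-output cost of a job from the approximate per-image prices above. It ignores text input and thinking tokens, and it assumes the batch discount applies uniformly to the per-image price; the function and constant names are hypothetical rather than part of the Gemini API.

```python
# Per-image prices are the approximate figures quoted above; all names here are
# hypothetical and not part of the Gemini API.
PRO_IMAGE_PRICE_USD = {"standard": 0.134, "4k": 0.24}
BATCH_DISCOUNT = 0.50  # Batch API discount stated above


def image_job_cost(n_images: int, resolution: str = "standard", batch: bool = False) -> float:
    """Estimate image-output cost in USD, ignoring text and thinking tokens."""
    unit = PRO_IMAGE_PRICE_USD[resolution]
    if batch:
        # Assumption: the discount applies uniformly to the per-image price.
        unit *= 1.0 - BATCH_DISCOUNT
    return n_images * unit


print(round(image_job_cost(1, batch=True), 3))   # 0.067
print(round(image_job_cost(1_000), 2))           # 134.0
print(round(image_job_cost(1_000, "4k"), 2))     # 240.0
```

A real bill would also include the prompt and thinking tokens, so these numbers are a lower bound on the cost of a job.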
Consumer tier handling differs from the API. Free Gemini users get a limited daily quota on the Pro model, after which requests fall back to the original Nano Banana. Google AI Plus and Pro subscribers get larger quotas on both models. Google AI Ultra subscribers get the highest quotas and, importantly, get the visible Gemini sparkle watermark removed from their output images, leaving only the invisible SynthID signal.
For enterprise, the model is available through Vertex AI with the same SynthID watermarking and additional features for content moderation, identity verification, and audit logging. Google Cloud announced enterprise availability for Nano Banana Pro alongside the consumer launch in November 2025.
Every image generated by either Nano Banana model carries an invisible SynthID watermark, the digital provenance signal developed by Google DeepMind. SynthID embeds a pattern in the pixel data that survives many common transformations like cropping, color correction, and JPEG compression, and can be detected by a verifier even if the watermark itself is not visible to a human.
The Gemini app includes a SynthID verifier that lets users upload an image and check whether it was generated by a Google model. Google has positioned SynthID as part of its broader response to concerns about misinformation, deepfakes, and synthetic media, and it is now standard across Google's generative output, including Imagen 4, Veo video generation, and the Gemini text-to-music tools.
For Nano Banana Pro, the watermarking has two layers. The invisible SynthID signal is on every output regardless of subscription tier. The visible Gemini sparkle watermark appears on outputs from free and Google AI Pro accounts, but Ultra subscribers can opt out of the visible mark. The invisible SynthID layer is never removed.
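The two-layer rule can be written as a small lookup. This is a sketch of the policy as described above, not Google's implementation; the tier and layer names are hypothetical labels, and the Google AI Plus tier is left out because the description above does not say how it is handled.

```python
from enum import Enum


class Tier(Enum):
    """Subscription tiers named in the text (labels are illustrative)."""
    FREE = "free"
    PRO = "google_ai_pro"
    ULTRA = "google_ai_ultra"


def watermark_layers(tier: Tier) -> list[str]:
    """Return the watermark layers applied to a Nano Banana Pro output."""
    layers = ["synthid_invisible"]           # present on every output, never removed
    if tier in (Tier.FREE, Tier.PRO):
        layers.append("gemini_sparkle_visible")  # visible mark on free and Pro accounts
    return layers


for tier in Tier:
    print(tier.name, watermark_layers(tier))
```

Whatever the tier, the returned list always contains the invisible SynthID layer, which mirrors the rule that only the visible mark is optional.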
This tiered approach has drawn some criticism. Removing the visible watermark on a paid tier makes it harder for casual viewers to tell a generated image apart from a real one at a glance, even if SynthID detection is still possible with the verifier. Critics have argued that the visible mark should be mandatory across all tiers for that reason. Google's position is that professional users producing client work need the option to deliver clean images, while the invisible SynthID layer preserves the underlying provenance signal.
| Model | Developer | Type | Resolution | Text rendering | Watermark | API price per image |
|---|---|---|---|---|---|---|
| Nano Banana Pro | Google DeepMind | Native multimodal, reasoning-grounded | Up to 4K | Strong, multilingual | SynthID plus visible sparkle | $0.134 to $0.24 |
| Nano Banana (original) | Google DeepMind | Native multimodal | About 1K | Limited | SynthID | $0.039 |
| GPT Image 1 | OpenAI | Native multimodal | Up to 1536x1024 | Good | C2PA Content Credentials | $0.04 to $0.19 |
| FLUX.2 | Black Forest Labs | Diffusion transformer | Up to 4K | Strong | Optional | Open weights and hosted tiers |
| Ideogram 3.0 | Ideogram | Diffusion | Up to 2K | Strongest in class on typography | Optional | Subscription tiers |
| Imagen 4 | Google DeepMind | Text-to-image | Up to 2K | Strong | SynthID | API pricing |
Direct comparisons published in late 2025 generally placed Nano Banana Pro at or near the top of the consumer-friendly category, especially for edits that required reasoning, text inside the image, or multi-image composition. The Verge noted in its coverage that Nano Banana Pro's ability to render legible text directly in the image makes it suitable for generating posters or invitations in multiple languages without a separate typography pass. PCWorld tested the model on an architecture diagram and found that the captions and structural layout were accurate, with Gemini's thinking mode flagging that the result was remarkably faithful to the prompt.
WIRED's hands-on review found rougher edges. A test of a "shirtless skier" prompt produced a body that looked like a fitness model with the user's face placed on top, suggesting the model still has trouble with certain identity-preserving photo edits when the request implies a strong style template. WIRED also noted that the model can mislabel objects in busy scenes, which is a recurring issue across all current image models.
Against GPT Image 1, the original Nano Banana (Gemini 2.5 Flash Image) was widely seen as more reliable at character consistency. GPT Image 1 launched first, in March 2025, and its viral moment was driven by stylistic creativity rather than identity preservation. When OpenAI's model was used to edit a specific person's portrait into a new setting, the face would often drift across iterations. Nano Banana made identity preservation its headline feature, and the Pro version strengthened it by extending consistency across multiple subjects.
FLUX.2 from Black Forest Labs remains a benchmark competitor with strong text rendering and an open-weights option that Nano Banana does not match. For users who need to run image generation on their own infrastructure or fine-tune the model on a private dataset, FLUX is generally the practical choice. Nano Banana is closed-source and is only available through Google's hosted endpoints.
Ideogram 3.0 remains the in-class leader for pure typography work, particularly stylized text, logos, and complex layouts that involve text as a primary design element. Nano Banana Pro narrowed that gap significantly with its long-paragraph rendering and multilingual support, but Ideogram still tends to win on typography-heavy creative briefs.
The initial LMArena buzz in August 2025 was the most extreme reaction any image model had received on that platform up to that point. Five million votes in two weeks, a tenfold traffic increase, and 3 million monthly active users for an arena that had been a niche research tool meant that the broader AI community noticed the model before Google had even confirmed its existence. By the time the reveal happened, the brand was already locked in.
The reveal-day coverage was generally positive. TechCrunch focused on the character consistency angle and the obvious framing as a direct counterpunch to GPT Image 1. Coverage in the developer press emphasized the price, which at $0.039 per image was meaningfully lower than competitors and made high-volume use cases viable.
Reception of Nano Banana Pro was more mixed. The headline capabilities were impressive: 4K output, multilingual long-text rendering, 14-image composition, and reasoning-grounded generation. The Verge ran a piece titled "Google's Nano Banana Pro generates excellent conspiracy fuel" that pointed at the obvious risk of a model that can produce convincing fake infographics, fake news layouts, and fake document scans. Critics worried that the combination of legible text inside images plus Search-grounded reasoning created a tool well-suited for misinformation, regardless of the SynthID watermarking.
Other reviewers noted that the upgrade in quality was real and visible. PCWorld, eesel AI, and Cybernews all ran comparative tests against the original Nano Banana, GPT Image 1, and FLUX.2, and generally found that Nano Banana Pro produced the cleanest results for text-heavy generation tasks and for edits that required preserving multiple subjects.
Within the broader Google ecosystem, the integration was the more important story. Nano Banana Pro shipped simultaneously across the Gemini app, NotebookLM, Workspace (Slides and Vids), Search AI Mode, Mixboard, Flow, Google Ads, and the developer surfaces (Gemini API, Vertex AI, AI Studio, Antigravity, Firebase, Stitch). That depth of distribution meant the model reached millions of users within days rather than waiting for individual product teams to wire it in over months. Adobe also announced an integration with Firefly and Photoshop on November 20, 2025, which gave Nano Banana Pro a direct path into professional creative workflows outside of Google's own surfaces.
The original Nano Banana stayed available alongside the Pro version for cheap and fast edits. That dual-track approach mirrors how Google handles its Gemini text models, where Flash and Pro variants coexist with different cost and capability trade-offs. For most consumer cases, the original Nano Banana is the daily driver; Pro is reserved for higher-stakes work that needs the resolution, the text, or the reasoning.