Whisk (Google Labs)
Last reviewed: Jun 3, 2026
Sources: 12 citations
Review status: Source-backed
Revision: v1 · 1,379 words
Whisk was an experimental generative-AI tool from Google Labs that let people create images by feeding it other images rather than long written prompts. Launched in the United States on December 16, 2024, Whisk asked users to supply pictures for three roles (a Subject, a Scene, and a Style) and then remixed those inputs into a new image. Behind the scenes it paired Google's Gemini model, which wrote a detailed caption for each uploaded picture, with Imagen 3, which turned those captions into the final image. Google described the result as capturing the "essence" of an input rather than reproducing it exactly. The tool ran as a free Google Labs experiment for about sixteen months before its best features were folded into Google's unified Flow platform on April 30, 2026.[1][2][3]
Google introduced Whisk on December 16, 2024, as part of a broader set of media-generation announcements that also included an upgraded Imagen 3 image model and the Veo 2 video model.[2][4] The launch came roughly ten months after Google had paused the people-generation feature of Gemini's image tool following criticism that it produced historically inaccurate pictures, and reporters noted that Whisk's deliberately loose, "essence"-based approach to likeness sat against that backdrop.[5]
Google positioned Whisk as a creative-exploration toy rather than a precision photo editor. The company framed it for rapid visual brainstorming, suggesting outputs such as a digital plushie, an enamel pin, or a sticker, and emphasized that users could generate many variations quickly instead of crafting one carefully worded text prompt.[1][3] The work came out of Google Labs, the company's incubator for early-stage experiments, and drew on research from Google DeepMind, which builds the Imagen and Gemini model families.[1][2]
Whisk's distinguishing idea is that the prompt is itself made of images. A user uploads or generates up to three reference pictures, each assigned to a category:[1][3][4]
| Input | Role |
|---|---|
| Subject | The main object or character the image should feature |
| Scene | The background, setting, or environment |
| Style | The visual aesthetic or artistic treatment |
The pipeline then runs in two stages. First, Gemini, using its visual-understanding ability, automatically writes a detailed text caption describing each uploaded image. Those captions are then passed to Imagen 3, Google's then-latest text-to-image model, which generates the new composite image from the descriptions.[1][3] Because the system works from a generated description rather than from the pixels of the originals, it reproduces the general character of an input instead of an exact copy. Google warned that a generated subject might differ from its source in traits such as height, weight, hairstyle, or skin tone, and stressed that Whisk "captures your subject's essence, not an exact replica."[1][4]
Users were not locked out of text. Whisk let people view and edit the underlying captions Gemini had written, and add their own prompt text to steer the overall result or refine a single category.[2][4] This made the tool a hybrid: image-first by default, with optional text control for users who wanted it.
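The two-stage flow described above, including the optional caption edits and extra prompt text, can be illustrated with a small sketch. This is not Google's implementation and does not call any real Gemini or Imagen API: every name here (`compose`, `remix`, `stub_captioner`, `stub_generator`, the prompt layout) is a hypothetical stand-in chosen for illustration. The point it shows is structural: the image generator only ever receives text, which is why the output reflects the "essence" of an input rather than its pixels.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

ROLES = ("subject", "scene", "style")  # the three input roles described above

# Hypothetical stand-ins for the two models; real model calls would go here.
Captioner = Callable[[bytes], str]   # image bytes -> text caption
Generator = Callable[[str], bytes]   # text prompt -> image bytes


@dataclass
class ComposedPrompt:
    """Text-only prompt assembled from per-role captions (assumed layout)."""
    captions: Dict[str, str] = field(default_factory=dict)
    extra_text: str = ""

    def to_text(self) -> str:
        lines = [f"{role}: {self.captions[role]}"
                 for role in ROLES if role in self.captions]
        if self.extra_text:
            lines.append(f"notes: {self.extra_text}")
        return "\n".join(lines)


def compose(images: Dict[str, bytes], captioner: Captioner,
            caption_edits: Optional[Dict[str, str]] = None,
            extra_text: str = "") -> ComposedPrompt:
    unknown = set(images) - set(ROLES)
    if unknown:
        raise ValueError(f"unknown roles: {sorted(unknown)}")
    # Stage 1: each reference image becomes a caption; pixels go no further.
    captions = {role: captioner(img) for role, img in images.items()}
    # Optional user edits replace the generated caption for a role.
    captions.update(caption_edits or {})
    return ComposedPrompt(captions=captions, extra_text=extra_text)


def remix(images: Dict[str, bytes], captioner: Captioner, generator: Generator,
          caption_edits: Optional[Dict[str, str]] = None,
          extra_text: str = "") -> bytes:
    prompt = compose(images, captioner, caption_edits, extra_text)
    # Stage 2: the image model sees only the assembled text.
    return generator(prompt.to_text())


# Toy stubs so the sketch runs without any model access.
def stub_captioner(image: bytes) -> str:
    return f"a picture of {len(image)} bytes"


def stub_generator(prompt: str) -> bytes:
    return prompt.encode("utf-8")
```

Calling `remix({"subject": b"abc", "style": b"xy"}, stub_captioner, stub_generator, caption_edits={"style": "watercolor"})` sends the stub generator a prompt in which the style caption has been replaced by the user's text while the subject caption is left as generated, mirroring the per-category refinement described above.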
One caveat is worth noting on sourcing. TechCrunch's launch-day write-up described Whisk as combining the three inputs using "Imagen 3" and did not mention Gemini's captioning step.[4] Google's own announcement and subsequent reporting, however, consistently describe the two-model Gemini-plus-Imagen 3 pipeline, so the captioning stage is well attested.[1][2][3]
At launch on December 16, 2024, Whisk was available only to users in the United States, who could access it at labs.google/whisk.[1][2][4] It was offered at no charge as a Google Labs experiment rather than as a paid product.
On February 11, 2025, Google expanded Whisk to more than 100 additional countries.[6][7] The rollout, however, deliberately excluded several large markets. Reporting at the time named India, Indonesia, the European Union, and the United Kingdom as regions where Whisk remained unavailable, with the EU and UK absences widely attributed to data-protection and AI-regulation considerations.[6][7]
| Milestone | Date | Availability |
|---|---|---|
| Launch | December 16, 2024 | United States only |
| International expansion | February 11, 2025 | 100+ countries (excluding the EU, UK, India, Indonesia) |
| Whisk Animate added | April 15, 2025 | Google One AI Premium subscribers, 60+ countries |
| Moved into Flow | April 30, 2026 | Capabilities migrated to Google Flow |
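The "about sixteen months" figure for Whisk's lifespan follows from the first and last dates in the table. A minimal check, using only those two dates; the helper name `whole_months_between` is introduced here for illustration:

```python
from datetime import date

LAUNCH = date(2024, 12, 16)          # launch date from the table
FLOW_MIGRATION = date(2026, 4, 30)   # date capabilities moved into Flow


def whole_months_between(start: date, end: date) -> int:
    """Count completed calendar months from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month has not been completed yet
    return months


months = whole_months_between(LAUNCH, FLOW_MIGRATION)
days = (FLOW_MIGRATION - LAUNCH).days
print(f"{months} whole months ({days} days)")
```

By this count the experiment ran for 16 whole months plus two weeks, or 500 days.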
On April 15, 2025, Google added a feature called Whisk Animate, which extended the tool from still images into short video. Whisk Animate let users turn a Whisk image into an eight-second animated clip generated by Veo 2, Google's video model.[8][9] The capability was announced alongside Veo 2's arrival in the Gemini app, where the same model produced eight-second clips from text prompts.[8]
Unlike the base image tool, Whisk Animate sat behind a paywall. It was offered to Google One AI Premium subscribers (the tier also branded as Gemini Advanced) in more than 60 countries, and subscribers could use it at labs.google/whisk.[8][9] Reporting on the feature described an allotment of generated videos that refreshed each month and did not roll over.[9] As with other Veo 2 output, every clip carried Google's SynthID watermark, an imperceptible signal embedded in each frame to mark the video as AI-generated.[8]
Coverage of Whisk's debut was broadly curious rather than alarmed. Outlets including The Verge, TechCrunch, and CNN framed the tool as a novel twist on image generation, with The Verge characterizing it as an image generator that takes other images as prompts to suggest the subject, scene, and style.[4][5][10] Several writers highlighted its playful, low-friction design, noting that it was pitched as a creative toy for quick inspiration rather than a traditional image editor.[3][5]
The most frequently raised concern was fidelity to people's likenesses. Because the system rebuilds an image from a generated caption, observers pointed out that a person used as a Subject could come back looking noticeably different, and they read Whisk's "essence, not a replica" messaging as both a design choice and a hedge after Google's earlier, troubled rollout of people-generation in Gemini's image tool.[5] Some commentary also flagged broader questions about training data and the rights attached to AI-remixed images.[3]
In March 2026, Google announced that Whisk would be retired as a standalone experiment, with its best capabilities moving directly into Flow, the company's unified platform for AI image and video creation, on April 30, 2026.[11][12] Google told users to migrate their content manually before that date, warning that any media left in a Whisk library afterward would be permanently deleted; AI credits carried over automatically because Whisk and Flow shared the same credits system.[11][12] Because Flow was not available everywhere, users in unsupported regions lost access without a migration path.[11] The move closed Whisk's roughly sixteen-month run as a public Google Labs experiment, which had lasted from December 2024 to April 2026.[12]