AI Test Kitchen

AI Test Kitchen was a Google application for Android, iOS, and the web that let users interact with experimental generative AI models behind a controlled access program. Announced at Google I/O 2022 and first launched on Android in August 2022, the app was Google's primary public-facing surface for early demos of LaMDA and, later, MusicLM. It served as Google's attempt to gather user feedback on emerging AI capabilities in a controlled way before launching them as full products. The mobile apps were delisted in August 2023, after which the experience continued only on the web. AI Test Kitchen was effectively superseded by Bard, launched in March 2023, which was renamed to the Gemini app in February 2024.

AI Test Kitchen mattered less for what it shipped than for what it signalled. It was Google's first big consumer-facing generative AI release after years of internal research, and it tried to do the cautious, drip-feed thing right at the moment OpenAI was about to throw the doors open with ChatGPT. Whether that turned out well for Google is, depending on who you ask, either a textbook case of being lapped by a less careful competitor or an early sign that responsible AI rollouts are harder than they look.

Origin and announcement

Google CEO Sundar Pichai introduced AI Test Kitchen during the Google I/O 2022 keynote on May 11, 2022. The app was unveiled alongside LaMDA 2, the second generation of Google's Language Model for Dialogue Applications, which had been announced as a research project the previous year at I/O 2021. Pichai framed AI Test Kitchen as a way for users to "explore, get a feel for, and provide feedback on" emerging Google AI without the company having to commit those capabilities to a finished product first.

During the keynote, Pichai walked through two demos: one in which a user asked LaMDA 2 to describe deep-sea creatures in the Mariana Trench, and another in which the model broke down the steps for planting a vegetable garden into actionable subtasks. He was careful to set expectations: "These are not products," he said, "they are quick sketches that allow us to explore what models like LaMDA 2 can do."

The initial plan was for the app to roll out over "the coming months" in the US, starting with "select academics, researchers, and policymakers" before opening to a broader waitlist. That cautious phasing was deliberate, and events before the launch reinforced it. In June 2022, a month after the announcement, Google engineer Blake Lemoine publicly claimed that LaMDA had become sentient, citing its responses about self-identity, religion, and Isaac Asimov's Three Laws of Robotics. Google rejected the claim and fired Lemoine on July 22, 2022. The episode raised LaMDA's profile, but it also raised the bar for how carefully any consumer-facing demo had to be presented.

Launch timeline

Date	Event
May 11, 2022	AI Test Kitchen announced at Google I/O 2022 alongside LaMDA 2
June 11, 2022	Blake Lemoine publicly claims LaMDA is sentient; controversy raises scrutiny
July 22, 2022	Lemoine fired by Google
August 25, 2022	Android app launches in the US with three demos: Imagine It, List It, Talk About It
October 18, 2022	iOS app released in the US App Store
November 2, 2022	"Season 2" preview adds City Dreamer and Wobble, plus expansion to Australia, Canada, Kenya, New Zealand, and the UK
November 30, 2022	OpenAI launches ChatGPT
December 2022	Google reportedly issues internal "Code Red" in response to ChatGPT's growth
January 26, 2023	MusicLM research paper posted to arXiv (2301.11325)
February 6, 2023	Sundar Pichai announces Bard
March 21, 2023	Bard launches publicly with US/UK waitlist
May 10, 2023	Google Labs (labs.google) launches; MusicLM added to AI Test Kitchen as a public demo at Google I/O 2023
August 1, 2023	AI Test Kitchen mobile apps delisted from Google Play and the App Store; service moves to web only
December 6, 2023	Google announces the Gemini model family
February 8, 2024	Bard renamed to the Gemini app

Demos and features

The app rotated experimental demos rather than presenting one stable product. Each demo was tied to a specific research model or capability that Google wanted to test publicly. New demos appeared in batches that the company informally called "seasons."

Demo	Added	Underlying model	What it did
Imagine It	August 2022	LaMDA 2	User describes a place or scenario; the model returns immersive descriptions and follow-up questions about the imagined scene
List It (also "List It Out")	August 2022	LaMDA 2	User states a goal or activity; the model breaks it into a structured list of subtasks, which can be drilled down further
Talk About It (Dogs Edition)	August 2022	LaMDA 2	Open-ended conversation deliberately constrained to a single topic (initially dogs) to test the model's ability to stay on subject
City Dreamer	November 2022 (Season 2 preview)	Imagen-derived text-to-image research	User describes a city; the system generates imagined urban scenes
Wobble	November 2022 (Season 2 preview)	Text-to-image plus 2D-to-3D animation	User imagines a monster, which is then animated to dance using a research animation pipeline
MusicLM	May 10, 2023	MusicLM	Text-to-music generation; users describe a musical idea and receive two generated clips, then vote for the preferred one to provide preference data

The LaMDA-based demos shared a common philosophy. They were intentionally narrow, and they routed model output through safety filters that flagged sexually explicit, hateful, violent, illegal, and privacy-violating content. Google noted that response ratings ("nice, offensive, off topic, or untrue") were collected without being linked to users' Google accounts, so that the company could refine the model without building a personalised profile of each tester.
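The idea of collecting ratings without tying them to an account can be illustrated with a minimal sketch. Only the four rating labels come from the description above; the function names, the tally structure, and the choice to key counts by demo are assumptions made for illustration, not Google's actual pipeline.

```python
from collections import Counter

# The four labels quoted in the text; the rest of this schema is hypothetical.
RATING_LABELS = ("nice", "offensive", "off topic", "untrue")


def record_rating(tally, demo, label):
    """Count one response rating, keyed by demo and label only.

    No user or account identifier is accepted or stored (an assumption
    that mirrors the unlinked collection described in the text).
    """
    if label not in RATING_LABELS:
        raise ValueError(f"unknown rating label: {label!r}")
    tally[(demo, label)] += 1


def label_shares(tally, demo):
    """Return each label's share of all ratings recorded for one demo."""
    counts = {label: tally[(demo, label)] for label in RATING_LABELS}
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in RATING_LABELS}
    return {label: count / total for label, count in counts.items()}


tally = Counter()
for label in ("nice", "nice", "off topic", "untrue"):
    record_rating(tally, "List It", label)

shares = label_shares(tally, "List It")
```

Because the tally holds only (demo, label) pairs, the aggregate can show which demos draw "untrue" or "off topic" ratings without anything that could be assembled into a per-tester profile.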

The MusicLM demo was the most popular addition by some distance. Users typed a prompt such as "soulful jazz for a dinner party" or "industrial techno that is hypnotic" and received two generated audio clips. The interface borrowed a preference-collection pattern familiar from RLHF training: pick the better track. Google trained MusicLM on a corpus that excluded vocals to sidestep the hardest copyright issues, and the company published the MusicCaps dataset (5.5k expert-annotated clips) alongside the research.
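The "pick the better track" pattern amounts to recording pairwise preferences. The sketch below shows one way such votes could be stored and summarised; the `PreferenceVote` schema, the clip identifiers, and the win-rate summary are hypothetical and are not taken from MusicLM or AI Test Kitchen.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferenceVote:
    """One tester choice between two generated clips (hypothetical schema)."""
    prompt: str
    clip_a: str      # identifier of the first clip shown
    clip_b: str      # identifier of the second clip shown
    preferred: str   # must equal clip_a or clip_b

    def __post_init__(self):
        if self.preferred not in (self.clip_a, self.clip_b):
            raise ValueError("preferred must be one of the two clips shown")


def win_rates(votes):
    """Return, per clip, the fraction of its comparisons that it won."""
    wins = Counter()
    shown = Counter()
    for vote in votes:
        shown[vote.clip_a] += 1
        shown[vote.clip_b] += 1
        wins[vote.preferred] += 1
    return {clip: wins[clip] / shown[clip] for clip in shown}


prompt = "soulful jazz for a dinner party"
votes = [
    PreferenceVote(prompt, "clip-1", "clip-2", "clip-1"),
    PreferenceVote(prompt, "clip-1", "clip-2", "clip-1"),
    PreferenceVote(prompt, "clip-1", "clip-2", "clip-2"),
]
rates = win_rates(votes)
```

A real preference pipeline would more likely use records like these to train a reward or ranking model than to report raw win rates; the summary here is only meant to make the shape of the data concrete.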

Underlying models

AI Test Kitchen was a thin shell over a small set of research models. Knowing which model is doing the work in any given demo helps explain why the demos felt so different from each other.

LaMDA and LaMDA 2. LaMDA (Language Model for Dialogue Applications) was first announced in 2021 and described in detail in a January 2022 paper by Thoppilan and colleagues. It was a transformer-based model trained on dialogue and stories, with safety and groundedness tuning layered on top. LaMDA 2 was a refresh announced at I/O 2022. The Imagine It, List It, and Talk About It demos all ran on LaMDA 2.

Imagen. Imagen was Google Research's text-to-image diffusion model, originally introduced in May 2022. AI Test Kitchen's Season 2 demos (City Dreamer and Wobble) drew on Imagen-style image generation, though Google never gave testers an open Imagen prompt box of the kind that Stable Diffusion and DALL-E 2 offered at the time. The constraint was deliberate. Google was nervous about uncontrolled image generation, particularly around faces and copyrighted styles, and the structured demo format was the workaround.

MusicLM. The MusicLM model was described in an arXiv paper (2301.11325) by Andrea Agostinelli, Timo Denk and colleagues, posted on January 26, 2023. It cast music generation as a hierarchical sequence-to-sequence task and produced 24 kHz audio that remained coherent over several minutes. When MusicLM appeared in AI Test Kitchen on May 10, 2023, it generated short clips only; the training data excluded copyrighted material and the demo did not support vocals, both safety choices made before public release.

PaLM and successors. Some later AI Test Kitchen experiments drew on PaLM and PaLM 2 rather than LaMDA. By the time Bard was upgraded to PaLM in March 2023, it was clear LaMDA was being deprecated in favour of the PaLM family, which would in turn give way to Gemini in late 2023.

Purpose and approach

Google's framing for AI Test Kitchen was "responsible disclosure." The pitch ran something like this: generative AI models were powerful but unreliable, they sometimes produced offensive or false text, and shipping them as polished consumer products was premature. A controlled access app, with a waitlist and a feedback loop, would let Google gather real-world data, stress-test safety filters, and watch for failure modes before any wider release.

In practice, this meant a few specific design choices. Demos were narrow. Each one was scoped to a single task or a single topic, which kept failure modes predictable. Output ratings were collected at the response level. Safety filters caught a defined set of categories. Access required a Google account and agreement to a terms-of-service screen that explicitly warned about possible offensive output. The app was free and US-only at first, with selective international expansion in November 2022.

The contrast with OpenAI's ChatGPT release on November 30, 2022, was immediate and consequential. ChatGPT had no waitlist, no narrow demos, no rotating seasons. You signed in and you typed. Within five days it had a million users; within two months it had a hundred million. Google's careful staging suddenly looked like timidity, and AI Test Kitchen, which had been positioned as bold for its time, looked underpowered as a response.

Access and availability

During its lifetime, AI Test Kitchen was free to use but invite-only. Users registered interest on the AI Test Kitchen website, and Google rolled out access in waves. By November 2022, the app was available on Android and iOS in Australia, Canada, Kenya, New Zealand, the United Kingdom, and the United States, all in English. New testers had to read and acknowledge guidelines about what the model could and could not do, including disclaimers that responses might be inaccurate or offensive.

After the August 1, 2023 delisting of the mobile apps, the only way to use AI Test Kitchen was at aitestkitchen.withgoogle.com. By then, MusicLM was effectively the only active demo, and the LaMDA-based demos had been retired in favour of Bard, which was a far more capable open-ended chatbot built on the same family of models.

Reception

The initial coverage of AI Test Kitchen was cautiously positive. Tech press at TechCrunch, Engadget, The Verge, and 9to5Google framed it as a sensible, transparent way to let outsiders touch a real LLM without pretending it was a product. Some critics in early reviews noted the demos felt thin: List It produced fairly generic checklists, Talk About It refused to leave its assigned topic in ways that felt brittle, and Imagine It was charming but limited.

The sentiment changed quickly after ChatGPT. Once a free, general-purpose chatbot was available to anyone with an email address, AI Test Kitchen looked stale. The New York Times and other outlets reported that Google had declared an internal "Code Red" in December 2022, with founders Larry Page and Sergey Brin reportedly returning to executive meetings to discuss Google's response. Pichai later denied issuing a Code Red himself in a 2023 interview, but the company's pivot was unmistakable: by February 6, 2023, Bard was announced, and by March 21 it was in users' hands. AI Test Kitchen, which had been the flagship a few months earlier, was suddenly a sideshow.

The Bard and Gemini transition

Bard absorbed almost all of the consumer attention that AI Test Kitchen had been built to capture. It was a more familiar chat interface, it was not narrowly task-scoped, and it was tied directly to Google Search and Workspace integrations as those rolled out. AI Test Kitchen kept running in parallel, but the team's energy and the model investment moved to Bard.

The rebranding to the Gemini app on February 8, 2024, marked the formal end of Bard as a separate brand. With the Gemini family of models (announced December 6, 2023) underpinning the consumer chatbot, Google no longer needed a separate experimental venue for LaMDA-era demos. New experimental work went into Google Labs (launched May 10, 2023 at labs.google), into Google AI Studio for developers, or directly into Gemini's own experimental features tab.

MusicLM in AI Test Kitchen did get a successor of sorts. In labs.google, Google launched MusicFX in early 2024, a more polished text-to-music tool with longer outputs, more controls, and SynthID watermarking. The underlying research line evolved into Lyria, a model later integrated into YouTube Shorts and other Google products, with Lyria 2 following in 2024-2025.

Lasting impact

It is easy to dismiss AI Test Kitchen as a footnote, the cautious thing Google did just before everyone else stopped being cautious. That underrates it. A few specific things came out of it that mattered.

The waitlist plus rotating-demo model became a template. Anthropic's Claude had a waitlist, OpenAI ran private previews for GPT-4 vision and Sora, and almost every major lab today gates new modalities behind some version of "register your interest." That pattern was not invented at AI Test Kitchen, but the app helped normalise it for consumer-facing generative AI in 2022.

MusicLM became Lyria, which became a real product line. The clips collected from AI Test Kitchen testers were among the first large samples of human preference data on text-to-music output, and that work fed directly into Google DeepMind's later commercial music generation tools.

The LaMDA demos quietly proved out feedback infrastructure that Bard, and then Gemini, would lean on heavily. The thumbs-up and thumbs-down buttons on Gemini today are direct descendants of the "nice, offensive, off topic, or untrue" rating buttons in AI Test Kitchen.

And then there is the harder lesson. Internally at Google, AI Test Kitchen was used during 2022 and 2023 as evidence that the company could be both bold and careful. After ChatGPT, the company's executives concluded that careful was not enough, and they moved fast. AI Test Kitchen sits awkwardly in that arc. It is what Google chose to do when it had time, and it is what Google stopped doing when it ran out of time.

Comparison with similar preview platforms

AI Test Kitchen sat in a small but growing category of vendor-run experimental venues for generative AI. The table below compares it with adjacent products from Google and competitors.

Platform	Vendor	Launched	Access model	Notable features	Status (2026)
AI Test Kitchen	Google	2022	Waitlist, free, mobile then web only	Rotating demos for LaMDA, Imagen, MusicLM	Mobile apps delisted Aug 2023; web reduced to MusicLM, largely superseded
Bard / Gemini app	Google	2023	Open access (waitlist briefly)	General chatbot, Workspace and Search integrations	Active, primary Google consumer AI surface
ChatGPT	OpenAI	November 2022	Open access, free tier and paid	General chatbot, plugins, Code Interpreter, GPTs	Active, dominant consumer LLM product
Claude (claude.ai)	Anthropic	March 2023 (claude.ai web app from July 2023)	Open access, free tier and paid	General chatbot, long context, Artifacts, Projects	Active
Google Labs (labs.google)	Google	May 2023	Open access, mostly free	Umbrella for NotebookLM, ImageFX, MusicFX, VideoFX, Whisk	Active
Google AI Studio	Google	2023	Free with Google account, developer focus	Direct API access to Gemini, prompt design tools	Active
Microsoft Copilot Labs	Microsoft	2023	Open access	Experimental Copilot features	Active
Meta AI Studio	Meta	2024	Open access	Build custom AI characters	Active

Google Labs, in particular, picked up most of what AI Test Kitchen was supposed to do. Where AI Test Kitchen had to manage a waitlist, gate by region, and ship updates through three platforms, Labs was open by default, web only, and easier to iterate. NotebookLM, ImageFX, MusicFX, VideoFX, and Whisk all live there now. The lineage from AI Test Kitchen's MusicLM demo to MusicFX to Lyria 2 is the cleanest example of the original idea actually working: ship a research demo, gather feedback, productise.

Successor experimental venues

Google did not replace AI Test Kitchen with a single thing. Instead, the work split across several surfaces, each aimed at a different audience.

Google Labs at labs.google is the closest direct successor for consumer-facing experiments. It hosts MusicFX, ImageFX, VideoFX, NotebookLM, Whisk, and a rotating set of others. Google AI Studio is the developer-facing equivalent, with direct prompt-level access to the Gemini model family including Gemini 3 Pro and on-device variants like Gemini Nano. The Gemini app itself ships experimental features behind a Labs toggle for paid users. And specialised research efforts, such as Gemini Robotics, have their own previews.

None of these inherited the original AI Test Kitchen brand, and that is probably deliberate. By 2024, Google had a clearer story to tell about Gemini as a product family, and a separate "experimental playground" name would have muddied it.

Cultural context

AI Test Kitchen launched into one of the most peculiar moments in the public conversation about AI. The Blake Lemoine LaMDA sentience controversy broke in June 2022, just weeks after the app was announced. Lemoine published transcripts in which LaMDA appeared to discuss its own feelings, and the Washington Post ran a profile that put the question of machine consciousness on front pages worldwide. Most AI researchers rejected the sentience claim outright. Yann LeCun, Meta's chief AI scientist, said the relevant neural networks were nowhere near capable of true intelligence.

The controversy nonetheless coloured how AI Test Kitchen was perceived. Reviewers came to the demos already primed to ask whether the model was "alive," and Google had to repeatedly clarify that LaMDA was a language model that produced statistically likely text, not a being. The careful, scoped demos in the app were partly a response to exactly that perception risk.

In parallel, the broader debate about AI safety was shifting. The Center for AI Safety statement and similar high-profile letters about existential risk were still months away in mid-2022, but the foundations were being laid. AI Test Kitchen represented one model for handling the tension: gate access, gather feedback, do not ship until you are confident. ChatGPT represented the opposite. The industry has spent the years since arguing about which approach is better, and the honest answer is that neither side fully won.

References

Google Blog. "Join us in the AI Test Kitchen." August 25, 2022. https://blog.google/technology/ai/join-us-in-the-ai-test-kitchen/
9to5Google. "AI Test Kitchen puts the power of Google's language processing in your hands." May 11, 2022. https://9to5google.com/2022/05/11/ai-test-kitchen-language-processing-app/
TechCrunch. "Google details its latest language model and AI Test Kitchen, a showcase for AI research." May 11, 2022. https://techcrunch.com/2022/05/11/google-details-its-latest-language-model-and-ai-test-kitchen-a-showcase-for-ai-research/
TechCrunch. "Google's new app lets you test experimental AI systems like LaMDA." August 25, 2022. https://techcrunch.com/2022/08/25/googles-new-app-lets-you-experimental-ai-systems-like-lamda/
9to5Google. "Google opens AI Test Kitchen waitlist registration as gradual US rollout starts." August 25, 2022. https://9to5google.com/2022/08/25/google-ai-test-kitchen/
Engadget. "Google's AI Test Kitchen lets you experiment with its natural language model." May 11, 2022. https://www.engadget.com/google-ai-test-kitchen-lamda-2-experiments-180212404.html
9to5Google. "AI Test Kitchen adding text-to-image demos as Google releases short stories co-written with AI." November 2, 2022. https://9to5google.com/2022/11/02/ai-test-kitchen-season-2/
Google Blog. "How to try MusicLM from Google's AI Test Kitchen." May 10, 2023. https://blog.google/technology/ai/musiclm-google-ai-test-kitchen/
TechCrunch. "Google makes its text-to-music AI public." May 10, 2023. https://techcrunch.com/2023/05/10/google-makes-its-text-to-music-ai-public/
9to5Google. "Google delists AI Test Kitchen app on Android and iOS." August 1, 2023. https://9to5google.com/2023/08/01/google-delists-ai-test-kitchen-app/
TechCrunch. "Google pulls its AI Test Kitchen app from Play Store and App Store." August 2, 2023. https://techcrunch.com/2023/08/02/google-pulls-its-ai-test-kitchen-app-from-play-store-and-app-store/
Agostinelli, A., Denk, T. I., Borsos, Z., Engel, J., Verzetti, M., Caillon, A., Huang, Q., Jansen, A., Roberts, A., Tagliasacchi, M., Sharifi, M., Zeghidour, N., and Frank, C. "MusicLM: Generating Music From Text." arXiv:2301.11325, January 26, 2023. https://arxiv.org/abs/2301.11325
Wikipedia. "LaMDA." https://en.wikipedia.org/wiki/LaMDA
Wikipedia. "Gemini (chatbot)." https://en.wikipedia.org/wiki/Google_Bard
Voicebot.ai. "Google Launches AI Test Kitchen for LaMDA Conversational AI." August 29, 2022. https://voicebot.ai/2022/08/29/google-launches-ai-test-kitchen-for-lamda-conversational-ai/
MIT Technology Review. "Google just launched Bard, its answer to ChatGPT, and it wants you to make it better." March 21, 2023. https://www.technologyreview.com/2023/03/21/1070111/google-bard-chatgpt-openai-microsoft-bing-search/
Google Blog. "Try ImageFX and MusicFX, our newest generative AI tools in Labs." 2024. https://blog.google/technology/ai/google-labs-imagefx-textfx-generative-ai/
