Project Astra

Updated Jun 28, 2026

Project Astra is a research prototype from Google DeepMind that explores what a universal AI assistant might look like: a single agent that can see and hear the world in real time through a device camera and microphone, hold a natural low-latency spoken conversation, remember what it has just encountered, and take actions on a user's behalf. It is built on Google's Gemini models and was first shown at Google I/O in May 2024. Rather than shipping as a standalone app, Project Astra serves as a testbed whose capabilities feed into Google products such as Gemini Live and prototype Android XR glasses.^[1]^[2]

What is Project Astra?

Project Astra is Google DeepMind's research effort toward a universal AI assistant, meaning one agent that perceives its surroundings continuously and reasons about them the way a person does, instead of answering isolated text prompts. By early 2024, Google had consolidated much of its AI research under Google DeepMind and was racing to turn its Gemini family of multimodal models into consumer products. The company framed Project Astra as the next step in that effort. DeepMind chief executive Demis Hassabis set out the long-term ambition plainly: "We've always wanted to build a universal agent that will be useful in everyday life. Imagine agents that can see and hear what we do, better understand the context we're in and respond quickly in conversation, making the pace and quality of interactions feel much more natural."^[1] TechCrunch characterized the project as "a new initiative within DeepMind to create AI-powered apps and 'agents' for real-time, multimodal understanding."^[1]

The timing was pointed. OpenAI had unveiled its multimodal GPT-4o model a day before Google I/O, demonstrating a fast, spoken, camera-aware assistant of its own, so Project Astra arrived as a direct answer to that announcement.^[3]

What was shown at the I/O 2024 demonstration?

Hassabis introduced Project Astra during the Google I/O keynote on May 14, 2024.^[1]^[3] The reveal centered on a single continuous video clip, which Google said was recorded in one take. In it, a person walks around an office holding up a phone with the camera running and talks to the assistant without pause.^[3]^[4]

Across the demo, Astra identifies objects the camera passes over and reasons about them on the fly. Asked what part of a setup makes sound, it picks out a speaker; when the user circles the top of the speaker and asks what that component is, it answers that the part is a tweeter. It reads and explains a snippet of code shown on a monitor, identifies a neighborhood from the view out a window, and composes a band name for a toy. The moment most often quoted came when the user asked where she had left her glasses: Astra recalled having seen them earlier on a desk, even though they were no longer in view, illustrating a short rolling visual memory of what the camera had recently captured. The clip then continued through a pair of prototype smart glasses, suggesting how the same assistant might work in a wearable form factor.^[3]^[4]

Two themes ran through the presentation. The first was low latency. DeepMind said it had built prototype agents that process information faster by continuously encoding video frames, combining the video and speech input into a timeline of events, and caching that information for efficient recall, so the assistant could respond in conversation without an awkward delay.^[1] The second was that this was research, not a product launch. Google labeled Astra a prototype and said some of its capabilities would begin reaching Gemini products later in the year.^[1]^[2]
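
The timeline approach described above can be pictured with a small sketch. DeepMind has not published the implementation, so everything here is an assumption made for illustration: the `Event` type and the `build_timeline` and `last_seen` functions are hypothetical names, and short text labels stand in for encoded video frames and transcribed speech.

```python
from dataclasses import dataclass
from heapq import merge
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Event:
    """One timestamped observation (hypothetical structure)."""
    t: float       # seconds since the session started
    source: str    # "video" or "speech"
    content: str   # text label standing in for an encoded frame or transcript


def build_timeline(video: Iterable[Event], speech: Iterable[Event]) -> List[Event]:
    """Merge two streams, each already sorted by time, into one timeline."""
    return list(merge(video, speech, key=lambda e: e.t))


def last_seen(timeline: List[Event], obj: str, before: float) -> Optional[Event]:
    """Return the most recent video event mentioning `obj` before time `before`."""
    for event in reversed(timeline):
        if event.t < before and event.source == "video" and obj in event.content:
            return event
    return None


# Illustrative data only; not taken from the actual demo recording.
video = [
    Event(1.0, "video", "speaker on shelf"),
    Event(4.0, "video", "glasses on desk"),
    Event(9.0, "video", "whiteboard with diagram"),
]
speech = [
    Event(2.0, "speech", "what makes sound here?"),
    Event(10.0, "speech", "where did I leave my glasses?"),
]

timeline = build_timeline(video, speech)
hit = last_seen(timeline, "glasses", before=10.0)
print(hit.content if hit else "not seen")
```

Run as written, the sketch prints the earlier desk sighting even though later frames no longer show the glasses. A real system would store learned embeddings rather than text labels and would search them by similarity, not by substring match.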

How does Project Astra work?

Project Astra is built around a few interlocking ideas that distinguish it from a conventional chatbot.

Capability	What it does
Live multimodal input	Continuously processes streaming camera video and audio together rather than single images or one-off prompts.^[1]
Low-latency speech	Responds in spoken conversation at roughly the cadence of human dialogue, with the ability to be interrupted.^[1]^[5]
Memory	Maintains a rolling memory of recent events and, in later versions, longer in-session recall and some memory across past conversations.^[4]^[5]
Tool use	Calls on services such as Google Search, Lens and Maps to ground its answers.^[5]
Agentic action	In its 2025 form, can carry out multi-step tasks, navigate interfaces and control a device on the user's behalf.^[6]^[7]

The underlying intelligence comes from Google's Gemini models. At unveiling, DeepMind principal scientist Oriol Vinyals described Astra as "a real-time voice interface" with "extremely powerful multimodal capabilities combined with long context," drawing on Gemini 1.5 Pro's large context window.^[1] In December 2024 Astra was rebuilt on Gemini 2.0 Flash, part of the Gemini 2.0 model family that Google introduced for what it called the agentic era.^[5]

How does Project Astra reach Google products?

Project Astra was conceived less as something users would download than as a research engine whose features migrate into Google's existing apps once they are ready. Google has repeatedly described its consumer features as having been "first explored" in Astra.^[2]^[6]

The clearest example is Gemini Live, the conversational voice mode in the Gemini app. Google announced Gemini Live at its August 13, 2024 hardware event as a low-latency, free-flowing voice interface, initially for paying subscribers.^[8] The real-time camera and screen-sharing input that Astra demonstrated then made its way into Gemini Live: those visual features began rolling out to subscribers on Android around March 2025, were made free for Android users in April, and were extended free to both Android and iOS at I/O 2025, with the iPhone rollout starting May 20.^[9]^[10] Google described each of these visual capabilities as powered by Project Astra.^[9]

Astra also fed the broader Gemini 2.0 push into agentic AI. In the December 11, 2024 Gemini 2.0 announcement, Google detailed an upgraded Astra prototype and unveiled a sibling research project, Project Mariner, an agent that operates a web browser to complete tasks.^[5] The December update gave Astra up to 10 minutes of in-session memory plus recall of some earlier conversations, the ability to converse in multiple and mixed languages with better handling of accents, native tool use across Search, Lens and Maps, and lower-latency streaming audio.^[5] Google also began testing Astra on prototype glasses with a small group of trusted testers, alongside the Android phone testers already trying it.^[5]
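
A time-bounded memory like the 10-minute in-session window mentioned above can be sketched as a queue that evicts old entries. This is an illustrative assumption, not Google's design: the `RollingMemory` class, its methods, and the example notes are hypothetical, and only the 10-minute figure comes from the December 2024 announcement.

```python
from collections import deque
from typing import Deque, List, Tuple

RETENTION_SECONDS = 10 * 60  # the "up to 10 minutes" figure from the text


class RollingMemory:
    """Keeps only observations newer than a fixed retention window (hypothetical)."""

    def __init__(self, retention: float = RETENTION_SECONDS) -> None:
        self.retention = retention
        self._items: Deque[Tuple[float, str]] = deque()

    def add(self, t: float, note: str) -> None:
        """Record a note at time `t` in seconds; timestamps must not decrease."""
        self._items.append((t, note))
        self._evict(now=t)

    def _evict(self, now: float) -> None:
        # Drop entries from the old end until everything left is inside the window.
        while self._items and now - self._items[0][0] > self.retention:
            self._items.popleft()

    def recall(self, now: float) -> List[str]:
        """Return the notes still inside the window at time `now`."""
        self._evict(now)
        return [note for _, note in self._items]


memory = RollingMemory()
memory.add(0, "glasses on desk")
memory.add(300, "bike in hallway")
memory.add(650, "code on monitor")
print(memory.recall(now=650))
```

By the time the third note arrives at 650 seconds, the first note is more than 600 seconds old and has been dropped, so only the two later notes are recalled. Recall across separate conversations, which the announcement also mentions, would need persistent storage that this sketch does not attempt to model.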

What changed at I/O 2025?

At Google I/O on May 20, 2025, Google folded Project Astra into a larger pitch: turning the Gemini app into a universal AI assistant. Hassabis again set the goal as building "a universal AI assistant that will perform everyday tasks for us," and the company tied Astra's research to its work extending Gemini toward a "world model" that can plan and reason about the physical environment.^[2]^[6]

The 2025 updates concentrated on three areas. Voice output was upgraded to more natural-sounding native audio. Memory was extended well beyond that of the early prototype, which Google and the press noted had retained context only briefly, so that the assistant could carry details across a longer interaction; in one demo it recalled a pet's name mentioned earlier in order to recommend a suitable bike basket. Astra also gained computer control, letting it take actions, navigate apps and interfaces, look things up, and even place phone calls to businesses to complete a multi-step request such as fixing a bike.^[6]^[7]

Google said these capabilities would arrive across several surfaces over time rather than as one product: enhanced versions of Gemini Live, new experiences in Search such as a live-camera "Search Live" feature in AI Mode, an updated Live API for developers with audio-visual input and native audio output, and new form factors including Android XR glasses.^[6]^[7] On the hardware side, Google said it was developing glasses with partners Samsung and Qualcomm, and named eyewear brands Warby Parker and Gentle Monster as consumer partners to make the frames look like ordinary glasses.^[11] The DeepMind site lists a Trusted Tester program, including recruitment of blind and low-vision users, as part of how it continues to develop the prototype.^[2]

Is Project Astra available to use?

Project Astra is not a downloadable app and is not available to the general public as a standalone product. It remains a research prototype tested by a limited group, while its individual capabilities ship inside other Google products. The most widely available of those is the camera and screen-sharing feature in Gemini Live, which Google made free to Android users in April 2025 and extended to iPhone the following month.^[9]^[10] More ambitious pieces, such as full agentic computer control and the Android XR glasses, were still early and largely confined to testers as of mid-2025.^[6]^[9]

How was Project Astra received?

Project Astra was widely judged the standout of Google I/O 2024. Coverage cast it as Google's credible response to OpenAI's GPT-4o and praised the smooth, low-latency video demo, even as reporters cautioned that a polished one-take clip is not the same as a shipping product.^[3]^[4] Commentary in 2024 and 2025 tended to track the same tension: enthusiasm that Astra pointed toward a genuinely new kind of always-on, see-and-hear assistant, set against skepticism about how much of the demo would translate into reliable everyday use and on what timeline.^[3]^[6] By I/O 2025, observers noted that the once-speculative prototype was steadily materializing inside real Google products, particularly Gemini Live, even though the most ambitious pieces, namely agentic control and the smart-glasses hardware, remained early and largely confined to testers.^[6]^[9]

References

1. "Google's Gemini updates: How Project Astra is powering some of I/O's big reveals", TechCrunch, May 14, 2024.
2. "Project Astra", Google DeepMind.
3. "Google unveils Project Astra, the future of AI assistants", MobileSyrup, May 14, 2024.
4. "Project Astra AI assistant steals the show at Google I/O, and it works with smart glasses", Tom's Guide, May 14, 2024.
5. "Introducing Gemini 2.0: our new AI model for the agentic era", Google, December 11, 2024.
6. "Gemini as a universal AI assistant", Google, May 20, 2025.
7. "Project Astra comes to Google Search, Gemini, and developers", TechCrunch, May 20, 2025.
8. "Made by Google 2024: All of Google's reveals", TechCrunch, August 13, 2024.
9. "Gemini Live Astra camera & screen sharing rollout starts on Android", 9to5Google, March 22, 2025.
10. "Gemini Live camera and screen sharing more widely rolling out to iPhone", 9to5Google, May 29, 2025.
11. "Google I/O 2025 Highlights: Gemini, XR Glasses & Project Astra", Techweez, May 20, 2025.
