Amazon Nova Act
Last reviewed
Sources
5 citations
Review status
Source-backed
Revision
v1 · 1,561 words
Amazon Nova Act is an agentic AI model and software development kit (SDK) from Amazon for building AI agents that reliably perform multi-step actions inside a web browser. It is a computer-use agent trained to break complex web workflows into small, testable "atomic" commands and execute them on real websites; the accompanying SDK lets developers interleave those commands with ordinary Python code, assertions, and tests. Amazon announced Nova Act as a research preview on March 31, 2025, as the first public product from the Amazon AGI SF Lab, the San Francisco research group then led by David Luan.[1][2] The technology also underpins action-taking capabilities in Alexa+, Amazon's revamped generative-AI assistant.[2] On December 2, 2025, Amazon made Nova Act generally available as a managed AWS service powered by a custom model from the Amazon Nova family.[3]
Nova Act addresses a recurring weakness of large language model agents: while models can describe a plan to navigate a website, they frequently fail partway through a long sequence of clicks, form fills, and page transitions. Rather than asking a single prompt to "complete the whole task," Nova Act encourages developers to decompose a workflow into discrete, prescriptive actions such as search, checkout, or "pick a date on the calendar," each of which the model is optimized to execute reliably.[1] These atomic actions can then be chained, wrapped in conditional logic, and verified with assertions, which makes long workflows more dependable and easier to debug.
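The decomposition described above can be sketched in plain Python. This is a minimal illustration of the pattern, not the Nova Act SDK: `Actor`, `ScriptedActor`, `Step`, and `run_workflow` are hypothetical names, the agent is a scripted stand-in that returns canned strings, and the prompts and page states are invented for the example.

```python
from typing import Callable, Protocol


class Actor(Protocol):
    """Anything that can carry out one natural-language browser command."""

    def act(self, prompt: str) -> str: ...


class ScriptedActor:
    """Hypothetical stand-in for a browser agent; returns canned page states."""

    def __init__(self, responses: dict[str, str]) -> None:
        self.responses = responses
        self.log: list[str] = []

    def act(self, prompt: str) -> str:
        self.log.append(prompt)
        return self.responses.get(prompt, "unknown")


# Each step pairs one atomic command with a check on its outcome.
Step = tuple[str, Callable[[str], bool]]


def run_workflow(agent: Actor, steps: list[Step]) -> None:
    """Run steps in order and stop at the first one whose check fails."""
    for number, (prompt, check) in enumerate(steps, start=1):
        outcome = agent.act(prompt)
        if not check(outcome):
            raise AssertionError(
                f"step {number} failed: {prompt!r} -> {outcome!r}"
            )


steps: list[Step] = [
    ("search for a coffee maker", lambda out: out == "results page"),
    ("select the first result", lambda out: out == "product page"),
    ("add the item to the cart", lambda out: out == "cart updated"),
]

agent = ScriptedActor({
    "search for a coffee maker": "results page",
    "select the first result": "product page",
    "add the item to the cart": "cart updated",
})
run_workflow(agent, steps)
print(agent.log)
```

Because every command has its own check, a failure is reported with the step number and the prompt that caused it, which is the debugging benefit the decomposed approach is meant to provide.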
Amazon framed the project around reliability on real web tasks rather than scripted demonstrations. The company argued that agents need to be "trusted to do the right thing" before they can be deployed in production, and positioned Nova Act as a step toward that goal.[1] The model emerged from Amazon's AGI organization, the same group that released the Amazon Nova foundation models. At launch it was offered separately from the Amazon Bedrock model catalog: the research-preview SDK used its own credentials rather than AWS accounts.[1]
Nova Act is the first public product to come out of the Amazon AGI SF Lab, a dedicated research team based in San Francisco that Amazon established to develop foundational capabilities for AI agents that can take actions in the digital and physical worlds.[2] At launch the lab was co-led by David Luan, who held the title of vice president of Autonomy, and Pieter Abbeel.[2] Both are former OpenAI researchers: Luan previously ran research and engineering at OpenAI and led Google's large language model effort before co-founding the agent startup Adept in 2022, while Abbeel co-founded the robotics company Covariant.[2]
Much of the lab's leadership and technical talent joined Amazon in 2024 through an "acqui-hire" arrangement that brought Luan and other senior figures from Adept into the company.[2] In February 2026, Luan announced on LinkedIn that he was leaving Amazon "to cook up something new," marking another departure from the high-profile Adept deal. Following his exit, AWS executive Peter DeSantis took over Amazon's efforts to build frontier AI models; Rohit Prasad, who had overseen the broader AGI organization, had already left at the end of 2025.[4]
Nova Act has two parts: an action model and an SDK.
The model is a multimodal system further trained to plan and execute multi-step actions in a browser. Amazon describes it as optimized for high reliability on atomic actions such as searching for an item in a catalog or list, and it was trained using reinforcement learning across diverse environments as the early phase of a longer training curriculum.[1]
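One way to see why per-action reliability matters is to look at how it compounds over a workflow. The sketch below is an illustration only: it assumes every step succeeds or fails independently, and the per-step rates of 0.90 and 0.99 are hypothetical values chosen for the example, not figures reported by Amazon. The function name `workflow_success_rate` is likewise an assumption.

```python
def workflow_success_rate(per_step: float, steps: int) -> float:
    """Chance that every step succeeds, assuming the steps are independent."""
    if not 0.0 <= per_step <= 1.0:
        raise ValueError("per_step must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must not be negative")
    return per_step ** steps


# Hypothetical per-step rates, compared across workflow lengths.
for per_step in (0.90, 0.99):
    for steps in (5, 10, 20):
        rate = workflow_success_rate(per_step, steps)
        print(f"per-step {per_step:.2f}, {steps:2d} steps -> {rate:.3f}")
```

Under that independence assumption, ten steps at 0.90 each all succeed only about 35 percent of the time, against about 90 percent at 0.99 each, so small gains on individual actions translate into large gains on long workflows.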
The SDK lets developers build browser-automation agents using a hybrid of natural-language instructions and code. Because the natural-language act() calls live inside Python, developers can interleave them with deterministic logic, tests, breakpoints, and assertions, run agents in parallel, and operate in headless mode for background automation.[1] The SDK integrates with Playwright, the browser-automation framework, so developers can drop down to direct browser manipulation (for example, to enter passwords or handle steps that should not be left to the model) and call conventional APIs where those are more reliable than acting through the user interface.[1] Example use cases highlighted at launch included submitting an out-of-office request in an internal system, ordering salads from Sweetgreen, and making dinner reservations.[1][2]
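Running agents in parallel can be sketched with Python's standard `concurrent.futures` module. Everything agent-specific here is a hypothetical stand-in: `FakeSession`, `open_fake_session`, `run_task`, and `run_in_parallel` are illustrative names, and the session simply echoes its prompt. With the actual SDK, `open_session` would wrap the SDK's own session object, whose constructor arguments are not shown because the source does not specify them.

```python
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager
from typing import Callable, ContextManager, Iterator, Protocol


class Session(Protocol):
    """One browser session that accepts natural-language commands."""

    def act(self, prompt: str) -> str: ...


class FakeSession:
    """Hypothetical stand-in for a browser session; it only echoes the prompt."""

    def __init__(self, site: str) -> None:
        self.site = site

    def act(self, prompt: str) -> str:
        return f"{self.site}: completed '{prompt}'"


@contextmanager
def open_fake_session(site: str) -> Iterator[FakeSession]:
    session = FakeSession(site)
    try:
        yield session
    finally:
        pass  # a real session would close its browser here


SessionFactory = Callable[[str], ContextManager[Session]]


def run_task(open_session: SessionFactory, site: str, prompt: str) -> str:
    """Open one session, run one command, and always close the session."""
    with open_session(site) as session:
        return session.act(prompt)


def run_in_parallel(
    open_session: SessionFactory,
    sites: list[str],
    prompt: str,
    max_workers: int = 4,
) -> dict[str, str]:
    """Run the same command against several sites, one session per thread."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            site: pool.submit(run_task, open_session, site, prompt)
            for site in sites
        }
        return {site: future.result() for site, future in futures.items()}


results = run_in_parallel(
    open_fake_session,
    ["shop-a.example", "shop-b.example"],
    "find the return policy",
)
print(results)
```

Giving each thread its own session keeps browser state isolated, and collecting results by site lets ordinary Python code compare or assert on them afterwards.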
Amazon described Nova Act's results on perception and UI-grounding benchmarks, which measure how accurately a model can locate and act on screen elements, as best-in-class. On the ScreenSpot Web Text benchmark it scored 0.939, and on ScreenSpot Web Icon it scored 0.879, ahead of Anthropic's Claude 3.7 Sonnet (0.900 and 0.854) and OpenAI's Computer-Using Agent, the model behind OpenAI Operator (0.883 and 0.806).[1] On the GroundUI Web benchmark, however, Nova Act scored 0.805, slightly behind Claude 3.7 Sonnet (0.825) and OpenAI's agent (0.823).[1] Amazon also said the model exceeded 90 percent on internal evaluations of historically difficult interactions such as date pickers and dropdown menus.[1]
These figures are Amazon's own reported numbers and should be read with that attribution. Notably, Amazon did not benchmark the research preview against more common end-to-end web-agent evaluations such as WebVoyager.[1] For the later generally available service, Amazon reported strong results on the WorkArena L1, REAL Bench V1, and REAL Bench V2 agent benchmarks, and said the service reached the top of the REAL Bench leaderboard, with detailed methodology published on Hugging Face.[3]
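The gaps between the reported scores can be worked out directly. The short script below uses only the figures Amazon reported for the research preview; the names `REPORTED` and `margins` are illustrative, and a positive margin means Nova Act scored higher.

```python
# Scores as reported by Amazon for the research preview.
REPORTED = {
    "ScreenSpot Web Text": {
        "Nova Act": 0.939, "Claude 3.7 Sonnet": 0.900, "OpenAI CUA": 0.883,
    },
    "ScreenSpot Web Icon": {
        "Nova Act": 0.879, "Claude 3.7 Sonnet": 0.854, "OpenAI CUA": 0.806,
    },
    "GroundUI Web": {
        "Nova Act": 0.805, "Claude 3.7 Sonnet": 0.825, "OpenAI CUA": 0.823,
    },
}


def margins(scores: dict[str, float], subject: str = "Nova Act") -> dict[str, float]:
    """Return subject's score minus each other model's score, to 3 decimals."""
    base = scores[subject]
    return {
        name: round(base - value, 3)
        for name, value in scores.items()
        if name != subject
    }


for benchmark, scores in REPORTED.items():
    print(benchmark, margins(scores))
```

On these numbers Nova Act leads both ScreenSpot benchmarks by 0.025 to 0.073 and trails on GroundUI Web by about 0.02.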
The following table summarizes key facts and reported benchmark figures.
| Attribute | Detail |
|---|---|
| Full name | Amazon Nova Act |
| Developer | Amazon AGI SF Lab (Amazon) |
| Type | Browser-action (computer-use) AI model plus developer SDK |
| Research preview announced | March 31, 2025[1][2] |
| General availability (AWS service) | December 2, 2025[3] |
| Underlying model (GA service) | Custom Amazon Nova 2 Lite model[3] |
| Key integrations | Playwright, Python; AWS for the GA service[1][3] |
| ScreenSpot Web Text (reported) | 0.939 (vs Claude 3.7 Sonnet 0.900, OpenAI CUA 0.883)[1] |
| ScreenSpot Web Icon (reported) | 0.879 (vs Claude 3.7 Sonnet 0.854, OpenAI CUA 0.806)[1] |
| GroundUI Web (reported) | 0.805 (vs Claude 3.7 Sonnet 0.825, OpenAI CUA 0.823)[1] |
| Reliability claim (GA) | About 90 percent on UI-based customer workflows[3] |
| Notable customers | Hertz, 1Password, Sola Systems, Amazon[3] |
| Powers | Action features in Alexa+[2] |
Nova Act sits within the broader Amazon Nova family of foundation models, which Amazon introduced at AWS re:Invent in December 2024. That family includes the text and multimodal "understanding" models Nova Micro, Nova Lite, Nova Pro, and Nova Premier, along with the Nova Canvas image generator, the Nova Reel video generator, and the Nova Sonic speech model.[5] When Nova Act became generally available in December 2025, Amazon disclosed that the service is powered by a custom version of Nova 2 Lite, a small, fast multimodal model in the Nova family. The custom version is trained specifically for the agent's task and delivered as a unified system spanning the model, SDK, orchestrator, and browser controllers.[3]
Nova Act is also a foundation for Amazon's consumer agent ambitions. The company has said the technology helps power Alexa+, the generative-AI version of its Alexa voice assistant, where it is used to navigate the web in a self-directed way and complete tasks when direct API integrations are unavailable.[1][2] This connects a developer-facing research tool to a mass-market product used by millions of households.
Nova Act represents a concrete entry by Amazon into the competitive market for computer-use and web-automation agents, alongside OpenAI Operator and its Computer-Using Agent, Anthropic's computer use capability, and Google DeepMind's Project Mariner.[2] Rather than competing primarily on flashy autonomous demos, Amazon's pitch centered on engineering for reliability: decomposing tasks into testable units, blending natural language with code, and giving developers the tools to verify behavior before deployment.
That emphasis carried into the product's evolution. Over 2025 the offering moved from a research-preview SDK to an IDE extension and then, on December 2, 2025, to a generally available AWS service that Amazon said delivers roughly 90 percent reliability on UI-based workflows. Early adopters included the car-rental company Hertz, which reported accelerating QA testing by about five times; the password manager 1Password; and the automation firm Sola Systems, which runs hundreds of thousands of operations per month.[3] At the same time, the February 2026 departure of David Luan and the reorganization of Amazon's AGI leadership underscored the volatility around the high-profile Adept hires even as the technology they helped build shipped into production.[4] As an example of the industry shift from chat-based assistants toward agents that take real actions, Nova Act illustrates both the promise and the engineering challenges of making web automation dependable enough for production use.