Playwright MCP
Last reviewed
May 24, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v1 · 3,573 words
Playwright MCP is an open-source Model Context Protocol (MCP) server developed and maintained by Microsoft that lets large language model (LLM) agents control real web browsers through the Playwright cross-browser automation library. Released by the Playwright team on 22 March 2025 at the GitHub repository microsoft/playwright-mcp, it exposes a set of MCP tools for navigation, clicking, typing, form filling, screenshot capture, network interception, and reading page state.[^1][^2] The server distinguishes itself from screenshot-only desktop control agents by operating primarily on the browser accessibility tree, which produces compact structured snapshots that LLMs can parse with a fraction of the tokens required for full DOM dumps or pixel-based vision.[^3] As of May 2026 the package is published on npm as @playwright/mcp and is supported by clients including Claude Code, Claude Desktop, Cursor, Windsurf, and Microsoft Visual Studio Code in GitHub Copilot agent mode.[^1][^4]
The Model Context Protocol is an open standard announced by Anthropic on 25 November 2024 to give LLM applications a uniform way to discover and call external tools, fetch data, and read prompts from outside systems.[^5] The initial release shipped with reference servers for Google Drive, Slack, GitHub, Git, Postgres, and Puppeteer along with SDKs in Python, TypeScript, C#, and Java.[^5][^6] OpenAI announced official adoption of MCP across its Agents SDK, Responses API, and ChatGPT desktop app in March 2025, and Google followed with native support in Gemini and a set of managed MCP servers for Maps, BigQuery, Compute Engine, and Kubernetes Engine later in 2025.[^6][^7] By the end of 2025 Anthropic donated the protocol to the Linux Foundation through the new Agentic AI Foundation, with Block and OpenAI as co-founders.[^6]
Playwright itself predates the MCP project by several years. Microsoft released the Playwright 1.0 cross-browser test automation framework on 6 May 2020, after a public preview that began on 31 January 2020.[^8] The library was built by a team that included engineers who had previously worked on Google's Puppeteer headless Chrome project; the design goal was a single API that drives Chromium, Firefox, and WebKit with the same code, with built-in handling of timing flakiness, network interception, and parallel test execution.[^8] By 2026 the microsoft/playwright repository had passed 89,000 GitHub stars, with bindings for JavaScript, TypeScript, Python, .NET, and Java.[^9]
The combination of these two threads, an open agent protocol and a mature cross-browser driver, made an official MCP wrapper around Playwright a natural step. Microsoft's Playwright team shipped the first public version of playwright-mcp on 22 March 2025, three days before independent developer Simon Willison covered the launch on his weblog and described the implementation as "pretty fascinating" because Claude Desktop could now drive a browser by reading the Chrome accessibility tree rather than by analyzing screenshots.[^2][^10]
| Date | Version | Notes |
|---|---|---|
| 22 March 2025 | First public release | Initial publish of @playwright/mcp on npm; microsoft/playwright-mcp repository made public.[^1][^11] |
| March-April 2025 | v0.0.x point releases | Rapid iteration on tool naming, snapshot format, and configuration flags. |
| 6 February 2026 | v0.0.64 | Expanded tab management and vision tools.[^11] |
| 14 February 2026 | v0.0.67 / v0.0.68 | Added storage-state and tracing utilities.[^11] |
| 30 March 2026 | v0.0.69 | Documentation reorganization.[^11] |
| 7 May 2026 | v0.0.75 | Latest version at time of writing.[^11] |
The version numbers stayed in the 0.0.x range through 65 releases, reflecting a deliberate decision by the maintainers to keep the package marked as pre-1.0 while the tool surface stabilized.[^11] Despite the pre-release labeling, the GitHub repository reached more than 32,000 stars within roughly a year of launch.[^12]
Playwright MCP is a Node.js process. When an MCP client launches it, the server creates a Playwright Browser instance, exposes a list of MCP tools over a transport, and translates each tool call into a Playwright API invocation against an open page or tab.[^1][^13]
The server supports two transports drawn from the MCP specification.
- stdio (the default). The MCP client spawns the server as a child process and exchanges messages over standard input and output. A typical client entry uses command: "npx" and args: ["@playwright/mcp@latest"].[^4][^13]
- HTTP with server-sent events (SSE). The server runs as a standalone process when started with the --port flag. The server then listens on /mcp and /sse endpoints, accepts POST requests from clients, and streams responses back as SSE events. Each client gets its own session identifier, allowing one server process to handle multiple concurrent agents.[^13]

The transport abstraction means the same tool implementations run whether the agent is local or remote, which is how the official Playwright MCP Chrome Extension attaches to an existing browser session over a WebSocket bridge.[^1]
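As a rough illustration of what travels over either transport, the sketch below builds the JSON-RPC 2.0 message an MCP client would send to invoke a tool. The `tools/call` method and the `name`/`arguments` parameter shape follow the MCP specification. The helper name `build_tool_call`, the example URL, and the `url` argument name for browser_navigate are assumptions for illustration and should be checked against the server's published tool schema.

```python
import json
from itertools import count

# Monotonic request ids; JSON-RPC 2.0 responses echo the id of the request.
_next_id = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message.

    Hypothetical helper for illustration; it is not part of any SDK.
    """
    message = {
        "jsonrpc": "2.0",
        "id": next(_next_id),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # json.dumps emits a single line, which suits newline-delimited stdio framing.
    return json.dumps(message)


if __name__ == "__main__":
    # Assumed argument name `url`; verify against the tool's input schema.
    print(build_tool_call("browser_navigate", {"url": "https://example.com"}))
```

In practice an MCP client library produces these messages, so a helper like this is mainly useful for debugging or for reading transport logs.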
Because Playwright MCP wraps Playwright, it inherits support for Chromium, Firefox, and WebKit on Linux, macOS, and Windows.[^9][^13] By default it launches Chromium in headed mode, which is the inverse of the Playwright CLI default of headless, and it persists cookies and local storage in a user data directory between sessions so that login state survives across agent invocations.[^4][^14] Command line flags select alternative browsers, isolated incognito profiles, executable paths for branded builds of Chrome or Edge, and viewport dimensions.[^1][^14]
A defining design choice is the way the agent perceives the page.
In the default snapshot mode, the server returns the page's accessibility tree, and each element in it carries a short reference such as e1, e2, e3 that the agent uses to address that element in subsequent tool calls. The accessibility tree is the same hierarchy that screen readers and assistive technology consume, with roles, names, and parent-child relationships rather than pixel coordinates.[^3][^15]
Public benchmarks from the Playwright documentation report that a snapshot consumes roughly 200 to 400 tokens for a typical page, against several thousand for a raw DOM dump or a base64-encoded screenshot.[^3] Third-party analysis by Morph notes that an entire snapshot-driven workflow can still climb into six figures of tokens over a long session because the server returns a fresh snapshot after every action, which is why MCP usage is often paired with CLI-style on-disk snapshots for production test runs.[^16]
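The element references can be illustrated with a short sketch that indexes a snapshot by ref. The sample text below is a hand-written approximation of the snapshot format, not captured server output, and the helper names `index_refs` and `find_ref` are hypothetical. Real snapshots are nested and may carry additional attributes.

```python
import re

# Illustrative snapshot text; the real format may differ in detail.
SAMPLE_SNAPSHOT = """\
- heading "Flights" [level=1] [ref=e1]
- textbox "Where from?" [ref=e23]
- textbox "Where to?" [ref=e24]
- button "Search" [ref=e30]
"""

# One list item per line: role, optional quoted name, then a [ref=eN] marker.
_LINE = re.compile(
    r'^[ \t]*-[ \t]+(\w+)(?:[ \t]+"([^"]*)")?.*?\[ref=(e\d+)\]',
    re.MULTILINE,
)


def index_refs(snapshot: str) -> dict:
    """Map each ref to its (role, accessible name) pair."""
    return {ref: (role, name) for role, name, ref in _LINE.findall(snapshot)}


def find_ref(refs: dict, role: str, name: str):
    """Return the first ref whose role and name match, or None."""
    for ref, (r, n) in refs.items():
        if r == role and n == name:
            return ref
    return None


if __name__ == "__main__":
    refs = index_refs(SAMPLE_SNAPSHOT)
    print(find_ref(refs, "button", "Search"))
```

Addressing elements by role and name, then resolving to a ref, mirrors how the model reads the tree: it never needs pixel coordinates or CSS selectors.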
The tool catalog in v0.0.75 contains about 40 named tools grouped into several capability families. The Playwright documentation organizes them into four buckets: Core, Network and Storage, Testing and Debugging, and Vision.[^14][^15]
| Tool | Purpose |
|---|---|
| browser_navigate | Open a URL in the active tab.[^14] |
| browser_navigate_back, browser_navigate_forward | Move through browser history.[^14] |
| browser_snapshot | Capture the accessibility tree of the current page and return it with element refs.[^15] |
| browser_click | Click an element identified by its snapshot ref.[^15] |
| browser_type | Type text into a focused or referenced field, with submit-on-enter as an option.[^15] |
| browser_fill_form | Fill a structured form mapping fields to values in a single call.[^14] |
| browser_press_key | Send a keyboard key, including modifier combinations.[^14] |
| browser_hover | Hover over a referenced element.[^14] |
| browser_drag | Drag from one ref to another.[^14] |
| browser_select_option | Pick a value in a select element by label or value.[^14] |
| browser_handle_dialog | Accept or dismiss native browser dialogs.[^14] |
| browser_wait_for | Block until a referenced element appears, disappears, or a text condition is met.[^14] |
browser_tab_new, browser_tab_close, browser_tab_list, and browser_tab_select give the agent control over multiple tabs in the same browser instance, which is useful for cross-site workflows like comparing prices on two storefronts.[^14]
browser_take_screenshot returns a PNG or JPEG of the current viewport or a specific element. browser_pdf_save writes the current page to PDF, primarily for archival or test reporting. The vision profile adds coordinate-based tools browser_mouse_move_xy, browser_mouse_click_xy, and browser_mouse_drag_xy for cases where the accessibility tree is not enough.[^14][^15]
| Tool | Purpose |
|---|---|
| browser_network_requests | List requests issued since the last snapshot.[^14] |
| browser_set_cookies / browser_get_cookies | Read and write cookies.[^14] |
| browser_set_local_storage | Inject local storage values, often used to bypass login flows in test harnesses.[^14] |
| browser_route | Mock or intercept specific network responses, useful for offline testing.[^14] |
browser_console_messages returns recent console output. browser_evaluate runs arbitrary JavaScript in the page context and returns the serializable result. browser_generate_playwright_test emits a runnable Playwright test script that reproduces the actions the agent has taken so far, which converts an exploratory session into a regression test.[^14]
The exact tool names and shapes have evolved through the v0.0.x series. Older clients and tutorials sometimes refer to tools by earlier names; the canonical list lives in the tools directory of the GitHub repository and in the Playwright MCP documentation site.[^1][^14]
An agent that wants to drive Playwright MCP follows the standard MCP client pattern: register the server in a configuration file, list the available tools at session start, and call them through the model's tool-use loop.[^4][^17]
For Claude Code, the registration is a single shell command that writes to ~/.claude.json:

    claude mcp add playwright npx @playwright/mcp@latest

For most other clients, the configuration is a JSON snippet that names the server and gives the command to spawn it.[^4][^14]

    {
      "mcpServers": {
        "playwright": {
          "command": "npx",
          "args": ["@playwright/mcp@latest"]
        }
      }
    }

This same snippet works for Cursor, Windsurf, Claude Desktop, and VS Code's GitHub Copilot agent mode, which Microsoft added in VS Code 1.99 alongside MCP client support in March 2025.[^14][^18]
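The command-line flags described earlier can be folded into the same entry. The following is a minimal sketch, assuming the flag spellings `--browser`, `--headless`, `--isolated`, and `--executable-path`. These may change across the 0.0.x series and are worth confirming with `npx @playwright/mcp@latest --help`. The helper name `playwright_mcp_entry` is hypothetical.

```python
import json


def playwright_mcp_entry(browser=None, headless=False, isolated=False,
                         executable_path=None):
    """Build one `mcpServers` entry for Playwright MCP.

    Hypothetical helper; flag names are assumptions to verify against --help.
    """
    args = ["@playwright/mcp@latest"]
    if browser:
        args += ["--browser", browser]      # e.g. "firefox" or "webkit"
    if headless:
        args.append("--headless")           # the server is headed by default
    if isolated:
        args.append("--isolated")           # do not reuse the persistent profile
    if executable_path:
        args += ["--executable-path", executable_path]
    return {"command": "npx", "args": args}


if __name__ == "__main__":
    config = {
        "mcpServers": {
            "playwright": playwright_mcp_entry("firefox", headless=True)
        }
    }
    print(json.dumps(config, indent=2))
```

Generating the entry from options rather than editing JSON by hand keeps the flag list in one place when the same server is registered in several clients.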
A typical agent turn looks like this. The user asks the model to find the cheapest flight from San Francisco to Tokyo next month. The model calls browser_navigate with the Google Flights URL, then browser_snapshot to read the page. The snapshot returns a YAML accessibility tree with refs for the origin input (ref=e23), the destination input (ref=e24), the date pickers, and the search button. The model calls browser_type with ref=e23 and text SFO, then browser_type on ref=e24 with text Tokyo, then browser_click on the search button. After the page loads, another browser_snapshot returns the result list, and the model reads off the prices. Because refs are scoped to a single snapshot, the model is expected to take a fresh snapshot after any action that changes the DOM.[^15][^19]
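The walkthrough above can be sketched as a sequence of tool calls behind a small guard that refuses to use a ref once the page may have changed. This is an illustration under stated assumptions, not the server's or any client's actual logic: `call_tool` stands in for whatever function a client uses to invoke MCP tools, the set of invalidating tools is a guess, the URL and the search-button ref `e30` are placeholders, and real tool schemas may require extra arguments such as a human-readable element description.

```python
from typing import Any, Callable, Dict

ToolCaller = Callable[[str, Dict[str, Any]], Any]

# Assumption: these tools may replace the page or its DOM, so refs taken from
# an earlier snapshot should not be trusted afterwards.
INVALIDATING_TOOLS = {"browser_navigate", "browser_click"}


class StaleRefError(RuntimeError):
    """A ref was used without a snapshot taken since the last page change."""


class SnapshotGuard:
    def __init__(self, call_tool: ToolCaller) -> None:
        self._call_tool = call_tool
        self._fresh = False  # no snapshot has been taken yet

    def call(self, tool: str, arguments: Dict[str, Any]) -> Any:
        if "ref" in arguments and not self._fresh:
            raise StaleRefError(
                f"{tool} used ref {arguments['ref']!r} without a fresh snapshot"
            )
        result = self._call_tool(tool, arguments)
        if tool == "browser_snapshot":
            self._fresh = True
        elif tool in INVALIDATING_TOOLS:
            self._fresh = False
        return result


def search_flights(guard: SnapshotGuard) -> Any:
    """Replay the walkthrough; the URL and ref e30 are placeholders."""
    guard.call("browser_navigate", {"url": "https://www.google.com/travel/flights"})
    guard.call("browser_snapshot", {})
    guard.call("browser_type", {"ref": "e23", "text": "SFO"})
    guard.call("browser_type", {"ref": "e24", "text": "Tokyo"})
    guard.call("browser_click", {"ref": "e30"})
    # The click may load a new page, so take a fresh snapshot for the results.
    return guard.call("browser_snapshot", {})


if __name__ == "__main__":
    log = []
    guard = SnapshotGuard(lambda tool, args: log.append(tool) or "snapshot-text")
    search_flights(guard)
    print(log)
```

Raising an error instead of silently re-snapshotting is deliberate: a new snapshot may assign different refs, so the model has to read it before choosing the next target.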
The original Anthropic launch shipped a reference Puppeteer MCP server, which served as a proof of concept rather than a production tool.[^5] After Microsoft's official release, the community converged on Playwright MCP as the default browser server, although alternatives exist with different trade-offs.
| Server | Maintainer | Engine | Notes |
|---|---|---|---|
| Playwright MCP | Microsoft | Playwright | Official, accessibility-tree first, headed by default, 40+ tools.[^1][^14] |
| Puppeteer MCP | Anthropic reference | Puppeteer | Initial launch example, Chromium only, smaller surface.[^5] |
| ExecuteAutomation mcp-playwright | Community | Playwright | Adds API testing helpers and Cline support; predates the official server.[^20] |
| Browserbase MCP | Browserbase | Headless cloud Chrome | Runs the browser in Browserbase's hosted infrastructure for proxy and CAPTCHA support.[^21] |
| Hyperbrowser | Hyperbrowser.ai | Headless cloud | Similar cloud-runner model with stealth and session features.[^21] |
The choice between local Playwright MCP and a hosted browser MCP usually comes down to whether the agent needs the user's cookies and login state (local wins) or whether it needs an IP that is not the user's home address and resilience to bot detection (cloud wins).
Anthropic Computer Use is a tool family launched by Anthropic in public beta on 22 October 2024 that lets Claude move a mouse, type, and read the screen of an arbitrary desktop environment by analyzing screenshots.[^22] OpenAI Operator is OpenAI's competing system, released on 23 January 2025 in a research preview and built on the Computer-Using Agent model, which combines GPT-4o vision with reinforcement learning.[^23] Both work by taking screenshots and emitting coordinate-level mouse and keyboard actions, and both can drive any GUI rather than just a browser.
Playwright MCP occupies a narrower but more reliable slice of that space. By targeting only the browser and only the accessibility tree, it avoids the cost and brittleness of vision while keeping a fully open-source local-execution path. For tasks that live entirely on the web, such as form filling, data extraction from rendered pages, or end-to-end test generation, Playwright MCP is usually faster and cheaper per action than Computer Use or Operator.[^3][^16] For tasks that span native applications, file dialogs outside the browser, or operating system settings, desktop control agents remain the only option.
ChatGPT agent is OpenAI's successor product that bundles the Operator browser, a code interpreter, and tool use into a single ChatGPT mode. It is closer in scope to a full virtual assistant than to a developer tool like Playwright MCP, but the underlying browser-driving primitive is similar.
browser-use is a Python library by Magnus Müller and Gregor Žunić that wraps Playwright in an LLM-driven control loop, originally released in late 2024. It is not an MCP server: a Python program embeds an LLM client, calls browser-use directly, and runs entirely in the developer's process.[^24] Playwright MCP and browser-use are sibling projects rather than competitors: developers who want a programmable Python loop pick browser-use, and developers who want any MCP-aware client to drive the browser pick Playwright MCP. Both ultimately call Playwright under the hood.
AI browser agents form the broader category. The space includes hosted products such as Adept's Workflow Language for action agents, Reka's browser tools, Arc Browser's smart features, and a long tail of open-source experiments. Playwright MCP is the lowest-common-denominator infrastructure piece that most of these can use as a backend if they choose to.
The Playwright MCP repository and developer write-ups describe a recurring set of applications.
- Test generation. After walking through a flow in the browser, the agent calls browser_generate_playwright_test to emit a Playwright test() block that captures the steps as a runnable assertion. This converts manual exploration into a regression test without the engineer writing selectors by hand.[^14][^25]

Several caveats appear repeatedly in third-party reviews and the documentation itself.
- Runtime version. Outdated Node.js runtimes can fail with a performance is not defined error at startup.[^25]
- Headed default. Because the server launches a headed browser by default, machines without a display, such as CI runners and containers, need to pass --headless or run inside a virtual framebuffer like Xvfb.[^14]

Playwright MCP became one of the most downloaded MCP servers within weeks of release. By mid-2026 it appeared at or near the top of MCP server directory rankings on aggregators such as PulseMCP, which estimated tens of millions of monthly visitors to its listing.[^12] Developer adoption was driven in part by direct integration paths in widely used clients. Anthropic published instructions for installing Playwright MCP in Claude Code, and Microsoft made it the default browser server for the GitHub Copilot agent mode shipped in VS Code 1.99 in March 2025.[^4][^18]
Reviews focused on three points. First, the accessibility-tree approach was widely judged to be the right trade-off for browser automation, with several authors noting that earlier attempts at vision-based web agents had failed in production because of token cost and screenshot noise.[^2][^16] Second, the snapshot reference system was praised for giving agents deterministic targets, an improvement over earlier patterns that asked the model to write CSS selectors directly.[^15] Third, critics pointed to enterprise gaps including CAPTCHA, multi-factor authentication, and the lack of a built-in workflow scheduler, and argued that Playwright MCP works best as plumbing inside a larger agent rather than as a complete automation product.[^26]