OpenAI Codex Cloud
Last reviewed
May 16, 2026
Sources
14 citations
Review status
Source-backed
Revision
v1 · 2,589 words
OpenAI Codex Cloud is a hosted, cloud-based software engineering agent developed by OpenAI and launched as a research preview on May 16, 2025.[1][2] The service runs at chatgpt.com/codex inside ChatGPT and lets users delegate coding tasks to an autonomous agent that operates inside a sandboxed virtual machine, with a copy of the user's GitHub repository preloaded as the working tree.[2][3] Each task runs in its own isolated container, and the agent can write features, answer questions about a codebase, fix bugs, run tests, and open pull requests for human review.[2]
Codex Cloud is the asynchronous, server-side half of OpenAI's 2025 Codex product family. It shares the Codex brand with the open-source OpenAI Codex CLI, launched a month earlier on April 16, 2025, but the two products are technically distinct: the CLI runs locally on a developer's own machine, while Codex Cloud runs inside OpenAI infrastructure and is intended for delegated work that the developer launches and walks away from.[3][4] Codex Cloud is also unrelated, beyond branding, to the original 2021 Codex model, a GPT-3 derivative that powered the first generation of GitHub Copilot.[5]
At launch Codex Cloud was powered by codex-1, a version of the o3 reasoning model fine-tuned on real-world coding tasks via reinforcement learning.[2] The product was promoted to general availability on October 6, 2025 at OpenAI DevDay, where Sam Altman also introduced a refreshed model branded GPT-5-Codex.[6][7] By March 2026 OpenAI reported that Codex, taken as a whole, had grown to over 2 million weekly active users, with usage roughly five times higher than at the start of the year.[8]
The Codex name has a layered history at OpenAI. The original Codex was a code-specialised version of GPT-3 released in July 2021. It powered the first generation of GitHub Copilot and was retired as a standalone API in March 2023, with its capabilities folded into later general-purpose models.[5] The brand sat dormant for two years before OpenAI reused it in April 2025 for a coordinated push into agentic coding.
Three separate Codex products shipped over the course of 2025. The OpenAI Codex CLI, an open-source terminal agent, launched on April 16, 2025 alongside the o3 and o4-mini reasoning models and GPT-4.1.[4] One month later, on May 16, 2025, OpenAI announced Codex Cloud, the hosted multi-agent service that this article covers.[2] A Codex IDE extension for VS Code, Cursor, and Windsurf, which connects local editors to the cloud sandbox, rolled out later in 2025.[3] The three products are meant to be complementary surfaces onto the same underlying Codex agent.
The split between local and cloud reflected an explicit philosophical bet. Local terminal agents like the CLI sit close to the developer, run inside the developer's working copy, and ask for approval before destructive actions; their value is in tight, interactive loops. Cloud agents, by contrast, are better suited to longer-running tasks where the developer wants to fire off work and come back to a finished pull request later. OpenAI framed Codex Cloud as a way to "delegate to Codex in the cloud" so it could "work on tasks in the background, including in parallel, using its own cloud environment."[3] The pitch was less about pair programming and more about treating the agent as a junior engineer to whom you assign tickets.
Codex Cloud is reachable at chatgpt.com/codex once a user signs in to a qualifying ChatGPT plan and connects a GitHub account.[2][9] The web interface lists ongoing tasks on the left side of the screen and shows the agent's working state, including terminal output, file diffs, and final pull request links, in the main pane.[2] Users can give the agent natural-language instructions such as "add a JWT-based authentication middleware" or "investigate why the staging build fails on Node 22," and the agent will plan and execute the work asynchronously.[2]
Every task runs in its own freshly provisioned container, often referred to as a microVM. The container has no access to the public internet by default and no access to external services beyond what the user has explicitly authorised in the environment configuration.[3][9] Inside the sandbox, the agent clones the user's GitHub repository at the chosen branch or commit, reads the file tree to build context, and then begins iterating on the task.[2][9] The agent can run shell commands, install dependencies during a configurable setup phase, edit files, and run tests, with the entire transcript exposed to the user so each step can be audited after the fact.[2]
The environment is highly configurable. Each repository can carry its own setup script, language toolchain, and list of pre-installed tools, and administrators can decide whether the container is allowed any outbound internet access at all.[3][9] By default, OpenAI also blocks the agent from talking to systems outside its sandbox during the actual task phase, which is one of the more visible safety controls; as a result, package installation typically has to happen during the setup phase rather than mid-task.[10] When the agent decides a task is complete, it commits its changes to a new branch in the user's repository and opens a pull request, with the conversation log and a summary of changes attached for review.[2]
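The two-phase network rule described above can be pictured with a small model. This is a minimal sketch, not OpenAI's implementation: the names `Phase`, `EnvironmentConfig`, and `network_allowed` are hypothetical, and it assumes the setup phase always has network access so that dependencies can be installed, which the article implies but does not state outright.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    SETUP = "setup"  # dependencies are installed here
    TASK = "task"    # the agent edits code and runs tests here


@dataclass(frozen=True)
class EnvironmentConfig:
    """Hypothetical per-repository settings; field names are illustrative only."""
    setup_commands: tuple = ()
    task_internet_access: bool = False  # off by default, as described above


def network_allowed(config: EnvironmentConfig, phase: Phase) -> bool:
    """Return True if outbound network access is permitted in the given phase."""
    if phase is Phase.SETUP:
        # Assumption: setup needs the network to fetch packages.
        return True
    # During the task phase, access depends on what the administrator enabled.
    return config.task_internet_access


default_env = EnvironmentConfig(setup_commands=("pip install -r requirements.txt",))
print(network_allowed(default_env, Phase.SETUP))  # True
print(network_allowed(default_env, Phase.TASK))   # False
```

Under this model, a dependency that is only fetched mid-task would fail in a default environment, which matches the dependency-management friction some users reported.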
Delegation is also possible from outside the web app. Users can tag @codex on GitHub issues and pull requests to spin up cloud tasks directly from the GitHub interface, and the Codex CLI's codex cloud subcommand can dispatch and triage cloud results without leaving the terminal.[9][4] The IDE extension exposes the same task queue inside VS Code, Cursor, and Windsurf, so a developer can launch a long-running cloud job and continue editing locally while it runs.[3]
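As an illustration of the GitHub route, the sketch below builds (but does not send) a request to GitHub's REST endpoint for creating an issue comment, with a body that mentions @codex. The endpoint and headers are GitHub's documented ones; the helper name `build_codex_comment_request`, the repository details, and the instruction text are hypothetical, and whether a comment posted through the API triggers a cloud task in the same way as one typed in the GitHub interface is an assumption.

```python
import json
import urllib.request


def build_codex_comment_request(owner, repo, issue_number, instruction, token):
    """Build a POST request that comments on an issue or PR and mentions @codex."""
    url = f"https://api.github.com/repos/{owner}/{repo}/issues/{issue_number}/comments"
    payload = {"body": f"@codex {instruction}"}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical repository, issue number, and token, for illustration only.
request = build_codex_comment_request(
    "example-org", "example-repo", 42,
    "investigate why the staging build fails on Node 22",
    token="ghp_placeholder",
)
print(request.get_method(), request.full_url)
# To actually post the comment: urllib.request.urlopen(request)
```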
Codex Cloud has been powered by a sequence of models, all branded under the Codex name, that OpenAI has swapped in over time as its underlying model family has advanced.
| Model | Released | Underlying base | Notes |
|---|---|---|---|
| codex-1 | May 2025 | o3 reasoning model | Launch model for Codex Cloud. Trained with reinforcement learning on real coding tasks to mimic human pull request style and to iterate on tests until they pass.[2] |
| codex-mini-latest | May 2025 | o4-mini | Lighter, faster sibling intended for the OpenAI Codex CLI and Responses API rather than for Codex Cloud's heavy delegated work. Available via API at $1.50 per million input tokens and $6.00 per million output tokens with a 75% prompt caching discount.[11] |
| GPT-5-Codex | October 2025 | GPT-5 | Introduced at DevDay 2025 as the general availability model. Sam Altman described it as "a version of GPT-5 purposely trained for Codex and agentic coding," with particular focus on code refactoring and code review.[6][7] It dynamically scales reasoning time to the complexity of the task.[7] |
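The codex-mini-latest API prices in the table lend themselves to a quick worked estimate. The sketch below uses the quoted rates and assumes the 75% prompt caching discount applies only to the cached portion of the input tokens; the function name and the token counts are illustrative, not taken from OpenAI's documentation.

```python
INPUT_USD_PER_MILLION = 1.50   # quoted input price
OUTPUT_USD_PER_MILLION = 6.00  # quoted output price
CACHE_DISCOUNT = 0.75          # quoted prompt caching discount


def estimate_cost_usd(input_tokens, output_tokens, cached_input_tokens=0):
    """Estimate the API cost in USD for one workload.

    Assumption: cached input tokens are billed at 25% of the input rate.
    """
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_input_tokens
    cached_rate = INPUT_USD_PER_MILLION * (1 - CACHE_DISCOUNT)
    total = (
        uncached * INPUT_USD_PER_MILLION
        + cached_input_tokens * cached_rate
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


# 1M input tokens (200k of them cached) plus 100k output tokens.
cost = estimate_cost_usd(1_000_000, 100_000, cached_input_tokens=200_000)
print(f"${cost:.3f}")  # $1.875
```

In this example the uncached input contributes $1.20, the cached input $0.075, and the output $0.60.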
A distinctive design choice across all three models is that the user does not pick which one will run a given task. Codex Cloud routes the request internally based on task complexity and repository size, which some developers have flagged as a frustration because it removes a lever they are used to having in chat products.[12]
Codex Cloud is bundled into qualifying ChatGPT subscriptions rather than billed as a standalone product. The initial May 2025 research preview rolled out to ChatGPT Pro, Team, and Enterprise users, with Plus access following in early June 2025.[2][13] By the time of the October 2025 general availability announcement, access had broadened across the full set of paid plans.[6]
| Plan | Codex Cloud access | Notes |
|---|---|---|
| ChatGPT Free | Not included | Codex Cloud requires a paid plan.[3] |
| ChatGPT Plus | Included | Added during the June 2025 expansion of the research preview, alongside Edu.[13] |
| ChatGPT Pro | Included | Day-one access at the May 16, 2025 launch.[2] |
| ChatGPT Team / Business | Included | Day-one access at launch; admin configuration may be required in larger workspaces.[2][3] |
| ChatGPT Enterprise | Included | Day-one access; workspace admins may need to enable the connector before users can sign in to chatgpt.com/codex.[3] |
| ChatGPT Edu | Included | Added shortly after the initial preview.[13] |
Individual usage limits inside Codex Cloud are tied to the underlying ChatGPT plan rather than to a separate Codex meter, and most surfaces share the same task quota.[3] Developers who want to build on the Codex models directly outside of the web app can do so through the API: codex-mini-latest, for example, is exposed on OpenAI's Responses API for use in custom agentic tools.[11]
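For developers taking the API route, a request to the Responses API might look roughly like the following. This is a sketch under stated assumptions: it builds an HTTP request against the `https://api.openai.com/v1/responses` endpoint with a `model` and `input` field but does not send it; the helper name and prompt are hypothetical, and the availability of codex-mini-latest may have changed since the sources were written.

```python
import json
import os
import urllib.request

RESPONSES_URL = "https://api.openai.com/v1/responses"


def build_responses_request(prompt, api_key, model="codex-mini-latest"):
    """Build a POST request asking the given model to respond to the prompt."""
    payload = {"model": model, "input": prompt}
    return urllib.request.Request(
        RESPONSES_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# The key is read from the environment; the fallback is a placeholder, not a real key.
api_key = os.environ.get("OPENAI_API_KEY", "sk-placeholder")
request = build_responses_request("Write a unit test for a date parser.", api_key)
print(request.get_method(), request.full_url)
# To actually call the API: urllib.request.urlopen(request)
```

Calls made this way are billed per token through the API rather than counted against a ChatGPT plan's task quota.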
The core use case for Codex Cloud is asynchronous, repository-scoped engineering work that a developer is comfortable handing off and reviewing later. Because the agent opens a pull request rather than mutating a working copy in place, the dominant interaction pattern resembles assigning tickets to a junior teammate.
| Use case | What the agent does | Why the cloud sandbox helps |
|---|---|---|
| Feature implementation | Reads existing files, drafts the new code, runs the test suite, opens a PR.[2] | The sandbox lets the agent run tests and iterate without touching the developer's local environment. |
| Bug investigation and fixes | Reproduces the bug, narrows down the cause, proposes a patch, attaches the reasoning trace to the PR.[2] | Long-running reproduction steps can run unattended in the cloud while the developer works on something else. |
| Codebase Q&A | Answers questions about how a repository works by reading the code and writing a short explanation.[2] | The agent has a fresh, clean copy of the repository and can grep through it without prior context. |
| Refactoring | Carries out structured changes across many files, such as renaming an API or migrating a deprecated library.[7] | GPT-5-Codex was specifically tuned for refactoring and code review tasks, with scaled reasoning time on harder cases.[7] |
| Code review | Reviews open pull requests, explores dependencies, flags suspected bugs.[7] | Reviewing a PR is naturally a server-side, full-repository task, well matched to the cloud sandbox model. |
| Issue triage via GitHub | Picks up issues tagged with @codex and proposes a fix as a PR without ever opening the chatgpt.com/codex UI.[9] | GitHub-native triggers let the agent act on work that arrives outside the ChatGPT app. |
Developers commonly use Codex Cloud in parallel with the OpenAI Codex CLI: the CLI for interactive, eyes-on work in the local terminal, and Codex Cloud for longer-running jobs that can run unattended. The IDE extension stitches the two together inside VS Code, Cursor, and Windsurf, exposing the cloud task queue from inside the editor.[3]
Codex Cloud sits in a crowded 2025 field of autonomous coding agents. The most direct comparisons are to the cloud agents from Cognition, Factory, and a handful of startups that emphasise asynchronous delegation over interactive pair programming.
| Product | Vendor | Deployment | Notable positioning |
|---|---|---|---|
| Codex Cloud | OpenAI | Cloud sandbox at chatgpt.com/codex, plus GitHub @codex mentions, IDE extension, and CLI handoff.[2][9] | Bundled into ChatGPT paid plans; powered by codex-1 then GPT-5-Codex.[6] |
| Devin | Cognition AI | Hosted agent with its own web interface and Slack integration.[14] | Billed by Agent Compute Unit at $2.25 per ACU as of 2025; positioned by reviewers as a "remote contractor" delegation model.[14] |
| Factory droids | Factory AI | Cloud agent platform aimed at enterprise software teams.[14] | Differentiated by tighter integration with engineering metadata and team workflows than a single-developer chat product. |
| OpenAI Codex CLI | OpenAI | Local terminal agent on the developer's machine.[4] | Same Codex brand, different deployment philosophy: interactive and local rather than asynchronous and remote. |
A peer-reviewed 2025 evaluation cited by industry coverage placed Codex's pull request acceptance rate at 64%, ahead of Devin at 49% and GitHub Copilot at 35% on the agent tasks it tested.[14] The comparison between Codex Cloud and Devin in particular is one of approach as much as accuracy. Devin leans further toward fire-and-forget delegation through its own sandbox, while Codex Cloud emphasises tight integration with GitHub, ChatGPT, and OpenAI's IDE and CLI surfaces.[14]
Coverage of the May 2025 research preview was broadly positive but skeptical in the details. Reviewers praised the cloud sandbox model and the codex-1 model's pull-request style: codex-1 was reported to produce cleaner patches than o3 used directly, with output that more closely matched human PR conventions.[2] Reviewers also flagged the fact that each task runs in a fresh, isolated microVM as one of the stronger safety stories among the new wave of cloud coding agents.[2]
Developer feedback in the months that followed was more mixed. On Hacker News and other developer forums, early users described the experience in terms that ranged from "pretty magical" on well-scoped tasks to "like dealing with an incredibly talented intern who still manages to screw up simple tasks occasionally" on ambiguous ones.[12] Specific frustrations clustered around four points: the agent's default lack of internet access in the sandbox, which complicated dependency management for projects with unusual setup steps; awkward iteration loops in which Codex sometimes overwrote correct code in response to ambiguous follow-up instructions; the inability of users to pick a model directly, since Codex Cloud routes internally; and worries about pricing as the research preview transitioned to a paid product.[12]
The October 6, 2025 DevDay marked the formal move from research preview to general availability. In his keynote, Sam Altman positioned the upgraded Codex as one of the most important announcements of the event, describing GPT-5-Codex as "a version of GPT-5 purposely trained for Codex and agentic coding" and emphasising its strength at code review.[6][7] By March 2026, OpenAI publicly cited Codex usage at over 2 million weekly active users, about five times higher than at the start of the year, and GitHub had added Codex to its Agent HQ surface so that Copilot subscribers could assign tasks to the agent from inside GitHub itself.[8]
The broader take from coverage in late 2025 and early 2026 was that Codex Cloud had legitimised the idea of treating coding agents as background workers rather than chat partners, even if the rough edges around dependency management, model selection, and ambiguity tolerance remained real. The bundling decision (Codex Cloud included with ChatGPT paid plans rather than billed separately) also reshaped pricing expectations across the wider autonomous coding agent category.[13]