# ChatGPT Agent

> Source: https://aiwiki.ai/wiki/chatgpt_agent
> Updated: 2026-06-25
> Categories: AI Agents, ChatGPT, OpenAI
> From AI Wiki (https://aiwiki.ai), a free encyclopedia of artificial intelligence. Quote with attribution.

**ChatGPT Agent** is an agentic feature of [ChatGPT](/wiki/chatgpt) from [Sam Altman](/wiki/sam_altman)'s [OpenAI](/wiki/openai) that gives the chatbot its own virtual computer (complete with a graphical browser, a text browser, a Linux-style terminal, code execution, and connectors to external apps) so the model can complete multi-step tasks on a user's behalf rather than only producing text answers.[^1][^2] It launched on July 17, 2025, with a livestream introduction by Sam Altman alongside the project team of Casey Chu, Isa Fulford, Yash Kumar, and Zhiqing Sun, and it unified two earlier OpenAI agent products: the browsing-and-clicking assistant [OpenAI Operator](/wiki/openai_operator) (released January 23, 2025) and the long-running research tool [Deep Research](/wiki/deep_research) (released February 2, 2025).[^1][^2][^3][^4][^5]

OpenAI positioned ChatGPT Agent as a "unified agentic system" that "fluidly shift[s] between reasoning and action to handle complex workflows from start to finish," with example tasks including "look at my calendar and brief me on upcoming client meetings based on recent news," "plan and buy ingredients to make Japanese breakfast for four," and "analyze three competitors and create a slide deck."[^1][^2] At launch the feature was made available inside ChatGPT to subscribers on the Pro ($200/month), Plus ($20/month), and Team tiers, with Enterprise and Education access following in subsequent weeks; Pro subscribers received 400 agent messages per month and Plus and Team users received 40 per month.[^2][^6] The system card treated the underlying model as "High capability" in the Biological and Chemical domain under OpenAI's Preparedness Framework, the first OpenAI release so classified, which activated an unprecedented set of safety mitigations.[^7][^8][^9]

ChatGPT Agent set new agentic benchmark records at the time of release, including a state-of-the-art pass@1 score of 41.6% on [Humanity's Last Exam](/wiki/humanitys_last_exam) (44.4% with parallel attempts), 27.4% on [FrontierMath](/wiki/frontiermath) (Tiers 1-3) with tool use, 45.5% on SpreadsheetBench, 68.9% on [BrowseComp](/wiki/browsecomp), and 65.4% on [WebArena](/wiki/webarena).[^2][^10][^11] It nonetheless drew a cautious launch message from Altman, who described the product as "cutting edge and experimental ... a chance to try the future, but not something I'd yet use for high-stakes uses or with a lot of personal information until we have a chance to study and improve it in the wild."[^12][^13] The standalone Operator preview at operator.chatgpt.com was deprecated and shut down on August 31, 2025, with its capabilities absorbed into ChatGPT Agent.[^3][^14]

## Explain it like I'm 5

Normal ChatGPT can talk to you and write things, but it cannot do things. ChatGPT Agent gives the AI its own pretend computer with a web browser, a place to run code, and keys to some of your apps, so it can go off, click around websites, fill in forms, crunch numbers, and build a slideshow or spreadsheet for you, then come back and show you the result. You watch it work and have to say "yes" before it does anything important like spending money or sending an email.

## What is ChatGPT Agent?

ChatGPT Agent is a mode inside ChatGPT in which the model is given a sandboxed virtual computer and a set of tools (a visual browser, a text browser, a terminal, and app connectors) and instructed to carry out a user's task end to end rather than only answering in text.[^1][^2][^7] OpenAI describes it as a single "agentic system" that merges the strengths of its two prior agents: it can "think and act, using its own virtual computer to take on complex tasks from start to finish."[^1] In OpenAI's framing, the agent "fluidly shift[s] between reasoning and action," choosing for itself when to browse, when to run code, and when to ask the user for confirmation.[^1][^2]

Unlike a plug-in or a one-off function call, ChatGPT Agent maintains a persistent workspace for the duration of a task: it can log into a site, download a file, process it in a Python terminal, generate a document, and return it to the user, all within one continuous session.[^7][^17][^18] Sessions have been demonstrated running for up to roughly two hours.[^17]
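The idea of an agent that alternates between deciding and acting within one session can be illustrated with a toy loop. This is only a sketch under stated assumptions, not OpenAI's implementation: the tool functions, the scripted `choose_next_step` policy, and the `SCRIPT` contents are hypothetical stand-ins for what a real model and sandbox would do.

```python
from typing import Callable, Optional

# Hypothetical tools; a real agent would drive a browser, terminal, or connector.
def text_browser(query: str) -> str:
    return f"search results for {query!r}"

def terminal(command: str) -> str:
    return f"ran {command!r}"

TOOLS: dict[str, Callable[[str], str]] = {
    "text_browser": text_browser,
    "terminal": terminal,
}

# Scripted stand-in for the model's decisions: (tool, argument) pairs.
SCRIPT = [
    ("text_browser", "competitor pricing"),
    ("terminal", "python summarize.py"),
]

def choose_next_step(history: list[str]) -> Optional[tuple[str, str]]:
    """Return the next tool call, or None when the scripted task is finished."""
    return SCRIPT[len(history)] if len(history) < len(SCRIPT) else None

def run_agent(max_steps: int = 10) -> list[str]:
    """Run tools in a loop, keeping every observation in one shared history."""
    history: list[str] = []
    for _ in range(max_steps):
        step = choose_next_step(history)
        if step is None:  # the stand-in "model" decides the task is done
            break
        tool_name, argument = step
        history.append(TOOLS[tool_name](argument))
    return history

transcript = run_agent()
```

The shared `history` list plays the role of the persistent workspace: each later step can see what earlier steps produced.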

## When was ChatGPT Agent released?

ChatGPT Agent was announced and began rolling out on July 17, 2025, introduced in an OpenAI livestream led by Sam Altman with team members Casey Chu, Isa Fulford, Yash Kumar, and Zhiqing Sun.[^1][^2] Pro subscribers received access by the end of launch day, with Plus and Team following over the next several days and Enterprise and Education "in the coming weeks."[^1][^2] The release consolidated a lineage of 2025 agent products: Operator (January 23, 2025) and Deep Research (February 2, 2025), with the standalone Operator preview retired on August 31, 2025.[^3][^4][^5][^14]

## How did ChatGPT Agent evolve from Operator and Deep Research?

OpenAI's path to ChatGPT Agent passed through two distinct agent products released earlier in 2025.

### Operator (January 2025)

[Operator](/wiki/openai_operator) debuted on January 23, 2025, as an autonomous web-browsing agent built on a new "Computer-Using Agent" (CUA) model that combined the vision capabilities of [GPT-4o](/wiki/gpt_4o) with reasoning from OpenAI's [o-series models](/wiki/o3).[^4][^15] Operator opened a remote Chromium-based browser inside ChatGPT and could click, type, and fill forms in much the same way a human would, "without need[ing] to use developer-facing APIs."[^4][^15] At launch it was restricted to U.S. customers on the $200/month Pro tier and was framed as a "research preview"; OpenAI announced partnerships with DoorDash, eBay, Instacart, Priceline, StubHub, and Uber to ensure that Operator respected those services' terms of use.[^4][^15] Independent assessments later found that Operator reached roughly 38.1% accuracy on OS-level tasks and 58.1% on web tasks, "[not reaching] human-level accuracy" and struggling with multi-step workflows.[^14]

### Deep Research (February 2025)

[Deep Research](/wiki/deep_research) launched on February 2, 2025, as a separate, slower agent that conducted 5-30 minute autonomous browsing sessions to produce long-form, cited research reports.[^5][^16] Built on a variant of [OpenAI o3](/wiki/o3), Deep Research was initially capped at 100 queries per month for Pro users and scored 26.6% on Humanity's Last Exam at release, well ahead of GPT-4o (3.3%) and DeepSeek R1 (9.4%).[^5][^16] OpenAI expanded Deep Research to Plus, Team, Enterprise, and Education users on February 25, 2025, and on April 24, 2025, raised query allocations to 250/month for Pro, 25/month for Plus, Team, Enterprise, and Edu, and 5/month for Free.[^5][^16]

ChatGPT Agent's system card characterizes the new model as combining "Deep research's ability to conduct multi-step research and generate high-quality reports" with "Operator's capacity to execute tasks through a remote visual browser environment," supplemented with a new terminal and connector layer.[^7] Yash Kumar, the product lead, told The Verge that the move from Operator to Agent meant ChatGPT now had access to "an entire computer" rather than just a browser.[^11]

## How does ChatGPT Agent work?

### The virtual computer

ChatGPT Agent runs inside an OpenAI-controlled sandbox (a virtual machine with persistent state across the duration of a task) rather than on the user's device.[^2][^17] The sandbox exposes four primary tools to the model:[^2][^7]

1. A **visual (graphical) browser** that captures screenshots of pages and lets the model click, scroll, type, and drag, modeled on Operator's CUA;
2. A **text-based browser** optimized for efficient information retrieval, modeled on Deep Research;
3. A **terminal** with Python and shell access, used for data analysis, file manipulation, and generating files such as `.xlsx` and `.pptx` documents;
4. **Direct API access** for selected applications and connectors.

State is preserved across tool switches, so the agent can, for example, log into a website in the visual browser, download a CSV, run it through pandas in the terminal, and then upload the processed file back through the visual browser.[^7][^17][^18] Sessions have been demonstrated lasting up to about two hours, with the agent reasoning, browsing, and acting in a continuous loop.[^17]
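The terminal step in that example might look roughly like the following. This is a minimal sketch, not taken from OpenAI's materials: the column names and data are invented, and it uses Python's standard-library `csv` module in place of pandas so that it runs without extra dependencies.

```python
import csv
import io
from collections import defaultdict

# Stand-in for a CSV the agent downloaded in the visual browser (hypothetical data).
downloaded = io.StringIO(
    "region,amount\n"
    "north,120.50\n"
    "south,80.00\n"
    "north,30.25\n"
)

# Aggregate amounts per region.
totals: dict[str, float] = defaultdict(float)
for row in csv.DictReader(downloaded):
    totals[row["region"]] += float(row["amount"])

# Write the processed file that would be uploaded back through the browser.
processed = io.StringIO()
writer = csv.writer(processed, lineterminator="\n")
writer.writerow(["region", "total"])
for region in sorted(totals):
    writer.writerow([region, f"{totals[region]:.2f}"])

result = processed.getvalue()
```

In a real session the input and output would be files in the sandbox's file system rather than in-memory buffers.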

### Outputs and file system

Within the sandbox the agent has a working file system. It can read user-uploaded files (PDFs, spreadsheets, images) and produce deliverables such as editable Excel spreadsheets, PowerPoint and Keynote-compatible decks, and PDF research reports; spreadsheets and decks are written by emitting Python code that constructs the underlying `.xlsx` and `.pptx` files.[^2][^18][^19] Mobile users receive push notifications when long-running tasks complete.[^19]

### Connectors

ChatGPT Agent can call into OpenAI's "connectors" framework, which authenticates the agent to read and (in some cases) write to external services through OAuth or scoped tokens.[^2][^20] Connectors documented as supported at or shortly after launch include Gmail, Google Calendar, Google Drive, GitHub, SharePoint, Outlook, Box, Dropbox, Microsoft Teams, Linear, HubSpot, and (added subsequently) Notion and Slack.[^20][^21] Connectors also leverage the [Model Context Protocol](/wiki/model_context_protocol) (MCP), enabling organizations to expose custom tools to ChatGPT Agent through MCP servers.[^20][^21]
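For readers unfamiliar with MCP, the protocol carries tool invocations as JSON-RPC 2.0 messages using a `tools/call` method. The sketch below builds one such request. The tool name `lookup_account` and its arguments are hypothetical, and how ChatGPT itself frames or transports these messages is not described in the sources cited here.

```python
import json

def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 request that uses MCP's `tools/call` method."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)

# `lookup_account` and `account_id` are illustrative names, not a real connector.
payload = build_tool_call(1, "lookup_account", {"account_id": "A-1001"})
decoded = json.loads(payload)
```

An organization's MCP server would receive a message of this shape, run the named tool, and reply with a JSON-RPC result carrying the same `id`.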

### Scheduled tasks

ChatGPT Agent runs can be scheduled to recur via the "Tasks" feature, allowing daily, weekly, or monthly executions; users can manage these at chatgpt.com/schedules. ChatGPT enforces a cap of 10 active tasks per user.[^21][^22]

## What can ChatGPT Agent do?

The launch announcement and OpenAI's developer cookbook framed ChatGPT Agent as best suited to multi-step "knowledge work" that combines browsing, research, computation, and document production. Examples cited in the launch material and contemporaneous press coverage include:[^1][^2][^11][^19]

- Briefing a user on upcoming client meetings by cross-referencing the user's calendar (via the Google Calendar connector) with recent news.
- Building competitive analyses: researching multiple companies, synthesizing findings into a long-form report, and producing an editable slide deck.
- Generating Japanese-breakfast meal plans, sourcing ingredients via web shopping, and (subject to user confirmation) placing orders.
- Running data-analysis tasks on user-uploaded spreadsheets in the terminal and returning an updated workbook.
- Scheduling appointments, booking travel, and filling out web forms under user supervision.
- For developers and analysts: querying GitHub repositories and Linear issues via connectors, then composing summary documents in the chat.

In its developer documentation OpenAI also published a "workspace agents" cookbook that demonstrated ChatGPT Agent automating sales-meeting prep end-to-end, using connectors to surface CRM context, email threads, and calendar conflicts inside a single agentic workflow.[^20]

## How much does ChatGPT Agent cost, and who can use it?

At its July 17, 2025 launch, ChatGPT Agent was activated by selecting "agent mode" from the tools dropdown in the composer (or by typing the `/agent` shortcut). Rollout proceeded as follows:[^1][^2][^11]

- **Pro ($200/month)**: full access by end of July 17, 2025; quota of **400 agent messages per month**.[^1][^2]
- **Plus ($20/month) and Team**: rolling out over the days following launch; quota of **40 agent messages per month**.[^1][^2]
- **Enterprise and Education**: "coming weeks."[^1][^2]

Additional usage above the monthly quota was made available "for purchase" via credits.[^1][^2]
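To make the launch quotas concrete, the following sketch works out remaining messages and the average daily budget each plan implies. The function names are illustrative, the 30-day month is an assumption, and purchased credits are not modeled.

```python
# Monthly agent-message quotas at launch, as listed above.
MONTHLY_QUOTA = {"pro": 400, "plus": 40, "team": 40}

def remaining_messages(plan: str, used: int) -> int:
    """Messages left this month; never negative (purchased credits not modeled)."""
    return max(MONTHLY_QUOTA[plan] - used, 0)

def average_per_day(plan: str, days_in_month: int = 30) -> float:
    """Average daily budget if the quota were spread evenly over the month."""
    return MONTHLY_QUOTA[plan] / days_in_month

plus_left = remaining_messages("plus", 25)
plus_daily = round(average_per_day("plus"), 2)
pro_daily = round(average_per_day("pro"), 2)
```

On these assumptions a Plus or Team user averages a little over one agent message per day, against roughly thirteen for Pro.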

### Geographic restrictions

ChatGPT Agent was initially unavailable in the European Economic Area (EEA) and Switzerland, mirroring constraints on prior agent products under EU regulatory uncertainty.[^23] OpenAI announced on July 23, 2025 that Pro users in the EEA and Switzerland had been brought into the rollout, with the Plus tier following globally over the next few days; a footnote in OpenAI's Connectors documentation continued to list certain deep-research-style connector features as restricted in the EEA, Switzerland, and the UK.[^23][^24]

## What model powers ChatGPT Agent?

The system card identifies ChatGPT Agent as "a new agentic model in the same family as OpenAI o3" that has been fine-tuned with end-to-end reinforcement learning specifically for browser, terminal, and connector tool use.[^7] Press coverage on the day of launch reported that the model had been referred to internally as a successor to o3 (sometimes loosely called "o4" in pre-release rumors) before being released as a dedicated Agent model rather than a separately branded reasoning model.[^17] OpenAI did not publish an independent name for the model; it is exposed only as the underlying engine of the "agent" mode inside ChatGPT.[^7][^11] Its capabilities on standard reasoning benchmarks were broadly similar to [o3](/wiki/o3) with browsing, with the gains concentrated on agentic tasks that require multi-step tool use.[^7][^17] OpenAI's later ChatGPT release notes indicate that agent mode has continued to inherit newer default models, including the GPT-5.x family, in subsequent versions.[^21]

## How safe is ChatGPT Agent?

The launch of ChatGPT Agent was accompanied by a 41-page system card describing what OpenAI called the most extensive set of pre-deployment mitigations to date, motivated by both the broader user reach of ChatGPT itself and the new attack surface created by the agent's tools.[^7][^8]

### Why was ChatGPT Agent classified "High capability" in biology and chemistry?

OpenAI's [Preparedness Framework](/wiki/openai), updated to version 2 on April 15, 2025, classifies frontier capabilities and prescribes corresponding safeguards. ChatGPT Agent became the first product OpenAI explicitly designated as "High capability in the Biological and Chemical domain," activating the framework's full High-tier mitigation stack.[^7][^8][^9] In the system card, OpenAI wrote: "While we do not have definitive evidence that this model could meaningfully help a novice to create severe biological harm - our defined threshold for High capability - we have chosen to take a precautionary approach."[^7] OpenAI Technical Staff member Boaz Barak said in interviews that "some might think that biorisk is not real ... that may have been true in 2024 but is definitely not true today" and that "it would have been deeply irresponsible to release this model without comprehensive mitigations."[^9]

Specific findings cited as motivation included an external SecureBio assessment in which ChatGPT Agent answered 4 of 10 "World-Class Biology" questions correctly compared with an average of 1.5 of 10 for [o3](/wiki/o3), and a "Pathogen Acquisition" exercise in which the agent bypassed a common failure mode of prior models. The UK AI Safety Institute (AISI) was given pre-launch access and identified seven "universal" attacks, all of which OpenAI patched before the public release.[^7][^25]

OpenAI's High-tier safeguard stack for ChatGPT Agent includes:[^7]

- **Model training** to refuse weaponization assistance and to avoid actionable dual-use detail.
- A **two-tier always-on monitoring system**: a fast topical classifier that flags biology-related content (>= 0.95 recall) and a slower reasoning monitor that classifies generations against a biothreat taxonomy.
- **Account-level enforcement** that bans users attempting to elicit such content, with referral to law enforcement in extreme cases.
- A **Rapid Remediation Protocol** that patches universal jailbreaks discovered in the wild.
- A **bug bounty** for biology-related universal jailbreaks.
- A **trusted-access program** that allows vetted partners in biodefense and life sciences to opt into a less restricted variant while still blocking weaponization content.[^7]
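The two-tier monitoring design in the list above follows a common cascade pattern: a cheap, high-recall filter runs on everything, and only flagged content reaches a slower, more careful check. The sketch below illustrates the pattern only. The keyword set and the rule inside `slow_reasoning_monitor` are placeholders invented for illustration, and OpenAI's actual classifiers are models whose internals are not public.

```python
# Hypothetical topical vocabulary standing in for a trained classifier.
BIO_KEYWORDS = {"pathogen", "toxin", "virus"}

def fast_topical_flag(text: str) -> bool:
    """Tier 1 stand-in: cheap check for whether text is biology-related at all."""
    words = {w.strip(".,?!").lower() for w in text.split()}
    return bool(words & BIO_KEYWORDS)

def slow_reasoning_monitor(text: str) -> str:
    """Tier 2 stand-in: a placeholder rule in place of a reasoning model."""
    return "block" if "synthesize" in text.lower() else "allow"

def monitor(text: str) -> str:
    """Run tier 2 only on content that tier 1 flags."""
    if not fast_topical_flag(text):
        return "allow"
    return slow_reasoning_monitor(text)

def recall(true_positives: int, false_negatives: int) -> float:
    """Share of truly relevant items that the classifier flagged."""
    return true_positives / (true_positives + false_negatives)

example_recall = recall(true_positives=95, false_negatives=5)
```

A recall of 0.95 for the first tier means that, of 100 genuinely biology-related inputs, at least 95 are passed on to the second tier; false positives at tier 1 cost only extra tier 2 work.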

### How does ChatGPT Agent guard against prompt injection and mistakes?

The system card identifies three product-specific risk categories (prompt injection, the agent making a mistake, and users requesting disallowed tasks) and a corresponding mitigation stack:[^7]

- **Safety training** specifically targeted at resisting prompt injections.
- **Automated monitors and filters** updated in real time as new attacks are observed.
- **User confirmations** required before "actions that affect the state of the world" such as purchases or sending email. OpenAI reports a "confirmation recall" of 91.0%, with 100% recall on critical actions including completing financial transactions and editing cloud-storage permissions.[^7]
- **"Watch Mode"** is triggered when the agent uses the visual browser in a sensitive context (for example while logged into email or a banking site). Once enabled, Watch Mode "automatically paus[es] execution when the user becomes inactive or navigates away from the conversation in ChatGPT," requiring the user to actively observe the rest of the trajectory.[^7][^26]
- **Takeover mode** lets users grab the virtual browser to enter credentials themselves; while the user is in control, OpenAI does not capture screenshots, so passwords are never seen by the model.[^26][^27]
- **Terminal network restrictions** at launch limited outbound requests to GET methods against a whitelist of common datasets such as official government datasets.[^7]
- **ChatGPT Memory disabled** while Agent runs, to mitigate prompt-injection-driven exfiltration of stored personal context.[^7]
- A trained **refusal layer** for high-risk tasks such as bank transfers; the system card reports a 97.0% refusal rate for "Disallowed Financial Activities" such as gambling.[^7]
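Two of the mitigations above, the GET-only terminal allowlist and confirmation before state-changing actions, amount to simple policy checks. The following is a minimal sketch of such checks, assuming a hostname-based allowlist. The hostnames and action labels are hypothetical, and the system card does not describe how OpenAI implements either rule.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; the real list of permitted dataset hosts is not public.
ALLOWED_HOSTS = {"data.example.org", "stats.example.com"}

def is_request_allowed(method: str, url: str) -> bool:
    """Permit only GET requests whose host is on the allowlist."""
    host = urlparse(url).hostname or ""
    return method.upper() == "GET" and host in ALLOWED_HOSTS

# Hypothetical labels for actions that change the state of the world.
STATE_CHANGING = {"purchase", "send_email"}

def needs_confirmation(action: str) -> bool:
    """State-changing actions pause until the user approves them."""
    return action in STATE_CHANGING

ok = is_request_allowed("GET", "https://data.example.org/census.csv")
```

Restricting the terminal to GET narrows the paths by which injected instructions could send data out, although it does not remove them entirely, since a GET URL can itself carry data.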

Prompt-injection resistance evaluations in the system card show ChatGPT Agent improving on the GPT-4o-based Operator on all three test sets, though not always on the o3-based Operator: 95% on "Irrelevant instructions - visual browser" (versus 82% for Operator on GPT-4o and 89% for Operator on o3), 78% on "In-context data exfiltration - visual browser" (versus 75% and 80%, respectively), and 67% on "Active data exfiltration - visual browser" (versus 58% and 75%).[^7]

Sam Altman's launch-day warning to ChatGPT users explicitly tied these mitigations to remaining uncertainty: "bad actors may try to 'trick' users' AI agents into giving private information they shouldn't and take actions they shouldn't, in ways we can't predict," he wrote, recommending that users give agents "the minimum access required to complete a task."[^12][^13]

## How does ChatGPT Agent perform on benchmarks?

ChatGPT Agent's launch announcement and system card reported state-of-the-art or competitive results on several agentic and reasoning benchmarks. All figures are pass@1 unless otherwise noted; "with tools" indicates the agent had access to its browser and terminal.

| Benchmark | Score | Notes |
|---|---|---|
| [Humanity's Last Exam](/wiki/humanitys_last_exam) | **41.6%** pass@1 (44.4% with parallel attempts) | State of the art at launch; "approximately double" prior best from o3 and o4-mini.[^2][^10][^11] |
| [FrontierMath](/wiki/frontiermath) (Tiers 1-3) | **27.4%** | With tool use, graded by Epoch AI. o4-mini baseline reported at 6.3%.[^2][^11][^28] |
| SpreadsheetBench | **45.5%** (direct editing) | Vs. 20.0% for Copilot in Excel.[^2][^10] |
| DSBench (data analysis) | **89.9%** | Above the average human level reported.[^17][^29] |
| DSBench (data modeling) | **85.5%** | Above the average human level reported.[^17][^29] |
| [BrowseComp](/wiki/browsecomp) | **68.9%** | Substantially above the 51.5% reported for Deep Research at its launch.[^17][^29] |
| [WebArena](/wiki/webarena) | **65.4%** | Marginal gain over o3-based agent baseline (~62.9%).[^17][^29] |
| Investment Banking Modeling Test | **71.3%** mean accuracy | Reported in OpenAI's launch blog; ahead of Deep Research (~55.9%) and o3 (~48.6%).[^17][^29] |
| SWE-bench Verified | comparable to o3 | Software-engineering benchmark; system card notes ChatGPT Agent performed similarly to o3, not better.[^7] |
| PaperBench | comparable to o3 | Replicating ICML 2024 papers; "ChatGPT agent scores are similar to o3's scores, with and without browsing."[^7] |

The system card cautions that some benchmarks are susceptible to "browsing contamination," where the model can retrieve evaluation answers from the open web rather than reason through them. OpenAI reported non-browsing variants where contamination was a concern.[^7]
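The size of the gaps in the table is easier to compare as percentage-point differences. The short sketch below computes them from the figures given in the table's Score and Notes columns; it adds no new data, and the approximate baselines (marked "~" in the table) are used at face value.

```python
# (benchmark, ChatGPT Agent score, comparison system, comparison score), in percent
RESULTS = [
    ("SpreadsheetBench", 45.5, "Copilot in Excel", 20.0),
    ("BrowseComp", 68.9, "Deep Research", 51.5),
    ("WebArena", 65.4, "o3-based agent", 62.9),
    ("FrontierMath (Tiers 1-3)", 27.4, "o4-mini", 6.3),
]

def point_gap(agent: float, baseline: float) -> float:
    """Difference in percentage points, rounded to one decimal place."""
    return round(agent - baseline, 1)

gaps = {name: point_gap(agent, base) for name, agent, _, base in RESULTS}

for name, agent, rival, base in RESULTS:
    print(f"{name}: {agent}% vs {base}% ({rival}), +{gaps[name]} points")
```

The WebArena gap of 2.5 points is small next to the others, consistent with the table's description of it as a marginal gain.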

## How was ChatGPT Agent received?

### Press coverage

Tech press broadly framed ChatGPT Agent as OpenAI's largest bet to date on agentic AI. TechCrunch described it as "a general-purpose agent in ChatGPT," reporting that "Pro, Plus, and Team subscribers" would gain access on rollout day.[^6] The Verge's Hayden Field led with the model rather than the feature, writing that ChatGPT Agent "can control an entire computer and perform multi-step tasks, powered by a new dedicated model."[^11] VentureBeat highlighted the leap from text answers to actions, framing the product as ChatGPT being given "its own computer to autonomously use your email and web apps, download and create files for you."[^19] Analytics India Magazine emphasized that the rollout fused "Operator's action-taking browser and Deep Research's web synthesis" into a single experience.[^29]

### Practitioner reviews and criticism

Reviewers who tested the product at launch consistently noted that the agent was slow and somewhat unreliable. Nielsen Norman Group titled its first impressions piece "Successful, Shaky, and Slow," reporting that "tasks that take humans five minutes often take longer with the agent" and that small UI elements or dynamic interfaces sometimes confused it.[^30] Trade-press reviews described the agent as "useful for experimentation and occasional tasks but insufficient for critical business processes," and reported that the agent "is unable to bypass Cloudflare/CAPTCHA verifications" and could abandon tasks behind login walls.[^31] WIRED tested the agent's ability to recite WIRED's own product recommendations and reported that "the AI confidently cited specific TVs, headphones, and laptops as WIRED's top picks, none of which the publication's reviewers had actually endorsed."[^32]

Independent observer Simon Willison, who coined the term "prompt injection," framed ChatGPT Agent as a watershed moment for definitional clarity: "An LLM agent runs tools in a loop to achieve a goal." He noted, however, that browser-using agents combine the "lethal trifecta" of access to private data, exposure to untrusted content, and the ability to communicate externally, and warned that those three properties together make prompt-injection mitigations especially difficult.[^33][^34]

Adoption metrics circulated shortly after launch suggested that the 40-message Plus quota was a binding constraint: one analysis reported that "73% of ChatGPT Plus users burned through their monthly allocation of 40 agent runs within the first week of having access."[^31]

## What are the limitations of ChatGPT Agent?

Independent testing converged on several limitations of ChatGPT Agent at launch:

- **Latency and overhead.** End-to-end runs often took minutes to hours; Nielsen Norman Group reported that "a booking form took 30 seconds for a human but the agent flailed for eight minutes."[^30]
- **CAPTCHA and anti-bot defenses.** Sites with Cloudflare or similar defenses regularly blocked the agent. The agent was reported as more likely to abandon a task than to ask the user for help in such cases.[^31][^32]
- **Hallucination of cited content.** WIRED documented invented "phantom" recommendations in the agent's research outputs, with the agent providing "no citations, no confidence scores, no indication that users should verify its claims" in some interactions.[^32]
- **Quota exhaustion.** With 40 messages per month on Plus and Team, sustained use was difficult; supplemental credits were available for purchase but were not the default experience.[^2][^31]
- **Bio-risk safeguards.** While OpenAI framed the High-capability classification as a precaution, biosecurity commentary noted that ChatGPT Agent's launch was a "first" in the industry; Transformer described the move as OpenAI "hit[ting] the biorisk alarm" and contrasted it with xAI's contemporaneous launch of Grok 4 "without safety documentation or testing disclosures."[^25]
- **Prompt-injection residual risk.** OpenAI itself emphasized in the system card and in subsequent commentary that prompt injection in browser agents is unlikely to be fully eliminated, and Sam Altman publicly cautioned users against entrusting ChatGPT Agent with sensitive personal information at launch.[^7][^12][^13]

## How has ChatGPT Agent evolved since launch?

ChatGPT Agent has continued to evolve through 2025 and 2026:

- **Operator deprecation (August 31, 2025).** The standalone Operator preview at operator.chatgpt.com was shut down, with its visual-browser capabilities folded into ChatGPT Agent.[^3][^14]
- **Enterprise and Education rollout.** Enterprise and Education customers gained access in the weeks following the initial Pro/Plus/Team release.[^21]
- **[AgentKit](/wiki/agentkit) (October 6, 2025).** At DevDay 2025, OpenAI introduced AgentKit, a developer-facing toolkit for building agents from scratch, including a visual "Agent Builder," ChatKit for embedded chat UIs, and an evaluations module, positioning ChatGPT Agent as the consumer-facing counterpart to a broader agent platform.[^35][^36]
- **[Codex](/wiki/openai_codex) and [GPT-5 Codex](/wiki/gpt_5_codex) (October 2025).** OpenAI's coding agent Codex, rebuilt on a GPT-5-family model specialized for agentic coding, became OpenAI's developer-facing analog to ChatGPT Agent for software engineering; OpenAI noted that Codex built "80% of the Agent Builder tool in under 6 weeks."[^36]
- **[ChatGPT Atlas](/wiki/chatgpt_atlas).** OpenAI's ChatGPT-integrated web browser later inherited ChatGPT Agent's agentic mode, exposing it through an "Ask ChatGPT" sidebar and an agentic action mode; OpenAI subsequently published its ongoing work on hardening Atlas against prompt injection, with leadership stating publicly that prompt injection may never be fully "solved" in browser agents.[^37][^38]
- **Underlying model upgrades.** OpenAI's ChatGPT release notes have since rolled out newer default models (including the GPT-5.x family), with ChatGPT Agent inheriting underlying model improvements in subsequent versions.[^21]

ChatGPT Agent's launch also reframed the competitive landscape for generalist agents: it was widely compared with [Anthropic Computer Use](/wiki/anthropic_computer_use) (released October 22, 2024), [Devin](/wiki/devin), [Manus AI](/wiki/manus_ai), and Google's then-incoming Gemini agentic features, and was the first major release to combine a graphical computer-use agent with a deep-research-style autonomous browsing agent and a Python terminal under a single user-facing mode.[^11][^17]

## See also

- [ChatGPT](/wiki/chatgpt)
- [OpenAI Operator](/wiki/openai_operator)
- [Deep Research](/wiki/deep_research)
- [OpenAI AgentKit](/wiki/agentkit)
- [Codex (OpenAI)](/wiki/openai_codex)
- [ChatGPT Atlas](/wiki/chatgpt_atlas)
- [Anthropic Computer Use](/wiki/anthropic_computer_use)
- [Computer use](/wiki/computer_use)
- [Model Context Protocol](/wiki/model_context_protocol)
- [Agentic AI](/wiki/agentic_ai)
- [AI agents](/wiki/ai_agents)
- [Humanity's Last Exam](/wiki/humanitys_last_exam)
- [BrowseComp](/wiki/browsecomp)
- [WebArena](/wiki/webarena)
- [FrontierMath](/wiki/frontiermath)
- [GAIA benchmark](/wiki/gaia_benchmark)
- [Sam Altman](/wiki/sam_altman)

## References

[^1]: OpenAI. "Introducing ChatGPT agent: bridging research and action." July 17, 2025. https://openai.com/index/introducing-chatgpt-agent/
[^2]: OpenAI livestream and launch announcement, "Introduction to ChatGPT agent," featuring Sam Altman, Casey Chu, Isa Fulford, Yash Kumar, and Zhiqing Sun, July 17, 2025. https://x.com/AILeaksAndNews/status/1945877310873407695
[^3]: OpenAI. "ChatGPT agent: release notes." OpenAI Help Center. https://help.openai.com/en/articles/11794368-chatgpt-agent-release-notes
[^4]: OpenAI. "Introducing Operator." January 23, 2025. https://openai.com/index/introducing-operator/
[^5]: OpenAI. "Introducing deep research." February 2, 2025. https://openai.com/index/introducing-deep-research/
[^6]: Maxwell Zeff. "OpenAI launches a general purpose agent in ChatGPT." TechCrunch, July 17, 2025. https://techcrunch.com/2025/07/17/openai-launches-a-general-purpose-agent-in-chatgpt/
[^7]: OpenAI. "ChatGPT Agent System Card." July 17, 2025. https://cdn.openai.com/pdf/839e66fc-602c-48bf-81d3-b21eacc3459d/chatgpt_agent_system_card.pdf
[^8]: OpenAI. "ChatGPT agent System Card." Index page. https://openai.com/index/chatgpt-agent-system-card/
[^9]: Bea Nolan. "OpenAI warns that its new ChatGPT Agent has the ability to aid dangerous bioweapon development." Fortune, July 18, 2025. https://fortune.com/2025/07/18/openai-chatgpt-agent-could-aid-dangerous-bioweapon-development/
[^10]: OpenAI. ChatGPT Agent benchmark figures as reported on the launch blog (Humanity's Last Exam pass@1 41.6%; SpreadsheetBench 45.5% vs. 20.0% Copilot in Excel; FrontierMath 27.4% with tool use). https://openai.com/index/introducing-chatgpt-agent/
[^11]: Hayden Field. "OpenAI debuts ChatGPT Agent, which can control an entire computer and perform multi-step tasks, powered by a new dedicated model." The Verge, July 17, 2025, summarized via Techmeme. https://www.techmeme.com/250717/p38
[^12]: Sam Altman, X (Twitter) post on ChatGPT Agent launch, July 17, 2025, as quoted in The Decoder: "OpenAI CEO Sam Altman warns users not to trust ChatGPT agent with sensitive or personal data." https://the-decoder.com/openai-ceo-sam-altman-warns-users-not-to-trust-chatgpt-agent-with-sensitive-or-personal-data/
[^13]: "OpenAI has introduced ChatGPT Agent; Sam Altman warns of risks." Born's Tech and Windows World, July 24, 2025. https://borncity.com/win/2025/07/24/openai-has-introduced-chatgpt-agent-sam-altman-warns-of-risks/
[^14]: Wikipedia. "OpenAI Operator." https://en.wikipedia.org/wiki/OpenAI_Operator
[^15]: Kyle Wiggers. "OpenAI launches Operator, an AI agent that performs tasks autonomously." TechCrunch, January 23, 2025. https://techcrunch.com/2025/01/23/openai-launches-operator-an-ai-agent-that-performs-tasks-autonomously/
[^16]: Wikipedia. "ChatGPT Deep Research." https://en.wikipedia.org/wiki/ChatGPT_Deep_Research
[^17]: AINews. "ChatGPT Agent: new o* model + unified Deep Research browser + Operator computer use + Code Interpreter terminal." July 17, 2025. https://news.smol.ai/frozen-issues/25-07-17-chatgpt-agent.html
[^18]: InfoQ. "OpenAI Announces Generalist ChatGPT Agent to Take on Excel, PowerPoint, and Chrome." July 2025. https://www.infoq.com/news/2025/07/openai-chatgpt-agents/
[^19]: VentureBeat. "OpenAI unveils 'ChatGPT agent' that gives ChatGPT its own computer to autonomously use your email and web apps, download and create files for you." July 17, 2025. https://venturebeat.com/ai/openai-unveils-chatgpt-agent-that-gives-chatgpt-its-own-computer-to-autonomously-use-your-email-and-web-apps-download-and-create-files-for-you
[^20]: OpenAI. "ChatGPT Business: Release Notes." https://help.openai.com/en/articles/11391654-chatgpt-business-release-notes
[^21]: OpenAI. "ChatGPT: Release Notes." https://help.openai.com/en/articles/6825453-chatgpt-release-notes
[^22]: OpenAI. "Tasks in ChatGPT." https://help.openai.com/en/articles/10291617-scheduled-tasks-in-chatgpt
[^23]: DataStudios. "ChatGPT Agent appears in Europe despite no formal announcement." https://www.datastudios.org/post/chatgpt-agent-appears-in-europe-despite-no-formal-announcement
[^24]: OpenAI, official X post: "Pro users in the European Economic Area and Switzerland, thanks for your patience. ChatGPT agent is now fully rolled out to you. And the rollout to Plus users globally has started and will continue over the next few days." July 23, 2025. https://x.com/OpenAI/status/1947882931294507263
[^25]: Transformer News. "OpenAI hits the biorisk alarm with ChatGPT Agent." July 2025. https://www.transformernews.ai/p/openai-hits-biorisk-alarm-chatgpt-agent
[^26]: OpenAI Deployment Safety Hub. "ChatGPT Agent System Card: Watch Mode." https://deploymentsafety.openai.com/chatgpt-agent/watch-mode
[^27]: OpenAI Help Center. "ChatGPT agent." https://help.openai.com/en/articles/11752874-chatgpt-agent
[^28]: Epoch AI Research, X post. "We have graded the results of @OpenAI's evaluation on FrontierMath Tier 1-3 questions, and found a 27% (+/- 3%) performance." July 17, 2025. https://x.com/EpochAIResearch/status/1945905793666023703
[^29]: Analytics India Magazine. "OpenAI Rolls Out ChatGPT Agent Combining Deep Research and Operator." July 2025. https://analyticsindiamag.com/ai-news-updates/openai-rolls-out-chatgpt-agent-combining-deep-research-and-operator/
[^30]: Nielsen Norman Group. "Initial Impressions of ChatGPT's Agent: Successful, Shaky, and Slow." 2025. https://www.nngroup.com/articles/impressions-chatgpt-agent/
[^31]: NineTwoThree. "ChatGPT Agent Mode: Use Cases, Limits, and Business Value." 2025. https://www.ninetwothree.co/blog/chatgpt-agent-mode
[^32]: WIRED. "I Asked ChatGPT What WIRED's Reviewers Recommend. Its Answers Were All Wrong." Summarized via Spencer Greenhalgh, 2026. https://spencergreenhalgh.com/communities/2026-04-01-interesting-article/
[^33]: Simon Willison. "I think 'agent' may finally have a widely enough agreed upon definition to be useful jargon now." Simon Willison's Newsletter. https://simonw.substack.com/p/i-think-agent-may-finally-have-a
[^34]: Simon Willison. "The lethal trifecta for AI agents." Simon Willison's Newsletter. https://simonw.substack.com/p/the-lethal-trifecta-for-ai-agents
[^35]: OpenAI. "Introducing AgentKit." October 6, 2025. https://openai.com/index/introducing-agentkit/
[^36]: The Neuron. "Everything OpenAI Released on DevDay 2025, Explained." October 2025. https://www.theneuron.ai/explainer-articles/everything-openai-released-on-devday-2025-explained/
[^37]: TechCrunch. "OpenAI says AI browsers may always be vulnerable to prompt injection attacks." December 22, 2025. https://techcrunch.com/2025/12/22/openai-says-ai-browsers-may-always-be-vulnerable-to-prompt-injection-attacks/
[^38]: OpenAI. "Continuously hardening ChatGPT Atlas against prompt injection attacks." https://openai.com/index/hardening-atlas-against-prompt-injection/

