GPT-5.4
Last reviewed
May 7, 2026
Sources
27 citations
Review status
Source-backed
Revision
v3 · 4,741 words
GPT-5.4 is a large language model developed by OpenAI and released on March 5, 2026. It is the fourth point release in the GPT-5 series, following GPT-5.3 (the Codex line and GPT-5.3 Instant) and preceding GPT-5.5. The release marked the first time OpenAI shipped a single mainline model with native computer use, a 1 million token context window, and the coding capability that had previously been confined to its specialist Codex variants.
GPT-5.4 launched in two reasoning configurations at debut, GPT-5.4 Thinking and GPT-5.4 Pro, with two smaller siblings (GPT-5.4 mini and GPT-5.4 nano) following on March 17, 2026. A defensive cybersecurity variant, GPT-5.4-Cyber, was added in late April 2026 through OpenAI's Trusted Access for Cyber program. The model headlined a unified release across ChatGPT, the OpenAI API, and Codex surfaces, and it became the default "Thinking" tier in ChatGPT for paid plans on launch day.
OpenAI positioned GPT-5.4 as the first frontier model that combines coding, knowledge work, computer use, and long context in one general-purpose system. On OSWorld-Verified, a computer-use evaluation that measures how well models can drive desktop applications, GPT-5.4 scored 75.0%, exceeding the 72.4% average human baseline reported by OpenAI. On the GDPval professional knowledge work benchmark spanning 44 occupations, the model reached 83%. It also took the top Elo position on LM Arena for several weeks following launch. Reception in the developer community was generally positive on capability, with criticism focused on price increases relative to predecessors and a 207-second average time to first token in reasoning mode reported by Artificial Analysis.
GPT-5.4 arrived seven months after the original GPT-5 launch in August 2025. By early 2026 OpenAI had iterated rapidly: GPT-5.1 refined instruction following, GPT-5.2 introduced the three-tier Instant/Thinking/Pro structure on December 11, 2025, and the GPT-5.3 cycle in February and March 2026 produced GPT-5.3-Codex, GPT-5.3-Codex-Spark (with Cerebras), and GPT-5.3 Instant. By the time GPT-5.4 was being prepared, OpenAI's lineup had bifurcated into a conversational track (the Instant series) and a specialist coding track (the Codex series), each with separate price points and rollout schedules.
The positioning of GPT-5.4 reversed that trend. OpenAI described it as the model that "brings together our advances in reasoning, coding, and agentic workflows into a single frontier model," folding the gains of GPT-5.3-Codex back into the mainline rather than maintaining a parallel Codex specialist. Brendan Foody, CEO of evaluation platform Mercor, said the model delivered "top performance while running faster and at a lower cost than competitive frontier models" on Mercor's APEX-Agents benchmark for law and finance.
The competitive context was a crowded frontier. Anthropic had released Claude Opus 4.7 in early 2026 with its own 1 million token context window. Google's Gemini 3 family held a substantial share of long-context workloads through Vertex AI and offered cheaper input pricing. xAI, DeepSeek, and Meta had all shipped frontier or near-frontier models in the same window. OpenAI's pitch with GPT-5.4 was that the same model could now handle the long-context, computer-use, and agentic workflows that had previously required specialist tools or competing providers.
GPT-5.4 launched on Thursday, March 5, 2026 across ChatGPT, the API, and Codex. OpenAI published the announcement "Introducing GPT-5.4" on its blog the same day, and the model became available immediately to paid ChatGPT users (Plus, Pro, Team, Business, Enterprise, and Edu) as a Thinking option. Free-tier users did not receive direct access to the base GPT-5.4 model at launch, although the smaller GPT-5.4 mini reached free users through the Thinking feature when it shipped on March 17, 2026.
In the API, GPT-5.4 was exposed under the model ID gpt-5.4 with a dated snapshot gpt-5.4-2026-03-05 for reproducible research. The Pro variant, which can spend longer in extended reasoning and is intended for high-stakes work, was exposed as gpt-5.4-pro. OpenAI also published an updated system card addendum tied to the existing GPT-5 system card line, classifying GPT-5.4 Thinking as High capability in the Cybersecurity domain under the Preparedness Framework while remaining below the Critical threshold.
The release was synchronized with several first-party surface updates. ChatGPT's web client added an upfront thinking plan preview that showed users what the model intended to do before it executed. Codex CLI and the IDE extensions for Visual Studio Code and JetBrains added GPT-5.4 as a selectable model. GitHub Copilot integrated the model for enterprise customers under bring-your-own-key arrangements. Microsoft made GPT-5.4 available on Azure AI Foundry on the same day.
GPT-5.2 was scheduled for retirement on June 5, 2026, three months after the launch, in line with OpenAI's standard transition policy for prior-generation models. GPT-5.3-Codex remained available but was effectively superseded as a recommended option, with Codex documentation pointing developers toward GPT-5.4 for new projects.
GPT-5.4 launched with two reasoning configurations, both with native computer use, and added small models and a cyber-permissive variant in subsequent weeks.
| Variant | API model ID | Snapshot | Released | Primary use case |
|---|---|---|---|---|
| GPT-5.4 Thinking | gpt-5.4 | gpt-5.4-2026-03-05 | March 5, 2026 | Mainline reasoning across professional work, coding, and computer use |
| GPT-5.4 Pro | gpt-5.4-pro | not separately dated | March 5, 2026 | Maximum accuracy on long-running professional tasks |
| GPT-5.4 mini | gpt-5.4-mini | gpt-5.4-mini-2026-03-17 | March 17, 2026 | Fast, low-cost coding, computer use, and subagents |
| GPT-5.4 nano | gpt-5.4-nano | gpt-5.4-nano-2026-03-17 | March 17, 2026 | Classification, extraction, ranking, and high-volume sub-agents |
| GPT-5.4-Cyber | restricted | not public | Late April 2026 | Defensive cybersecurity work for verified professionals |
GPT-5.4 Thinking is the mainline reasoning model. It supports adjustable reasoning effort (none, low, medium, high, and xhigh), accepts text and image input, and produces text output. It is the version surfaced in ChatGPT's Thinking tier for paid users, in Codex CLI, and on most third-party platforms. OpenAI describes it as its "most capable and efficient frontier model for professional work," with particular emphasis on long-horizon deliverables such as slide decks, financial models, and legal analysis.
Thinking inherits the developer tools from earlier GPT-5 releases (the apply_patch tool for structured file diffs, local_shell for sandboxed shell execution, preambles for persistent agent instructions) and adds two new ones at launch: native computer use, which lets the model take screenshots, move a mouse, and type into desktop applications, and tool search, which dynamically loads tool definitions only when needed instead of front-loading them in every prompt.
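As a rough illustration of how the model IDs and reasoning effort levels fit together, the sketch below builds a request payload as a plain dictionary and rejects effort levels that the specification table does not list for a variant. Nothing is sent over the network. The `build_request` helper is hypothetical, and the `model`, `input`, and `reasoning.effort` field names are an assumption about the Responses API request shape rather than something this article confirms. GPT-5.4 Pro is left out because the table lists only "extended" for it.

```python
# Hypothetical helper: builds a request payload as a plain dict; nothing is sent.
# Effort levels per variant are copied from the specification table in this article.
ALLOWED_EFFORT = {
    "gpt-5.4": ("none", "low", "medium", "high", "xhigh"),
    "gpt-5.4-mini": ("low", "medium", "high"),
    "gpt-5.4-nano": ("low", "medium"),
}


def build_request(model: str, prompt: str, effort: str = "medium") -> dict:
    """Return a payload dict, rejecting effort levels the variant does not list."""
    # Try longer IDs first so gpt-5.4-mini is not mistaken for gpt-5.4.
    # Dated snapshots such as gpt-5.4-2026-03-05 map to their base model.
    candidates = sorted(ALLOWED_EFFORT, key=len, reverse=True)
    base = next(
        (m for m in candidates if model == m or model.startswith(m + "-20")),
        None,
    )
    if base is None:
        raise ValueError(f"unknown model ID: {model}")
    if effort not in ALLOWED_EFFORT[base]:
        raise ValueError(f"{base} does not list reasoning effort {effort!r}")
    return {"model": model, "input": prompt, "reasoning": {"effort": effort}}
```

Calling `build_request("gpt-5.4-nano", "classify this", "high")` raises a `ValueError`, since the table lists only low and medium for nano.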
GPT-5.4 Pro is the higher-end reasoning configuration. It uses extended reasoning by default and is exposed only through the Responses API. OpenAI positions Pro for tasks where additional inference compute reliably improves outcomes, such as legal due diligence, complex financial modeling, frontier scientific reasoning, and difficult coding tasks that require deeper deliberation. On Humanity's Last Exam, Pro reaches 41.6%, second only to Gemini 3.1 Pro Preview at 44.7%, according to Artificial Analysis tracking through May 2026.
Pro carries the same 1.05M-token context window as Thinking but a much steeper price tag: $30 per million input tokens and $180 per million output tokens, twelve times the standard tier on both. The variant is available to ChatGPT Pro and Enterprise subscribers and through the API for all customers.
OpenAI added GPT-5.4 mini and GPT-5.4 nano on March 17, 2026, in a follow-up post titled "Introducing GPT-5.4 mini and nano." Mini retains computer use and tool search but trims the context window to 400,000 tokens. It is OpenAI's recommended option for high-volume agent and subagent workloads where the standard Thinking model is too expensive. In ChatGPT, mini is available to Free and Go users as the Thinking option, giving free-tier users their first access to a GPT-5.4 derivative.
Nano is the smaller of the two and is API-only. Computer use and tool search are not supported. It is targeted at high-volume classification, data extraction, ranking, and short-scope sub-agent work where speed and cost dominate the budget. Pricing for the small variants is substantially below the mainline tier.
GPT-5.4-Cyber is a fine-tune of GPT-5.4 Thinking trained to support legitimate defensive cybersecurity work. OpenAI announced it in late April 2026, alongside an expanded Trusted Access for Cyber (TAC) program. The variant lowers the refusal boundary for tasks that the standard model would treat as borderline, such as vulnerability research and malware analysis, and adds capabilities such as binary reverse engineering of compiled software without source code access. Access is restricted to identity-verified individuals and enterprise security teams, with optional Zero-Data-Retention waivers that allow OpenAI monitoring in exchange for higher permissiveness.
| Specification | GPT-5.4 Thinking | GPT-5.4 Pro | GPT-5.4 mini | GPT-5.4 nano |
|---|---|---|---|---|
| Context window | 1,050,000 tokens | 1,050,000 tokens | 400,000 tokens | 400,000 tokens |
| Maximum output tokens | 128,000 | 128,000 | 128,000 | 128,000 |
| Knowledge cutoff | August 31, 2025 | August 31, 2025 | August 31, 2025 | August 31, 2025 |
| Input modalities | Text, images | Text, images | Text, images | Text, images |
| Output modalities | Text | Text | Text | Text |
| Audio support | None | None | None | None |
| Reasoning effort | none, low, medium, high, xhigh | extended | low, medium, high | low, medium |
| Native computer use | Yes | Yes | Yes | No |
| Tool search | Yes | Yes | Yes | No |
| Function calling | Yes | Yes | Yes | Yes |
| Structured outputs | Yes | Yes | Yes | Yes |
| Streaming | Yes | Yes | Yes | Yes |
| Fine-tuning | No | No | No | No |
The context window is OpenAI's first to cross the 1 million-token threshold for a mainline general-purpose model, surpassing the 400K tokens used by GPT-5.2 Thinking and matching what Anthropic and Google offered in their highest-tier 2026 models. The 128,000-token output cap is unchanged from earlier GPT-5 releases, though it can be combined with the larger context window to support longer overall sessions through compaction. The August 31, 2025 knowledge cutoff is identical to GPT-5.2 and GPT-5.3, meaning GPT-5.4 has no awareness of post-cutoff events without retrieval or web search tools enabled.
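To make the limits concrete, here is a minimal budgeting sketch that checks whether a request would fit a given variant. The numbers come from the specification table. The rule that prompt and output tokens share one window is an assumption made for illustration, and the `fits` helper is hypothetical.

```python
# Limits copied from the specification table; the budgeting rule is an assumption.
CONTEXT_WINDOW = {
    "gpt-5.4": 1_050_000,
    "gpt-5.4-pro": 1_050_000,
    "gpt-5.4-mini": 400_000,
    "gpt-5.4-nano": 400_000,
}
MAX_OUTPUT_TOKENS = 128_000


def fits(model: str, prompt_tokens: int, output_tokens: int) -> bool:
    """True if the requested output is within the cap and the total fits the window."""
    if output_tokens > MAX_OUTPUT_TOKENS:
        return False
    # Assumption: prompt and output tokens are counted against the same window.
    return prompt_tokens + output_tokens <= CONTEXT_WINDOW[model]
```

Under these assumptions a 900,000-token prompt with a full 128,000-token output fits the Thinking model but not mini or nano.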
Native computer use is the most novel addition. GPT-5.4 can take screenshots of a desktop, identify UI elements, move a virtual mouse, type into windows, and chain operations across applications. The model was trained jointly on the desktop interaction data that previously powered OpenAI's Computer Using Agent (CUA) line, and the OSWorld-Verified score of 75% indicates that its first-attempt completion rate on a curated set of 369 desktop tasks crossed the human reference baseline of 72.4%. On a private OpenAI evaluation involving roughly 30,000 HOA and property tax portals, GPT-5.4 reportedly achieved 95% first-attempt success and 100% within three attempts.
Tool search is the second major addition. Rather than including the full schema for every available tool in every API call, the model receives a lightweight catalog and queries it on demand. OpenAI reported a 47% reduction in token usage on benchmarks involving 36 MCP servers without loss of accuracy. The mechanism preserves the prompt cache better and scales to larger tool ecosystems than the prior "all tools in the prompt" approach.
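The idea of a lightweight catalog queried on demand can be sketched with a toy registry. This is not OpenAI's implementation: the tool names, schemas, and the keyword-matching `search_tools` function are all made up for illustration, and a real system would use a far better retrieval method than word overlap.

```python
# Hypothetical tool registry; names and schemas are invented for illustration.
TOOLS = {
    "get_weather": {
        "description": "Look up the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
    "create_invoice": {
        "description": "Create an invoice for a customer",
        "parameters": {
            "type": "object",
            "properties": {
                "customer_id": {"type": "string"},
                "amount": {"type": "number"},
            },
            "required": ["customer_id", "amount"],
        },
    },
}


def catalog() -> list[dict]:
    """Lightweight listing: names and descriptions only, no parameter schemas."""
    return [{"name": n, "description": t["description"]} for n, t in TOOLS.items()]


def search_tools(query: str) -> list[dict]:
    """Return full definitions only for tools whose description shares a word with the query."""
    words = set(query.lower().split())
    return [
        {"name": n, **t}
        for n, t in TOOLS.items()
        if words & set(t["description"].lower().split())
    ]
```

In this sketch the prompt would carry only `catalog()`, and the full parameter schema for a tool enters the conversation when `search_tools("weather")` is called. With many tools registered, the catalog grows much more slowly than the full set of schemas, which is the effect the reported token savings depend on.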
OpenAI published a wide set of benchmark scores at launch. Third-party trackers such as Artificial Analysis, Vellum, llm-stats.com, and DataCamp confirmed many of them, with the usual caveats that some scores reflect specific reasoning effort settings or tool configurations.
| Benchmark | GPT-5.4 Thinking | GPT-5.4 Pro | GPT-5.3-Codex | GPT-5.2 Thinking |
|---|---|---|---|---|
| GDPval (professional knowledge work) | 83.0% | not separately reported | not reported | 70.9% |
| OSWorld-Verified (computer use) | 75.0% | not separately reported | 64.7% | 47.3% |
| WebArena Verified | 67.3% | not separately reported | not reported | not reported |
| Online-Mind2Web | 92.8% | not separately reported | not reported | not reported |
| BrowseComp | 82.7% | 89.3% | not reported | not reported |
| Toolathlon | 54.6% | not separately reported | not reported | not reported |
| SWE-bench Verified | ~80% | ~80% | ~80% | 80.0% |
| SWE-bench Pro Public | 57.7% | not separately reported | 56.8% | 55.6% |
| Terminal-Bench 2.0 | 75.0% | not separately reported | 77.3% | 62.2% |
| GPQA Diamond | 91% to 92% (reported range) | 94.4% | not reported | 92.4% |
| AIME 2025 | 100% | 100% | 100% | 100% |
| FrontierMath (Tiers 1 to 3) | 47.6% | not separately reported | not reported | 40.3% |
| FrontierMath Tier 4 | not reported | 38.0% | not reported | not reported |
| Humanity's Last Exam (with tools) | 52.1% | not reported | not reported | 34.5% |
| Humanity's Last Exam (Artificial Analysis) | not reported | 41.6% | not reported | not reported |
| ARC-AGI-1 | 93.7% | not separately reported | not reported | ~88% |
| ARC-AGI-2 | 73.3% | not separately reported | not reported | 52.9% |
The most prominent gains versus GPT-5.2 are on computer use and abstract reasoning. OSWorld-Verified rose from 47.3% to 75.0%, a 27.7 percentage-point jump and the largest single-generation gain on that benchmark. ARC-AGI-2 climbed from 52.9% to 73.3%, continuing the rapid progression that had begun with GPT-5.2. Knowledge work as measured by GDPval rose from 70.9% to 83.0%, with OpenAI noting that the gain was strongest on document-heavy occupations such as legal, financial, and project-management roles.
GPT-5.4 absorbed the coding capability of GPT-5.3-Codex without surpassing it on every coding metric. SWE-bench Pro Public improved by 0.9 points (56.8% to 57.7%), and SWE-bench Verified stayed near the 80% range. Terminal-Bench 2.0 actually fell from 77.3% in GPT-5.3-Codex to 75.0% in GPT-5.4, reflecting the breadth-versus-specialization tradeoff of merging the Codex line back into the mainline. OpenAI's argument was that GPT-5.4's broader scope (long context, computer use, knowledge work, and tool search) made up for the marginal regression on terminal-specific tasks. Independent reviewers including Nathan Lambert (Interconnects) described the upgrade as "a meaningful step" in practice across correctness, ease of use, speed, and cost, even where on-paper benchmarks looked incremental.
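The percentage-point changes quoted above can be rechecked directly from the benchmark table. The short sketch below does only that arithmetic; the `delta` helper and the lowercase model keys are illustrative names, not identifiers from any tool.

```python
# Scores (percent) copied from the benchmark table above.
SCORES = {
    "OSWorld-Verified": {"gpt-5.4": 75.0, "gpt-5.3-codex": 64.7, "gpt-5.2": 47.3},
    "SWE-bench Pro Public": {"gpt-5.4": 57.7, "gpt-5.3-codex": 56.8, "gpt-5.2": 55.6},
    "Terminal-Bench 2.0": {"gpt-5.4": 75.0, "gpt-5.3-codex": 77.3, "gpt-5.2": 62.2},
}


def delta(benchmark: str, baseline: str) -> float:
    """Percentage-point change of GPT-5.4 relative to a baseline model."""
    row = SCORES[benchmark]
    return round(row["gpt-5.4"] - row[baseline], 1)
```

For example, `delta("OSWorld-Verified", "gpt-5.2")` gives 27.7 and `delta("Terminal-Bench 2.0", "gpt-5.3-codex")` gives -2.3, matching the gain and the regression described in the text.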
On an internal OpenAI benchmark of spreadsheet modeling tasks of the kind a junior investment-banking analyst might perform, GPT-5.4 scored 87.3% mean compared to 68.4% for GPT-5.2. Human raters preferred GPT-5.4 presentations to GPT-5.2 presentations 68.0% of the time. On legal document analysis (BigLaw Bench), GPT-5.4 reached 91% per third-party reporting.
OpenAI reported that individual factual claims in GPT-5.4 outputs are 33% less likely to be wrong than in GPT-5.2, and overall responses are 18% less likely to contain any factual errors. The largest hallucination reductions were on web-enabled queries and on professional domains such as legal, medical, and financial, in line with the trend across earlier GPT-5 point releases.
On LM Arena, GPT-5.4 took the #1 Elo position on March 6, 2026, the day after launch. Through April 2026, GPT-5.4-high held a top-five position with an Elo around 1,480. The standard GPT-5.4 entry tracked at around Elo 1,466, ranking in the top 20. On Artificial Analysis's Intelligence Index it scored 57, tied with Gemini 3.1 Pro and ahead of Claude Opus 4.6 at 53. Time to first token in reasoning mode averaged 207 seconds, and throughput averaged 80.3 tokens per second across 120 million evaluated output tokens.
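The two Artificial Analysis figures combine into a rough estimate of end-to-end response time. The sketch below assumes, for simplicity, that output streams at the average throughput once the first token arrives; real requests will vary with prompt length and reasoning effort.

```python
TTFT_SECONDS = 207.0       # average time to first token in reasoning mode
TOKENS_PER_SECOND = 80.3   # average output throughput


def estimated_seconds(output_tokens: int) -> float:
    """Rough wall-clock estimate: wait for the first token, then stream the rest."""
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND
```

Under this model a 2,000-token answer takes roughly 232 seconds, of which only about 25 are spent generating output. The waiting time before the first token dominates, which is why the latency figure drew attention.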
| Tier | Input (per 1M tokens) | Cached input (per 1M) | Output (per 1M tokens) |
|---|---|---|---|
| GPT-5.4 Thinking | $2.50 | $0.25 | $15.00 |
| GPT-5.4 Pro | $30.00 | not published | $180.00 |
| GPT-5.4 mini | $0.75 | $0.075 | $4.50 |
| GPT-5.4 nano | $0.20 | $0.02 | $1.25 |
| GPT-5.2 Thinking (prior tier reference) | $1.75 | $0.175 | $14.00 |
| GPT-5.3-Codex (prior reference) | $1.75 | $0.175 | $14.00 |
The Thinking tier reflects a 43% increase in input price and a 7% increase in output price relative to GPT-5.2 Thinking. Cached input retains the 90% discount. Batch API pricing applies the standard 50% discount, and OpenAI's regional data residency endpoints add a 10% surcharge. Some reviewers, including The Decoder, characterized GPT-5.4 mini and nano as roughly four times more expensive than equivalent GPT-5.0 small models, citing this as a friction point for high-volume workloads.
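A small cost estimator shows how these rates interact. Prices are taken from the pricing table. How the batch discount and the regional surcharge combine, and whether they apply to cached input, is not stated here, so the multiplicative treatment below is an assumption, and `cost_usd` is a hypothetical helper.

```python
# Per-million-token prices (USD) copied from the pricing table above.
PRICES = {
    "gpt-5.4": {"input": 2.50, "cached": 0.25, "output": 15.00},
    "gpt-5.4-mini": {"input": 0.75, "cached": 0.075, "output": 4.50},
    "gpt-5.4-nano": {"input": 0.20, "cached": 0.02, "output": 1.25},
}


def cost_usd(model, input_tokens, output_tokens, cached_tokens=0,
             batch=False, regional=False):
    """Estimate request cost; cached_tokens is the cached share of input_tokens."""
    p = PRICES[model]
    fresh = input_tokens - cached_tokens
    total = (fresh * p["input"] + cached_tokens * p["cached"]
             + output_tokens * p["output"]) / 1_000_000
    if batch:
        total *= 0.5   # 50% Batch API discount (assumed to apply to the whole bill)
    if regional:
        total *= 1.1   # 10% data residency surcharge (assumed multiplicative)
    return round(total, 6)
```

With one million input tokens and 100,000 output tokens on the Thinking tier, the estimate is $4.00. If half of the input is served from cache it drops to $2.875.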
GPT-5.4 Pro pricing puts it among the most expensive models on the market. Artificial Analysis ranked it 143 out of 145 models on input price and 144 out of 145 on output price, comparable in tier to OpenAI's prior Pro releases.
| Plan | GPT-5.4 Thinking | GPT-5.4 Pro | GPT-5.4 mini |
|---|---|---|---|
| Free | No | No | Yes (Thinking option, rate limited) |
| Go | No | No | Yes |
| Plus ($20/month) | Yes (80 messages per 3 hours) | No | Yes |
| Pro ($200/month) | Yes (unlimited) | Yes (unlimited) | Yes |
| Team / Business | Yes | No | Yes |
| Enterprise / Edu | Yes (admin opt-in) | Yes (admin opt-in) | Yes |
Free users initially had no access to GPT-5.4 itself, only the GPT-5.3 Instant model. The mini-tier addition on March 17 closed that gap by exposing GPT-5.4 mini through the Thinking feature on the free plan, although with rate limits.
GPT-5.4's most-discussed feature is native computer use. OpenAI's announcement framed it as a step from chat-based assistance toward true desktop agency: the model can open applications, navigate menus, fill in spreadsheets, click through web forms, and chain operations without bespoke automation scripts. Common patterns include scraping HOA portals for property data, running multi-step forms across legacy government websites, populating Excel workbooks from PDF inputs, and driving SaaS dashboards through their UI rather than their API.
The GDPval score of 83% reflects performance across 44 distinct professional categories, including financial analysis, legal drafting, medical chart abstraction, project planning, and engineering documentation. OpenAI's example workflows include producing a sales presentation, generating an accounting spreadsheet, scheduling staff for an urgent care clinic, drawing manufacturing diagrams, and producing short videos. The model is positioned for tasks that combine reading large documents, reasoning over them, drafting structured artifacts, and using tools to verify or refine the output.
With the merge of GPT-5.3-Codex's coding capability, GPT-5.4 became the recommended coding option in Codex CLI and the major IDE integrations. Reviewers noted improvements in long-running coding sessions: better context management, fewer "context wall" failures on million-token repositories, more reliable git operations, and reduced regression to previously solved problems. Cursor's evaluation data, cited by Nathan Lambert, showed efficiency gains in tokens per task. GPT-5.3-Codex retained narrow advantages in highly terminal-focused workflows where its specialist tuning still mattered.
The combination of long context, computer use, tool search, and the existing developer tools (apply_patch, local_shell, preambles) makes GPT-5.4 OpenAI's first model that can sustain agentic workflows over millions of tokens and many tools without specialist routing. Tool search reportedly cuts token usage on tool-heavy workflows by roughly half. Mid-response interactive thinking lets users redirect the model partway through a long reasoning trajectory rather than waiting for the entire run to finish.
OpenAI emphasized professional artifacts in its launch demos: financial models, legal memos, board presentation decks, and multi-tab spreadsheets. The 87.3% spreadsheet score and the 68.0% presentation-preference rate are the headline numbers. Several enterprise customers including Notion, Box, and Mercor cited workflow improvements on internal benchmarks.
At the time of GPT-5.4's release, the primary competing frontier models were Claude Opus 4.6 and Claude Opus 4.7 (1M context) from Anthropic, Gemini 3.1 Pro from Google, and a small number of open-weights releases. The table below uses third-party benchmark numbers as reported in the months after launch.
| Benchmark | GPT-5.4 Thinking | Claude Opus 4.7 (1M context) | Gemini 3.1 Pro |
|---|---|---|---|
| Context window | 1.05M tokens | 1M tokens | 1M to 2M tokens |
| OSWorld-Verified | 75.0% | 72.5% | not directly reported |
| GDPval | 83.0% | reported lower | not directly reported |
| SWE-bench Verified | ~80% | 80.8% | 76% range |
| Humanity's Last Exam (Artificial Analysis) | not directly reported | reported lower | 44.7% (preview) |
| Artificial Analysis Intelligence Index | 57 | 53 | 57 |
| Input price (per 1M tokens) | $2.50 | $3 to $5 range | $2 range |
| Output price (per 1M tokens) | $15.00 | $15 to $25 range | $12 range |
GPT-5.4 led on computer use and knowledge work, sat near the top of the SWE-bench Verified pack, and was competitive on price for a frontier reasoning model. Claude Opus 4.6 retained narrow advantages on selected SWE-bench measurements and on tasks with ambiguous specifications, where reviewers found it inferred developer intent more reliably. Gemini 3.1 Pro held the top score on Humanity's Last Exam in the preview phase and offered cheaper input pricing along with better integration with Google Cloud.
The split between GPT-5.4 and competitors is qualitatively similar to the GPT-5.2 era: OpenAI's model is faster and more precise on well-specified tasks; Claude is more reliable on ambiguous ones; Gemini is strongest on long-context and Google-ecosystem workflows. GPT-5.4 closed the context-window gap that had persisted since the GPT-5 launch.
GPT-5.4 Thinking was the second model OpenAI classified as High capability for cybersecurity under the Preparedness Framework, after GPT-5.3-Codex earlier in 2026. The classification triggered the same set of layered safeguards that had accompanied the GPT-5.3-Codex launch: refusal training on clearly malicious requests, automated classifier-based monitoring of high-risk traffic, a fallback model for traffic flagged as suspicious, and the gated Trusted Access for Cyber program for advanced capabilities.
The High threshold under the framework is defined as a model that "removes existing bottlenecks to scaling cyber operations" through automation of end-to-end attacks against hardened targets or automation of operationally relevant vulnerability discovery and exploitation. OpenAI did not claim definitive evidence that GPT-5.4 reaches this threshold but adopted a precautionary approach as it had with prior Codex releases. The Critical threshold (zero-day discovery in many hardened systems without human intervention, or end-to-end novel cyberattack strategy execution) was not met.
OpenAI also reported safety improvements for ordinary user interactions. Hallucination rates on representative ChatGPT traffic stayed under 1% per claim with browsing enabled, continuing the trend established by GPT-5.2 and GPT-5.3 Instant. Mental health handling and self-harm response benchmarks improved relative to GPT-5.2.
GPT-5.4-Cyber, released in late April, formalized the cyber-permissive variant for verified defensive professionals. It includes binary reverse engineering capability, lower refusal rates on legitimate vulnerability research, and a tiered access model with identity verification, enterprise authentication, and Zero-Data-Retention waivers for the highest permissiveness tier.
Developer reception was largely positive on capability. Reviewers from Turing College, NxCode, BuildFastWithAI, and Interconnects characterized GPT-5.4 as a meaningful step forward, particularly on long sessions where context-window pressure had been a persistent bottleneck for the GPT-5.2 generation. Cursor's data showed efficiency gains in tokens per coding task. Mercor named GPT-5.4 the leader on its APEX-Agents law and finance benchmark.
Nathan Lambert summarized the trajectory: "Where GPT 5.4 feels like another incremental model on some on-paper benchmarks, in practice it feels like a meaningful step" across correctness, ease of use, speed, and cost. He praised the elimination of the "death by a thousand cuts" experience around git operations and background package management that had plagued earlier Codex variants.
Enterprise customers cited workflow improvements. The 95% first-attempt success on HOA portals was repeatedly highlighted by adopters in property tech and finance. Box, Notion, and Mercor reported productivity gains on internal benchmarks. Microsoft's day-zero Azure AI Foundry availability gave enterprise users a fast path to deployment.
Criticism focused on three areas. First, pricing: the 43% input-price increase over GPT-5.2 Thinking, and the four-fold cost increase for mini and nano relative to the GPT-5.0 small models, drew complaints from cost-sensitive teams. The Decoder ran a critical piece on the small-model price increases.
Second, latency: Artificial Analysis measured a 207-second average time to first token in reasoning mode. While this is partly inherent to extended reasoning, several reviewers noted that GPT-5.4 felt slower in interactive use than GPT-5.2 Thinking on equivalent prompts.
Third, mixed retention of Codex specialization: the Terminal-Bench 2.0 regression from 77.3% to 75.0% relative to GPT-5.3-Codex prompted some terminal-heavy teams to keep GPT-5.3-Codex as their primary coding model. GPT-5.4's broader scope was seen as a tradeoff against the highly tuned specialist model.
Mainstream coverage in TechCrunch, ZDNET, Vice, Wired, and CNBC framed GPT-5.4 as office automation infrastructure rather than a chatbot upgrade, in particular through the Excel and presentation integrations and the promise of fewer iteration loops on professional artifacts. The Decoder and Vice highlighted token efficiency and deep research. Some Reddit threads on r/MachineLearning and Hacker News raised data-handling concerns; a small "soft boycott" of frontier closed models in favor of local options like Llama 3.x was reported by AI Critique and other outlets, though it did not register as a measurable shift in market share.
GPT-5.4 was succeeded by GPT-5.5, released April 23, 2026, exactly seven weeks after the GPT-5.4 launch. GPT-5.5 carried forward the 1M-token context window and computer-use capability, increased token efficiency, raised intelligence-index scores further, and added new agentic features. GPT-5.5 was priced higher than GPT-5.4 in the API but had a lower effective per-task cost on many workloads because it consumed fewer tokens. GPT-5.5 Instant followed on May 5, 2026 as the default model in ChatGPT.
GPT-5.4 was scheduled to remain available in the API through at least the end of 2026 in line with OpenAI's standard transition policy.