OpenAI o3-pro

Updated Jul 16, 2026

OpenAI o3-pro is a high-compute reasoning large language model released by OpenAI on June 10, 2025, designed as the professional, higher-reliability variant of the company's o3 reasoning model.^[1]^[2] Like its predecessor o1 pro mode, o3-pro is a version of OpenAI's flagship reasoning model configured to "think longer" by allocating substantially more test-time compute to each query. Its stated design goal is more reliable answers on difficult problems in mathematics, science, programming, and other domains in which human reviewers preferred the high-compute variant.^[1]^[3]^[4] At launch, o3-pro became available immediately to ChatGPT Pro and Team subscribers, with ChatGPT Enterprise and Education tiers gaining access one week later. It was released the same day in the OpenAI Responses API at US$20 per million input tokens and US$80 per million output tokens, a reduction of roughly 87% relative to the prior o1-pro API pricing.^[1]^[2]^[5]^[6]

The launch of o3-pro coincided with a separate 80% price cut to the standard o3 model in the API, which brought baseline o3 down from US$10 / US$40 per million tokens to US$2 / US$8 per million tokens (input/output).^[5]^[6]^[7] OpenAI attributed the o3 price reduction to inference-stack optimization, with no change to the underlying model.^[5] Analysts widely characterized the combined announcements, a substantially cheaper baseline o3 plus a higher-compute o3-pro tier, as part of OpenAI's strategy to widen its commercial lead in the reasoning model category.^[7]

In the consumer ChatGPT product, o3-pro replaced o1 pro mode in the model picker for Pro and Team users; the model remained the most capable reasoning option available through the ChatGPT interface until OpenAI rolled out GPT-5 in August 2025, at which point o3-pro chats in the consumer product were automatically migrated to a new "GPT-5 Pro" mode and the o3-pro name was removed from the ChatGPT model picker.^[8]^[9] o3-pro remained accessible via the OpenAI API after the GPT-5 launch, in line with OpenAI's stated policy of not deprecating older models on the API side at that time.^[8]

Background

Test-time compute and the "pro" line

OpenAI's "pro" model line emerged from research into test-time compute scaling, in which a reasoning model's accuracy on difficult problems is increased by allowing it to generate longer internal chain-of-thought traces and, in the "pro" configurations, by sampling or aggregating multiple candidate solutions at inference time.^[4]^[10] OpenAI's "o-series" reasoning models, beginning with the o1-preview release in September 2024 and continuing with o1, o3-mini, o3, and o4-mini, are trained with reinforcement learning to perform extended private reasoning before producing a final answer.^[11]^[4]

The first "pro" variant, o1 pro mode, was launched on December 5, 2024 as the headline feature of the new US$200 / month ChatGPT Pro subscription tier introduced during OpenAI's "12 Days of OpenAI" event.^[12]^[13] o1 pro mode was described by OpenAI as a version of o1 "that uses more compute to think harder and provide even better answers to the hardest problems," and OpenAI reported a pass rate of 86% on the AIME 2024 mathematics competition for o1 pro mode versus 78% for standard o1.^[12]^[13] o1 pro mode was initially available only inside ChatGPT; an API endpoint for the o1-pro model was added in March 2025.^[11]

Successor models and the path to o3-pro

OpenAI announced the o3 family in late December 2024 as the successor to o1, and released the standard o3 and o4-mini models for ChatGPT users on April 16, 2025.^[11]^[14] After the public release of o3, OpenAI CEO Sam Altman confirmed in mid-April 2025 that an o3-pro variant was planned for the ChatGPT Pro tier "in a few weeks."^[14] Between the o3 launch and the o3-pro release, o1 pro mode remained available in the ChatGPT Pro model picker while OpenAI prepared the higher-compute variant.^[1]^[11]

The o3-pro launch on June 10, 2025 was announced via OpenAI's developer-facing channels and the ChatGPT release notes, accompanied by tweets from CEO Sam Altman noting both the 80% price cut for standard o3 and the new o3-pro pricing.^[5]^[15] Altman wrote on X: "we dropped the price of o3 by 80%!! excited to see what people will do with it now. think you'll also be happy with o3-pro pricing for the performance".^[15]

Architecture and reasoning

OpenAI has not published an architecture-level paper specific to o3-pro, and the available system card documentation covers the underlying o3 and o4-mini models rather than the pro variant.^[16] At a high level, OpenAI describes o3-pro as "a version of our most intelligent model, OpenAI o3, designed to think longer and provide the most reliable responses,"^[1] which suggests that o3-pro shares the underlying weights or model family of o3 but runs under a configuration that consumes substantially more test-time compute per request.^[4]^[17]

Like other o-series models, o3-pro is trained with reinforcement learning to perform private chain-of-thought reasoning before producing a user-facing answer.^[11]^[17] OpenAI's standard explanation for the o-series describes a process in which the model produces a long internal reasoning trace, is rewarded during training on the correctness of its final answers, and develops behaviors such as self-correction, backtracking, and explicit problem decomposition. In o3-pro, this is paired with a higher effective compute budget; analyses of the model's behavior have suggested it samples or evaluates multiple candidate reasoning paths and selects the strongest, a strategy sometimes described as inference-time search or majority voting over chains of thought.^[4]^[18]
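The majority-voting idea mentioned above can be illustrated with a small sketch. This is not OpenAI's method, which has not been published; it only shows the generic technique of drawing several independent answers and keeping the most common one. The `sample` callable and the canned answers are hypothetical stand-ins for a model call.

```python
from collections import Counter
from typing import Callable, Hashable, List, Tuple


def majority_vote(answers: List[Hashable]) -> Tuple[Hashable, float]:
    """Return the most common answer and the share of samples that agree with it."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(answers)


def answer_with_voting(sample: Callable[[str], Hashable], prompt: str, n: int = 5):
    """Draw n independent answers from `sample` and keep the majority answer."""
    return majority_vote([sample(prompt) for _ in range(n)])


# Stand-in sampler (hypothetical): yields canned final answers in order.
canned = iter(["42", "41", "42", "42", "17"])
winner, agreement = answer_with_voting(lambda _prompt: next(canned), "toy question")
print(winner, agreement)  # 42 0.6
```

The cost of this approach scales linearly with the number of samples, which is consistent with the higher per-request compute attributed to "pro" configurations, though the actual mechanism inside o3-pro remains undocumented.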

The "4/4 reliability" evaluation that OpenAI used to highlight o3-pro's improvements measures whether a model answers the same question correctly across four independent samples; on this metric o3-pro outperformed both standard o3 and o1 pro mode in OpenAI's internal evaluations, reflecting a focus on consistency rather than peak single-attempt accuracy.^[1]^[3]^[18]

API and runtime characteristics

In the OpenAI API, o3-pro is exposed exclusively through the Responses API, which supports multi-turn reasoning interactions and which OpenAI describes as the recommended interface for forward-looking features such as long-running tasks and structured tool use.^[19]^[20] Streaming was not supported on o3-pro at launch. OpenAI recommends running o3-pro requests in "background mode," in which the API starts a request asynchronously and the client polls for completion, to avoid timeouts on reasoning runs that may take several minutes.^[19]^[20]
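The polling pattern behind background mode can be sketched as follows. To keep the example runnable without network access, it uses a hypothetical `Job` class and a fake `retrieve` function rather than the real SDK; the pending status names (`queued`, `in_progress`) are assumptions that should be checked against the current API documentation.

```python
import time
from dataclasses import dataclass
from typing import Callable

# Statuses treated as "still running" (an assumption about the API).
PENDING = {"queued", "in_progress"}


@dataclass
class Job:
    """Minimal stand-in for a background response object (hypothetical)."""
    id: str
    status: str
    output_text: str = ""


def poll_until_done(
    retrieve: Callable[[str], Job],
    job: Job,
    interval_s: float = 2.0,
    timeout_s: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Job:
    """Re-fetch `job` every `interval_s` seconds until it leaves a pending state."""
    waited = 0.0
    while job.status in PENDING:
        if waited >= timeout_s:
            raise TimeoutError(f"job {job.id} still {job.status} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
        job = retrieve(job.id)
    return job


# Fake backend for illustration: reports completion on the third poll.
_states = iter(["queued", "in_progress", "completed"])


def fake_retrieve(job_id: str) -> Job:
    status = next(_states)
    text = "final answer" if status == "completed" else ""
    return Job(id=job_id, status=status, output_text=text)


done = poll_until_done(fake_retrieve, Job("resp_demo", "queued"), sleep=lambda _s: None)
print(done.status, done.output_text)  # completed final answer
```

With the OpenAI Python SDK, the equivalent calls are expected to be `client.responses.create(model="o3-pro", input=..., background=True)` to start the request and `client.responses.retrieve(response.id)` to poll it; treat those signatures as something to verify against the background-mode guide rather than as confirmed here.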

Independent measurements on the Artificial Analysis benchmark service reported a median output rate of approximately 32.6 tokens per second, at the lower end of reasoning models in its price tier, and a time-to-first-token of approximately 107.85 seconds, reflecting the long internal reasoning phase that precedes any output.^[21]
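As a rough illustration of what these figures imply, the sketch below estimates end-to-end response time as time-to-first-token plus generation time. Combining two independently measured medians this way is a simplifying assumption, and the function name is illustrative only.

```python
def estimated_latency_s(output_tokens: int,
                        ttft_s: float = 107.85,
                        tokens_per_s: float = 32.6) -> float:
    """Rough end-to-end time: time-to-first-token plus generation time."""
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return ttft_s + output_tokens / tokens_per_s


for n in (100, 1_000, 5_000):
    print(f"{n:>5} output tokens: ~{estimated_latency_s(n):.0f} s")
```

Under these assumptions the fixed reasoning delay dominates short answers, so even a brief reply lands near the two-minute mark.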

Availability and pricing

ChatGPT consumer availability

On June 10, 2025, o3-pro was made available inside ChatGPT to:

ChatGPT Pro subscribers (US$200 / month), where it replaced o1 pro mode in the model picker;^[1]^[2]
ChatGPT Team users;^[1]^[2]
ChatGPT Enterprise and ChatGPT Edu customers, who received access one week after the initial release.^[1]^[2]

ChatGPT Plus subscribers (US$20 / month) and ChatGPT Free users were not given access to o3-pro at launch; Plus tier access was confined to standard o3, o4-mini, and o4-mini-high.^[22]^[23] The exclusivity of o3-pro to the US$200 Pro tier and above was widely reported in the trade press as part of OpenAI's tier-segmentation strategy following the December 2024 introduction of ChatGPT Pro.^[22]^[12]

Inside ChatGPT, o3-pro retained access to the toolset that the consumer product exposes to other o-series models, including web search, file analysis, Python execution (advanced data analysis), vision-based reasoning over uploaded images, and memory-based personalization.^[1]^[24] At launch, however, several ChatGPT features were not supported by o3-pro: image generation (which routed users to other models), the Canvas workspace feature, and temporary chats, the latter being disabled while OpenAI resolved what it described as a technical issue.^[1]^[2]^[24]

API availability and pricing

The o3-pro API endpoint went live on June 10, 2025 alongside the ChatGPT release.^[5]^[6] OpenAI published the following list pricing for the model:^[5]^[6]^[21]^[25]

Model	Input (USD / 1M tokens)	Output (USD / 1M tokens)
OpenAI o3-pro	20.00	80.00
OpenAI o3 (post-June 10, 2025)	2.00	8.00
OpenAI o1-pro (prior list price)	150.00	600.00

The o3-pro list price represented an approximately 87% reduction relative to the o1-pro list price it succeeded.^[6]^[7] Standard o3 saw an 80% reduction on the same day, which OpenAI described as the result of inference-stack optimization rather than a model change.^[5]
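The percentages and per-request costs follow directly from the list prices. The sketch below reproduces the arithmetic using the prices in the table above and the pre-cut o3 price of US$10 / US$40; the token counts in the example request are arbitrary.

```python
# USD per 1M tokens (input, output), copied from the table above.
PRICES = {
    "o3-pro": (20.00, 80.00),
    "o3": (2.00, 8.00),
    "o1-pro": (150.00, 600.00),
}
O3_OLD = (10.00, 40.00)  # o3 list price before June 10, 2025


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at list price."""
    p_in, p_out = PRICES[model]
    return (input_tokens * p_in + output_tokens * p_out) / 1_000_000


def reduction_pct(old: float, new: float) -> float:
    """Percentage drop from `old` to `new`."""
    return (old - new) / old * 100


print(round(reduction_pct(PRICES["o1-pro"][0], PRICES["o3-pro"][0]), 1))  # 86.7
print(round(reduction_pct(O3_OLD[0], PRICES["o3"][0]), 1))                # 80.0
print(request_cost("o3-pro", 10_000, 5_000))                              # 0.6
print(request_cost("o3", 10_000, 5_000))                                  # 0.06
```

Because input and output prices were cut by the same ratio, the same percentages hold for the output side, and any given request costs ten times as much on o3-pro as on standard o3.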

The API model supports a 200,000-token context window and a maximum output of approximately 100,000 tokens per request. It accepts text and image input and produces text-only output.^[25]^[26]^[27] OpenAI's API documentation lists support for function calling and structured outputs and notes that streaming is not supported. Third-party model cards that mirror OpenAI's documentation give a knowledge cutoff of May 31, 2024, though sources differ slightly on the exact date.^[25]^[26]^[27]

Azure availability

Microsoft made o3-pro available through the Azure OpenAI Service (Azure AI Foundry) starting June 19, 2025, initially in the East US 2 and Sweden Central regions and supporting Global Standard deployment. Azure customers were required to request approval to use the model.^[28] Azure list pricing matched OpenAI's direct API pricing of US$20 / US$80 per million input/output tokens, and o3-pro on Azure supported the same Responses API surface with text and image inputs, function calling, structured outputs, and integration with Azure's File Search tool.^[28]

Capabilities

OpenAI positioned o3-pro as the most reliable model in the o3 family, recommending it for "challenging questions where reliability matters more than speed, and waiting a few minutes is worth the tradeoff."^[1]^[3] The subsections below cover tool use, multimodal input, and the features that were unavailable at launch.

Tool use

Like baseline o3, o3-pro can use the full set of ChatGPT tools as part of its reasoning, including:^[1]^[24]

Web search: live retrieval of up-to-date information from the web, used as part of the model's reasoning trace.
File analysis: ingestion and reasoning over uploaded documents and structured data.
Python execution (advanced data analysis): in-sandbox Python evaluation for computation and data manipulation.
Visual reasoning: analysis of uploaded images and visual prompts as part of multi-step reasoning.
Memory: personalization based on prior conversations within ChatGPT.

In the API, o3-pro supports function calling and structured outputs via the Responses API and integrates with file-search tools on cloud-deployment surfaces such as Azure AI Foundry.^[19]^[28]

Multimodal input

o3-pro accepts both text and image inputs in ChatGPT and through the API.^[25]^[28] A third-party multimodal evaluation by Roboflow placed o3-pro joint-third on its Vision AI Checkup leaderboard at the time of its release, with strong performance on optical character recognition (e.g., reading barcodes and serial numbers from images), spatial reasoning, and visual question answering, and weaker performance on counting and precise measurement tasks.^[29]

Excluded capabilities

At launch, several features supported by other ChatGPT models were not available within o3-pro:^[1]^[2]^[24]

Image generation was not supported (image generation requests routed users to other models).
The ChatGPT Canvas workspace did not work with o3-pro.
Temporary chats with o3-pro were initially disabled while OpenAI worked on a technical fix.
Streaming responses were not supported through the API; long-running requests were instead handled via background mode.^[19]^[20]

Benchmarks

OpenAI's published benchmark data for o3-pro at launch consisted primarily of internal "pass@1" and "4/4 reliability" comparisons on a small number of well-known reasoning benchmarks. Third-party reproductions and aggregations have largely confirmed the relative ordering of the numbers, although exact scores reported by different sources vary slightly. The following numbers were reported in OpenAI's announcement materials and in coverage by reasoning-model trackers.^[4]^[3]^[18]^[30]

Mathematics

On the AIME 2024 mathematics competition benchmark, o3-pro was reported at 93% on pass@1 and 90% under the stricter 4/4 reliability evaluation.^[18]^[30] Aggregator coverage placed o3-pro at approximately 94% on AIME 2024, ahead of Google's Gemini 2.5 Pro (~92% in OpenAI's comparison materials) on the same benchmark.^[4]^[30]^[3] Standard o3, by way of comparison, was variously reported by third-party trackers at approximately 90-96.7% on AIME 2024 depending on configuration.^[31]

PhD-level science

On GPQA Diamond, a benchmark of graduate-level multiple-choice science questions, o3-pro scored 84% on pass@1 and 76% under 4/4 reliability.^[18]^[30] On the same benchmark, third-party comparisons reported Claude Opus 4 at 83.3% and Gemini 2.5 Pro at approximately 84%, placing o3-pro near the top of the GPQA Diamond leaderboard among reasoning models available in June 2025.^[4]^[30]

Competitive programming

On Codeforces, o3-pro recorded a pass@1 Elo rating of 2,748 and a 4/4 reliability Elo of 2,301.^[18]^[30] The pass@1 figure was reported by external analyses as corresponding to roughly the 159th-highest active rating on the platform.^[4]^[30]

Reliability ("4/4")

OpenAI's signature evaluation for o3-pro was the "4/4 reliability" framework, in which a model is judged correct on a question only if it independently produces a correct answer in four out of four sampling attempts. On this evaluation, OpenAI reported that o3-pro outperformed both standard o3 and o1 pro mode across all four benchmarks tested.^[1]^[3]^[18]
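The difference between pass@1 and 4/4 reliability can be made concrete with a short sketch. It assumes pass@1 is estimated as mean per-attempt accuracy, which is a common convention but not one the source attributes to OpenAI; the grading data is made up.

```python
from typing import Dict, List


def pass_at_1(results: Dict[str, List[bool]]) -> float:
    """Mean per-attempt accuracy: the chance a single sample is correct."""
    attempts = [ok for samples in results.values() for ok in samples]
    return sum(attempts) / len(attempts)


def k_of_k_reliability(results: Dict[str, List[bool]], k: int = 4) -> float:
    """Share of questions answered correctly in all of the first k attempts."""
    for qid, samples in results.items():
        if len(samples) < k:
            raise ValueError(f"{qid} has fewer than {k} attempts")
    solid = sum(all(samples[:k]) for samples in results.values())
    return solid / len(results)


# Toy grading data (made up): four attempts per question.
toy = {
    "q1": [True, True, True, True],
    "q2": [True, True, False, True],
    "q3": [True, True, True, True],
    "q4": [False, False, True, False],
}
print(pass_at_1(toy))           # 0.75
print(k_of_k_reliability(toy))  # 0.5
```

A single miss on a question removes it from the 4/4 count entirely, which is why the 4/4 figures reported for o3-pro sit below the corresponding pass@1 figures.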

Human preference comparisons

OpenAI also reported the results of human evaluations comparing o3-pro and standard o3 across multiple categories. Reviewers preferred o3-pro to o3 in every category tested, with an aggregated win rate of approximately 64% in favor of o3-pro; OpenAI specifically called out science, education, programming, business, and writing help as domains where the preference for o3-pro was strongest.^[1]^[4]^[32]

ARC-AGI

Third-party analyses of o3-pro on the ARC-AGI benchmark (which measures abstract pattern reasoning) suggested that o3-pro did not meaningfully outperform standard o3 on the ARC-AGI tasks despite consuming substantially more compute. Reviewers reported that the cost-per-task was approximately ten times higher than for standard o3 without a proportional accuracy gain, making o3-pro a poor cost/performance choice on this particular benchmark.^[4]

Independent intelligence indices

The Artificial Analysis "Intelligence Index" placed o3-pro at 41, above the median (35) for reasoning models in its price tier at the time of measurement, while flagging that o3-pro was "particularly expensive when comparing to other models of similar price."^[21]

Reception and use cases

Positive reception

Early reviews highlighted o3-pro's strength in long, context-heavy analytical tasks where reliability and depth of reasoning are valued over latency. A widely cited early review by Ben Hylak on the Latent Space newsletter, headlined "God is hungry for Context: First thoughts on o3 pro," argued that o3-pro's principal value showed up not in conversational use but in extended-report generation. After supplying o3-pro with comprehensive company background (meeting notes, prior plans, voice memos, and product context), the reviewer reported that o3-pro produced "specific and rooted" strategic plans that "actually changed how we are thinking about our future," and described the model as exhibiting better tool-selection and orchestration judgment than alternatives such as Claude Opus 4 and Gemini 2.5 Pro.^[17]

OpenAI's own positioning emphasized that o3-pro was preferred over standard o3 in every tested category, with reviewers especially favoring it on math, science, coding, business, and writing.^[1]^[3] Industry analysts characterized o3-pro as a "game-changer" for startups and small-to-medium businesses seeking access to frontier reasoning capabilities at substantially lower cost than the previous o1-pro tier.^[7]

Criticism

Mixed feedback emerged in the weeks after launch, focused on three themes:^[32]^[33]

Speed and overthinking on simple prompts. Multiple reviewers and developers reported that o3-pro often took several minutes to respond even to trivial inputs. In a widely circulated example, a developer prompted o3-pro with "Hi, I'm Sam Altman"; the model produced a brief reply only after four to fourteen minutes of internal reasoning, at an estimated cost of around US$80 in tokens, illustrating its tendency to allocate large reasoning budgets even to simple conversational openings.^[33]
Hallucination concerns. Although OpenAI positioned o3-pro as its most reliable reasoning model, independent measurements suggested o3-pro did not substantially improve over o3 on hallucination rate. Some third-party comparisons reported o3-pro hallucination rates substantially higher than those of comparable models such as Gemini 2.5 Pro and Claude Opus 4 on certain summarization benchmarks.^[32]
Mobile and Canvas limitations. Users reported timeouts on Android and macOS clients due to long generation times, and the absence of Canvas and image generation made o3-pro a poor fit for some interactive workflows. ChatGPT users were typically required to switch back to standard o3 or GPT-4o for those features.^[32]^[2]

A separate strand of criticism, from cost-sensitive developers, pointed out that on cost/performance metrics, particularly on ARC-AGI tasks, o3-pro's roughly 10× higher per-task cost relative to standard o3 was not justified by corresponding accuracy gains on those benchmarks.^[4]

Use cases

OpenAI and reviewers identified several application areas where o3-pro's combination of long reasoning, tool use, and reliability gains was most useful:^[1]^[17]^[28]

Scientific analysis and PhD-level technical research.
Long-form strategic planning and report generation, given rich background context.
Software engineering and competitive programming.
Business analysis with multi-step orchestration of tools and documents.
Workflows on Azure AI Foundry combining o3-pro reasoning with enterprise file search.

Limitations

Beyond the missing ChatGPT features (image generation, Canvas, temporary chat) and the long latencies described above, o3-pro has several documented limitations:

No image generation output. o3-pro is a text-output model and cannot produce images directly; image generation in ChatGPT is routed to other models.^[1]^[2]
No streaming. The OpenAI API did not support streaming for o3-pro at launch; long requests must use background mode, with the client polling for completion.^[19]^[20]
High cost relative to alternatives. At US$20 / US$80 per million tokens, o3-pro is approximately 10× the price of standard o3 in the API. Independent analyses have noted that on many tasks the additional compute does not yield a proportionate quality gain.^[21]^[4]
Knowledge cutoff. o3-pro's training data is reported with a knowledge cutoff in mid-2024 (cited variously as May 31, 2024 or June 1, 2024 in third-party documentation that mirrors OpenAI's developer-facing material), and live information must be retrieved through the web-search tool.^[25]^[27]
Overthinking on simple prompts. Reviewers reported that o3-pro can spend substantial time on trivial inputs, making it ill-suited to short-turn conversational use without context.^[17]^[33]
Sustained hallucination rates. Despite OpenAI's reliability framing, o3-pro did not appear to materially improve over o3 on hallucination measurements in some independent comparisons.^[32]

Status under GPT-5

OpenAI rolled out GPT-5 in early August 2025. With the GPT-5 launch, OpenAI initially removed o3-pro (along with GPT-4o, o3, o4-mini, and several other models) from the ChatGPT consumer model picker; existing o3-pro chats inside ChatGPT were migrated to a new "GPT-5 Pro" mode, which OpenAI made available only on the Pro and Team tiers.^[8]^[9] Following user backlash over the abrupt removal of older models from ChatGPT, OpenAI restored access to some of the prior generation in the consumer product for a subset of users, with CEO Sam Altman acknowledging that the GPT-5 rollout had been "more bumpy than we hoped for."^[8]

On the API side, OpenAI told VentureBeat at the time of the GPT-5 launch that the company had no plans to deprecate older models, with a spokesperson stating that "in the API, we do not currently plan to deprecate older models."^[8] As a result, o3-pro remained accessible to API developers after the GPT-5 launch in the consumer ChatGPT app, even as the o3-pro name no longer appeared in the consumer product's model picker.

References

[1] TechCrunch, "OpenAI releases o3-pro, a souped-up version of its o3 AI reasoning model," June 10, 2025. https://techcrunch.com/2025/06/10/openai-releases-o3-pro-a-souped-up-version-of-its-o3-ai-reasoning-model/
[2] Engadget, "OpenAI adds the o3-pro model to ChatGPT today," June 10, 2025. https://www.engadget.com/ai/openai-adds-the-o3-pro-model-to-chatgpt-today-212126136.html
[3] CometAPI Blog, "OpenAI Releases o3-pro: Its Most Reliable AI Model Yet," June 2025. https://www.cometapi.com/openai-releases-o3-pro/
[4] Patrick McGuinness, "o3-pro, the AI That Thinks Too Much," June 2025. https://patmcguinness.substack.com/p/o3-pro-the-ai-that-thinks-too-much
[5] OpenAI Developer Community, "O3 is 80% cheaper and introducing o3-pro," June 10, 2025. https://community.openai.com/t/o3-is-80-cheaper-and-introducing-o3-pro/1284925
[6] Maginative, "OpenAI Just Made o3-pro Available to More People and Cut the Price by 87%," June 2025. https://www.maginative.com/article/openai-just-made-o3-pro-available-to-more-people-and-cut-the-price-by-87/
[7] Computerworld, "OpenAI launches o3-pro, slashes o3 price by 80% in bid to widen AI lead," June 2025. https://www.computerworld.com/article/4005122/openai-launches-o3-pro-slashes-o3-price-by-80-in-bid-to-widen-ai-lead.html
[8] VentureBeat, "ChatGPT users dismayed as OpenAI pulls popular models GPT-4o, o3 and more; enterprise API remains (for now)," August 2025. https://venturebeat.com/ai/chatgpt-users-dismayed-as-openai-pulls-popular-models-gpt-4o-o3-and-more-enterprise-api-remains-for-now
[9] Simon Willison, "The surprise deprecation of GPT-4o for ChatGPT consumers," August 8, 2025. https://simonwillison.net/2025/Aug/8/surprise-deprecation-of-gpt-4o/
[10] Wix Engineering / Medium, "Explaining OpenAI's o1 Breakthrough: The Revolution of Test Time Compute," 2024. https://medium.com/wix-engineering/explaining-openais-o1-breakthrough-the-revolution-of-test-time-compute-ecebe8ef9379
[11] Wikipedia, "OpenAI o3," accessed May 2026. https://en.wikipedia.org/wiki/OpenAI_o3
[12] TechCrunch, "OpenAI confirms new $200 monthly subscription, ChatGPT Pro, which includes its o1 reasoning model," December 5, 2024. https://techcrunch.com/2024/12/05/openai-confirms-its-new-200-plan-chatgpt-pro-which-includes-reasoning-models-and-more/
[13] OpenAI, "Introducing ChatGPT Pro," December 5, 2024. https://openai.com/index/introducing-chatgpt-pro/
[14] Sam Altman on X, "we expect to release o3-pro to the pro tier in a few weeks," April 2025. https://x.com/sama/status/1912558745013612888
[15] Sam Altman on X, "we dropped the price of o3 by 80%!! …," June 10, 2025. https://x.com/sama/status/1932434606558462459
[16] OpenAI, "OpenAI o3 and o4-mini System Card," April 16, 2025. https://cdn.openai.com/pdf/2221c875-02dc-4789-800b-e7758f3722c1/o3-and-o4-mini-system-card.pdf
[17] Latent Space (Ben Hylak), "God is hungry for Context: First thoughts on o3 pro," June 2025. https://www.latent.space/p/o3-pro
[18] Rohan Paul, "OpenAI releases o3-pro, 87% cheaper than o1-pro & slashes o3 pricing by 80%," June 2025. https://www.rohan-paul.com/p/openai-releases-o3-pro-87-cheaper
[19] OpenAI API Documentation, "o3-pro Model," accessed 2026. https://developers.openai.com/api/docs/models/o3-pro
[20] OpenAI API Documentation, "Background mode," accessed 2026. https://developers.openai.com/api/docs/guides/background
[21] Artificial Analysis, "o3-pro: Intelligence, Performance & Price Analysis," accessed 2026. https://artificialanalysis.ai/models/o3-pro
[22] BGR, "ChatGPT O3-Pro Is Only Available On $200+ Plans," 2025. https://www.bgr.com/tech/chatgpt-o3-pro-is-only-available-on-200-plans-heres-what-youre-missing/
[23] OpenAI on X, "ChatGPT Plus, Pro, and Team users will see o3, o4-mini, and o4-mini-high in the model selector …," April 2025. https://x.com/OpenAI/status/1912560062004179424
[24] Geeky Gadgets, "OpenAI o3-pro: Advanced AI Reasoning at a Fraction of the Cost," June 2025. https://www.geeky-gadgets.com/openai-o3-pro/
[25] PromptHub, "o3 Pro Model Card," accessed 2026. https://www.prompthub.us/models/o3-pro
[26] OpenRouter, "o3 Pro: API Pricing & Providers," accessed 2026. https://openrouter.ai/openai/o3-pro
[27] LLM Stats, "o3-pro: Pricing, Context Window, Benchmarks, and More," accessed 2026. https://llm-stats.com/models/o3-pro-2025-06-10
[28] Microsoft Foundry Blog, "o-series Updates: New o3 pricing and o3-pro in Azure AI Foundry," June 2025. https://devblogs.microsoft.com/foundry/azure-openai-o3-pro-ai-foundry/
[29] Roboflow Blog, "OpenAI o3-pro: Multimodal and Vision Analysis," June 2025. https://blog.roboflow.com/openai-o3-pro-review/
[30] SentiSight, "OpenAI o3-pro AI Model Launch: Beats Google Gemini 2.5 Pro & Claude 4 Opus in Benchmarks," June 2025. https://www.sentisight.ai/openai-unveils-o3-pro-advanced-reasoning-model-surpasses-competition-in-benchmark-tests/
[31] AI Business Weekly, "OpenAI o3 Benchmark Results: AIME, GPQA & SWE-bench," 2025. https://aibusinessweekly.net/p/openai-o3-benchmarks
[32] InfoQ, "OpenAI Launches o3-pro Model Focused on Reliability, Amid Mixed User Feedback," June 2025. https://www.infoq.com/news/2025/06/openai-o3-pro/
[33] The Decoder, "OpenAI's o3-pro may be too smart for small talk," June 2025. https://the-decoder.com/openais-o3-pro-may-be-too-smart-for-small-talk/
