OpenAI o3-mini

Large Language Models OpenAI Reasoning Models

22 min read

Updated Jun 23, 2026

Suggest edit History Talk

RawGraph

Last edited

Jun 23, 2026

Fact-checked

In review queue

Sources

25 citations

Revision

v3 · 4,303 words

Fact-checks are independent of edits: a reviewer re-verifies the article against its sources and stamps the date. How we verify

OpenAI o3-mini is a reasoning-focused large language model released by OpenAI on January 31, 2025, the second commercial member of the o-series after OpenAI o1 and a smaller, cheaper, faster sibling to the full o3 model that was previewed alongside it.^[1]^[2] It was the first OpenAI reasoning model made available to free ChatGPT users, and the first to support function calling, structured outputs, and developer messages from launch; it also introduced a user-selectable reasoning_effort parameter with low, medium, and high settings.^[2]^[24] OpenAI positioned the model as STEM-focused, calling it "a specialized alternative" to o1 "for technical domains requiring precision and speed" with particular strength in science, mathematics, and coding.^[1]^[24] At launch its API price was $1.10 per million input tokens and $4.40 per million output tokens, roughly 63% cheaper per input token than o1-mini and competitive with DeepSeek-R1, which had been released eleven days earlier on January 20, 2025.^[2]^[3]^[24] OpenAI cited benchmark results of about 83% on AIME 2024 and 77% on GPQA Diamond for the high-effort setting and a Codeforces rating near 2073 Elo.^[2] It was eventually superseded by o4-mini on April 16, 2025 and entered a deprecation window in the OpenAI API later that year.^[4]^[5]

Infobox

Field	Value
Developer	OpenAI
Announced	December 20, 2024 (preview)^[6]
General availability	January 31, 2025^[2]
API model ID	`o3-mini` and `o3-mini-2025-01-31`^[7]
Context window	200,000 tokens (max 100,000 output)^[7]
Knowledge cutoff	October 1, 2023^[7]
Modalities at launch	Text only (image input added in ChatGPT on February 13, 2025)^[8]
Reasoning effort levels	low, medium (default), high^[2]^[7]
API price (input / cached / output)	$1.10 / $0.55 / $4.40 per 1M tokens^[2]
Replaced by	o4-mini (April 16, 2025)^[4]

When was OpenAI o3-mini announced and released?

OpenAI introduced the o3 family during the final day of its "12 Days of OpenAI" livestream series on December 20, 2024. CEO Sam Altman and researchers Mark Chen, Hongyu Ren, and others demonstrated the full o3 model and announced a sibling called o3-mini, billed as a smaller, faster variant suited to coding and STEM tasks where the full model would be excessive.^[6] OpenAI skipped the "o2" name to avoid a trademark conflict with the British telecommunications brand O2, a detail Altman discussed during the stream.^[6] The December event opened a safety testing window: external red teamers and safety researchers could apply for early access to both o3 and o3-mini, with deployment dates contingent on the outcome of those evaluations.^[6]

On January 31, 2025, OpenAI shipped o3-mini in both ChatGPT and the API. The launch followed by eleven days the public release of DeepSeek-R1 on January 20, 2025, which had upended pricing expectations for reasoning models by offering comparable benchmark performance at a fraction of the cost.^[3]^[9] OpenAI's announcement page described o3-mini as "the newest, most cost-efficient model in our reasoning series" and emphasised that it was production-ready out of the gate thanks to function calling, structured outputs, and developer messages.^[2]^[24] Pro subscribers received unlimited access to both o3-mini and a higher-thinking variant called o3-mini-high; Plus and Team subscribers received 150 messages per day, triple the cap that had previously applied to o1-mini.^[2] Enterprise and Education customers received access "in a week," and free ChatGPT users could trigger the model by selecting a "Reason" button in the message composer, the first time a reasoning model had reached the free tier.^[2]^[25]

On February 12 and 13, 2025, OpenAI quietly enabled image upload in ChatGPT for o3-mini accounts, though vision inputs remained unsupported in the public API at that time.^[8] In subsequent weeks the company doubled the daily quota for the o3-mini-high variant to 50 messages per day for Plus subscribers as part of a broader uplift announced alongside expanded reasoning-model access.^[10]

The full o3 reasoning model and a new mini sibling, o4-mini, launched on April 16, 2025, and OpenAI's product page described the change in ChatGPT as a one-to-one upgrade in which o4-mini replaced o3-mini in the model picker.^[4]^[5] OpenAI kept o3-mini available in the API to preserve stable behaviour for production users that had built on it.^[5] On June 20, 2025, GitHub announced the deprecation of o1, GPT-4.5, o3-mini, and GPT-4o on GitHub Models, with o3-mini scheduled for retirement on July 18, 2025; that timeline was later revised after customer pushback, and OpenAI continued listing o3-mini in its API catalogue through the second half of 2025.^[11]

What can OpenAI o3-mini do?

What does the reasoning_effort parameter do?

The most prominent design decision in o3-mini is the introduction of an explicit reasoning_effort parameter, accepting the string values low, medium, and high. The parameter trades latency for accuracy by allowing the model to allocate more or fewer hidden reasoning tokens before producing the final answer.^[2]^[7] OpenAI's launch post describes the choice as a way to "think harder when tackling complex challenges or prioritize speed when latency is a concern."^[1]^[24] Medium effort is the default in both ChatGPT and the API, while developers can override the setting per API call.^[2]^[7] At high effort, OpenAI reported that o3-mini exceeded o1 on the headline math, science, and coding benchmarks while remaining cheaper to run.^[2] At low effort it roughly matched o1-mini, and at medium effort it roughly matched the full o1.^[24]

The three effort tiers showed a roughly monotonic improvement on benchmark scores. On the 2024 American Invitational Mathematics Examination, OpenAI reported pass@1 figures of about 60% at low effort, 79% at medium, and 83% at high effort with no Python tool use; the high-effort number reached approximately 87% when the same evaluation was rerun with a separate sample of the questions.^[2] Codeforces ratings rose from roughly 1831 (low) and 1997 (medium) to 2073 (high), placing o3-mini-high above 99% of human competitors in OpenAI's evaluation harness.^[2]

In ChatGPT, OpenAI exposed only two tiers to consumers: regular o3-mini (medium effort) and a separately badged o3-mini-high variant that ran at high effort and consumed a smaller daily quota.^[2]

Does o3-mini support function calling and structured outputs?

Yes. Unlike its predecessor o1-mini, o3-mini natively supports the function-calling API that OpenAI shipped for GPT-4 in 2023, the Structured Outputs feature added for GPT-4o in August 2024, and the new developer role that OpenAI introduced for the o-series in late 2024.^[2]^[24] The launch post called these "highly requested developer features" and framed o3-mini as OpenAI's "first small reasoning model that supports" them, "making it production-ready out of the gate," in contrast to o1 and o1-mini, both of which had launched without function calling or strict JSON-schema output enforcement.^[2]^[24]

Structured Outputs in o3-mini lets developers pass a JSON Schema in the API request and receive a response whose structure is guaranteed to validate against the schema, removing the need for retry-on-parse-error loops common in LLM pipelines.^[2]^[24] Function calling allows the model to emit machine-readable tool invocations against a developer-supplied list of available functions, enabling tool use, agentic workflows, and retrieval-augmented generation.^[2] Developer messages occupy a system-message-like slot in the request payload but are designed for the o-series, where models trained to reason internally treat them as high-trust instructions distinct from end-user content.^[2]^[24]

Which modalities does o3-mini support?

At the January 31, 2025 launch, o3-mini accepted only text input and produced only text output, with no vision or audio support.^[2]^[7] OpenAI's launch documentation acknowledged that the model "does not support vision capabilities, so developers should continue using OpenAI o1 for visual reasoning tasks."^[2]^[24] Two weeks later, on February 12 to 13, 2025, image upload became available for o3-mini chats in the ChatGPT web app, with OpenAI confirming the rollout publicly the following day; the API still rejected image_url payloads against the o3-mini endpoint at that time.^[8]

Does o3-mini support web search?

OpenAI shipped a prototype web-search integration alongside o3-mini in ChatGPT. The model can fetch up-to-date information and surface citations to source pages, though OpenAI labelled the feature as an early prototype while it worked to integrate search across the reasoning-model family.^[2]^[24] Full agentic web-search and tool use, including Python execution and image generation from within the chain of thought, was reserved for the later o4-mini and o3 launches in April 2025.^[4]

What are the context window and output limits?

The o3-mini model has a 200,000-token context window with a maximum output of 100,000 tokens, including the hidden reasoning tokens consumed at higher effort settings.^[7] The knowledge cutoff is October 1, 2023, the same as o1.^[7]

How does OpenAI o3-mini perform on benchmarks?

OpenAI's launch post for o3-mini included evaluation results across mathematics, science, coding, and tool-use benchmarks. Unless stated otherwise the figures below are pass@1 with no external tools and follow OpenAI's reported numbers for the three effort levels.^[2]

Mathematics

On the 2024 American Invitational Mathematics Examination competition problems, OpenAI reported scores of approximately 60% at low effort, 79% at medium, and 83% at high effort.^[2] An additional evaluation that drew a separate set of AIME questions and used a different sampling configuration placed o3-mini-high at roughly 87.3%, matching the figure most often cited in coverage of the launch.^[12]

On the MATH dataset, o3-mini scored above 95% at high effort, slightly exceeding o1 and far above GPT-4o.^[2] On the FrontierMath research-math benchmark introduced by Epoch AI in late 2024, OpenAI reported that o3-mini at high effort with access to a Python tool solved over 32% of problems on the first attempt, including more than 28% of the "T3" hardest problems.^[2] For context, prior to the o-series no public model had exceeded roughly 2% on FrontierMath in its initial release.^[6]

Science

On GPQA Diamond, the 198-question subset of GPQA covering graduate-level physics, chemistry, and biology questions, OpenAI reported pass@1 scores of about 60% at low effort, 71% at medium, and 77% at high effort for o3-mini.^[2] The high-effort score of roughly 77% slightly exceeded o1's 75.7% on the same benchmark.^[2]

Coding

On the Codeforces competitive-programming Elo ladder, OpenAI's harness assigned ratings of approximately 1831, 1997, and 2073 to o3-mini at low, medium, and high effort respectively.^[2] The high-effort 2073 rating placed o3-mini above roughly 99% of human Codeforces competitors but well below the 2727 rating later reported for the full o3 model.^[6]

On SWE-bench Verified, the human-curated 500-problem subset of SWE-bench drawn from real GitHub pull requests, o3-mini scored 39.0% at low effort and 49.3% at high effort.^[2] The high-effort 49.3% figure compared favourably to o1's 48.9% on the same evaluation.^[2] On the LiveBench coding category, OpenAI reported that o3-mini outperformed o1 across effort tiers.^[13]

Knowledge and general capabilities

A notable trade-off in o3-mini was its weaker performance on broad knowledge benchmarks. On the MMLU multitask language understanding benchmark, o3-mini at high effort scored 86.9%, several points below o1's reported 92.3%, reflecting the model's relative narrowing of training toward STEM reasoning rather than encyclopedic recall.^[13]

Comparative scoreboard

Benchmark	o3-mini (high)	o1	DeepSeek-R1	Source
AIME 2024 (pass@1)	~83% (extended eval ~87%)	83.3%	79.8%	^[2]^[9]^[14]
GPQA Diamond	~77%	75.7%	71.5%	^[2]^[9]
MATH-500	~97%	~96%	97.3%	^[9]^[13]
Codeforces Elo	2073	1891	2029	^[2]^[9]
SWE-bench Verified	49.3%	48.9%	49.2%	^[2]^[9]
FrontierMath (with Python)	>32%	not reported	not reported	^[2]
MMLU	86.9%	92.3%	90.8%	^[13]^[9]

How does o3-mini compare on human preference and speed?

OpenAI also reported the results of an "external expert tester" study in which evaluators preferred o3-mini responses to o1-mini responses 56% of the time and judged that o3-mini reduced "major mistakes" on real-world hard questions by 39%.^[2]^[24] The same study found average response latency was 24% faster than o1-mini, at 7.7 seconds versus 10.16 seconds, with an average time-to-first-token about 2500 milliseconds faster than o1-mini.^[2]^[24]

How was OpenAI o3-mini trained?

OpenAI has not published a detailed architectural report for o3-mini. The o3-mini system card describes the model as a member of the o-series trained with large-scale reinforcement learning to produce long internal chain-of-thought traces before responding.^[15] The system card states that o3-mini is "the latest model in our reasoning series" and that its training continues the recipe introduced with o1, in which reinforcement learning teaches the model to reason and refine its strategies during inference rather than at training time.^[15] Like the rest of the o-series, o3-mini consumes a configurable number of hidden reasoning tokens during inference, and OpenAI bills those reasoning tokens at the output rate even though they are not visible in the final response.^[7]

Beyond reasoning training, OpenAI applied a safety-training procedure called deliberative alignment to o3-mini. Introduced in a paper published alongside the December 2024 o3 preview, deliberative alignment teaches a model to explicitly reason through OpenAI's safety policy text before producing an answer, rather than relying only on examples of compliant and refusal behaviour.^[16] OpenAI's deliberative-alignment paper lists o1-preview, o1, and o3-mini as the production models trained with the technique, and the o3-mini system card credits deliberative alignment for improvements in jailbreak resistance and policy compliance.^[16]^[15]

How safe is OpenAI o3-mini?

The o3-mini system card, dated January 31, 2025 and revised on February 10, 2025, evaluates o3-mini against OpenAI's Preparedness Framework in four risk categories: Cybersecurity, CBRN (chemical, biological, radiological, and nuclear uplift), Persuasion, and Model Autonomy.^[15] OpenAI's internal Safety Advisory Group assigned the pre-mitigation model an overall rating of Medium, with Medium ratings in Persuasion, CBRN, and Model Autonomy and a Low rating in Cybersecurity.^[15] The system card highlights that o3-mini is the first OpenAI model to reach Medium risk in Model Autonomy, a classification driven by improvements in software-engineering performance on benchmarks such as SWE-bench Verified.^[15]

On jailbreak evaluations, the system card reports that o3-mini performs comparably to or better than o1 on suites including StrongREJECT and on internal disallowed-content suites, and that it shows large gains over GPT-4o on adversarial prompts.^[15] OpenAI attributes those gains primarily to deliberative alignment.^[16] Within days of public release, a CyberArk researcher reported coaxing o3-mini into producing exploitation guidance for the Windows LSASS process, illustrating that the model's safety improvements did not eliminate residual vulnerabilities.^[17]

How much does OpenAI o3-mini cost, and who could use it?

What was the o3-mini API price?

At launch on January 31, 2025, OpenAI listed o3-mini at $1.10 per million input tokens, $0.55 per million cached input tokens, and $4.40 per million output tokens.^[2]^[7] The Batch API offered an additional 50% discount on the same rates.^[13] Compared to o1-mini's $3 per million input and $12 per million output tokens, the o3-mini launch price represented a roughly 63% reduction in cost per input token.^[2]^[24] Compared to the full o1 at $15 per million input and $60 per million output tokens, o3-mini's input pricing was roughly 93% lower.^[18]

The headline output price of $4.40 per million tokens matched the price of DeepSeek-R1 on third-party hosts, although DeepSeek's official API at the time offered the same model at lower rates (approximately $0.55 input and $2.19 output per million tokens), and DeepSeek-R1's open-weight licence allowed self-hosting at marginal cost.^[9] OpenAI's pricing decision was widely interpreted in trade press as a competitive response to DeepSeek's January 20 release.^[3]^[19]

Which API tiers could access o3-mini?

API access at launch was restricted to developers in OpenAI's usage tiers 3 through 5, which require historical spend thresholds.^[13] Standard, real-time, batch, and Assistants API surfaces all supported the model.^[2]^[7] Streaming was supported, as were function calling and Structured Outputs.^[2]^[7] Image inputs were unsupported at launch.^[2]

Was o3-mini available on the free ChatGPT tier?

Yes. In ChatGPT, o3-mini was available to Plus, Team, and Pro subscribers from launch day, with Enterprise and Education access following within a week.^[2] Plus and Team subscribers received 150 messages per day, three times the previous o1-mini limit of 50.^[2] Pro subscribers received unlimited access to both o3-mini and o3-mini-high.^[2] Free ChatGPT users could invoke the model by selecting "Reason" in the message composer or by regenerating a response, with a smaller daily quota; MIT Technology Review noted at the time that the launch "will mark the first time that the vast majority of people will have access to one of OpenAI's reasoning models."^[2]^[25] The o3-mini-high variant, exposed in the ChatGPT model picker for Plus subscribers, allowed a small number of high-effort messages per day; OpenAI later doubled the Plus quota for o3-mini-high from an initial 25 to 50 messages per day.^[10]

How was OpenAI o3-mini received?

Trade press coverage of the January 31 launch concentrated on three themes: cost, the competitive context with DeepSeek-R1, and OpenAI's expansion of reasoning-model access to free ChatGPT users. TechCrunch's launch report described o3-mini as OpenAI's "latest reasoning model" and emphasised that it was OpenAI's first reasoning model to support function calling and Structured Outputs, calling it "production-ready out of the gate."^[3] VentureBeat framed the release as "OpenAI's o3-mini advanced reasoning model arrives to counter DeepSeek's rise."^[19] MacRumors's headline characterised the model as "rivaling DeepSeek," a framing some Hacker News commenters disputed, pointing out that DeepSeek-R1 remained roughly an order of magnitude cheaper at the API level and was distributed under an open licence.^[20]

The DeepLearning.AI Batch newsletter noted that o3-mini "trades broad knowledge for domain-specific reasoning strength," pointing to the lower MMLU score relative to o1 as evidence of the trade-off.^[13] The newsletter also highlighted that o3-mini's design made it possible to dial reasoning effort up or down via a single API parameter, a feature most other lab models did not expose.^[13]

Independent evaluators on the Artificial Analysis benchmark suite recorded o3-mini at output speeds around 168 tokens per second, well above the median of comparable reasoning models, but with a time-to-first-token of about 6.84 seconds, reflecting the latency tax of internal reasoning before the first visible token.^[21]

Within a week of launch, security researchers had reported jailbreaks of the model. A CyberArk principal vulnerability researcher demonstrated obtaining instructions for an exploit of the Windows LSASS process, an episode that received coverage in Dark Reading and prompted further discussion of deliberative alignment's robustness.^[17]

How does OpenAI o3-mini compare to other reasoning models?

How does o3-mini differ from OpenAI o1 and o1-mini?

OpenAI o1 launched in September 2024 and was the first publicly released model in the o-series.^[22] o3-mini at medium effort approximately matched o1's headline performance on AIME 2024, GPQA Diamond, and SWE-bench Verified at substantially lower cost and latency, while o3-mini at high effort modestly exceeded o1 on most reasoning benchmarks at the cost of additional thinking time.^[2] o3-mini's main advantages over o1-mini were function calling, Structured Outputs, developer messages, higher math and coding benchmark scores, and lower price.^[2] Its main disadvantage relative to both o1 and o1-mini at launch was the absence of vision input, which OpenAI explicitly recommended developers use o1 for instead.^[2]^[24]

How does o3-mini differ from o4-mini?

OpenAI o4-mini launched on April 16, 2025, and superseded o3-mini in the ChatGPT model picker. OpenAI characterised o4-mini as a direct, one-to-one upgrade in ChatGPT and added image-reasoning ability, fuller tool use, and agentic web search.^[4]^[5] On AIME 2025 OpenAI reported o4-mini scoring 93.4% versus a comparable o3-mini score in the high-80s, and on Codeforces o4-mini reached approximately 2719 Elo versus o3-mini-high's 2073.^[4]^[14] The o4-mini API price at launch matched o3-mini's $1.10 input and $4.40 output per million tokens.^[4]

How does o3-mini differ from DeepSeek-R1?

DeepSeek-R1, released by DeepSeek-AI on January 20, 2025, was the immediate competitive backdrop for o3-mini's launch.^[9] On AIME 2024, DeepSeek-R1 scored 79.8% versus o3-mini-high's ~83%; on Codeforces, DeepSeek-R1's reported Elo of 2029 was close to o3-mini-high's 2073; on MATH-500 DeepSeek-R1 scored 97.3% versus o3-mini's roughly 97%.^[9] DeepSeek-R1 came with open weights under the MIT licence and a hosted API priced near $0.55 input and $2.19 output per million tokens, making it materially cheaper than o3-mini on hosted inference.^[9] Coverage of the launches typically framed o3-mini as faster, more reliable, and stronger at coding and tool use, while DeepSeek-R1 was framed as cheaper, open, and stronger at pure mathematics.^[14]

How does o3-mini compare to Claude 3.5 Sonnet and Claude 3.7 Sonnet?

Claude 3.5 Sonnet from Anthropic, updated in October 2024, was the dominant commercial coding model at the time of o3-mini's launch. On SWE-bench Verified, Claude 3.5 Sonnet's reported score of about 49.0% was within a percentage point of o3-mini's 49.3%, although Sonnet did not perform explicit reasoning traces in the same style as the o-series.^[2] Claude 3.7 Sonnet, released by Anthropic on February 24, 2025, less than four weeks after o3-mini, introduced an extended-thinking mode that brought reasoning-style behaviour to the Claude family and shifted the competitive landscape again.

Where does o3-mini sit in the OpenAI model lineup?

o3-mini sat in the middle of OpenAI's 2025 reasoning lineup. Above it was the full o3 model, which OpenAI had previewed in December 2024 but did not release in general availability until April 16, 2025.^[4] Below it, on a feature level, was o1-mini, which o3-mini replaced in the ChatGPT model picker on the launch day.^[2] The full o1 remained available for visual reasoning tasks until o4-mini and o3 added image input in the April 16 release.^[4]

OpenAI's broader 2025 roadmap moved on quickly. The full o4-mini and o3 launched on April 16, 2025, and OpenAI's o-series page identifies them as the new generation.^[4] Later in the year the company released GPT-5 in August 2025, with subsequent point updates (GPT-5, GPT-5.1, GPT-5.2, and so on), each consolidating reasoning behaviour into the main chat model and reducing the visibility of the separate o-series brand.^[23]

API model identifiers and configuration

OpenAI exposes o3-mini in the API under two identifiers: o3-mini (an alias that may be updated) and the dated snapshot o3-mini-2025-01-31.^[7] Requests accept a reasoning_effort parameter with values low, medium (default), or high; a developer role on the messages payload; a response_format with type: "json_schema" for Structured Outputs; and a tools array for function calling.^[2]^[7]

The model card on the OpenAI developer documentation lists a context window of 200,000 tokens, maximum output tokens of 100,000, and a knowledge cutoff of October 1, 2023.^[7] Reasoning tokens are billed at the output rate and counted against the maximum-output budget.^[7]

When was OpenAI o3-mini deprecated?

OpenAI did not deprecate o3-mini at the time of the o4-mini release on April 16, 2025; the company said in customer communications that o3-mini would remain in the API to preserve stable behaviour for production users that had built atop it.^[5] On June 20, 2025, GitHub announced a coordinated deprecation of o1, GPT-4.5, o3-mini, and GPT-4o on GitHub Models, with o3-mini scheduled for retirement on July 18, 2025.^[11] That timeline was subsequently revised after customer feedback, and OpenAI continued to list o3-mini in its API model catalogue through the second half of 2025 even as o4-mini became the default reasoning-mini model in ChatGPT, with the o3-mini-2025-01-31 snapshot later marked deprecated in the API documentation.^[7]^[11]^[5]

References

OpenAI, "OpenAI o3-mini", openai.com, 2025-01-31. https://openai.com/index/openai-o3-mini/ Accessed 2026-05-25. ↩
Kyle Wiggers, "OpenAI launches o3-mini, its latest 'reasoning' model", TechCrunch, 2025-01-31. https://techcrunch.com/2025/01/31/openai-launches-o3-mini-its-latest-reasoning-model/ Accessed 2026-05-25. ↩
Maxwell Zeff, "OpenAI launches o3-mini, its latest 'reasoning' model", TechCrunch, 2025-01-31. https://techcrunch.com/2025/01/31/openai-launches-o3-mini-its-latest-reasoning-model/ Accessed 2026-05-25. ↩
OpenAI, "Introducing OpenAI o3 and o4-mini", openai.com, 2025-04-16. https://openai.com/index/introducing-o3-and-o4-mini/ Accessed 2026-05-25. ↩
Hacker News thread quoting an OpenAI staff post, "In ChatGPT, o4-mini is replacing o3-mini. It's a straight 1-to-1 upgrade", news.ycombinator.com, 2025-04-16. https://news.ycombinator.com/item?id=43708895 Accessed 2026-05-25. ↩
Kyle Wiggers, "OpenAI announces new o3 model", TechCrunch, 2024-12-20. https://techcrunch.com/2024/12/20/openai-announces-new-o3-model/ Accessed 2026-05-25. ↩
OpenAI, "o3-mini Model", developers.openai.com (API documentation), 2025. https://developers.openai.com/api/docs/models/o3-mini Accessed 2026-06-23. ↩
James Gallagher, "OpenAI o3-mini: Vision and Multimodal Features", Roboflow Blog, 2025-02-13. https://blog.roboflow.com/o3-mini-multimodal/ Accessed 2026-05-25. ↩
DeepSeek-AI et al., "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning", DeepSeek-AI technical report and arXiv, 2025-01-22. https://arxiv.org/abs/2501.12948 Accessed 2026-05-25. ↩
Seifeur Guizeni, "OpenAI o3-mini High Limits Offer Enhanced Performance and Developer Access", seifeur.com, 2025-02. https://seifeur.com/openai-o3-mini-high-limits/ Accessed 2026-05-25. ↩
GitHub Changelog, "Upcoming deprecation of o1, GPT-4.5, o3-mini, and GPT-4o", github.blog, 2025-06-20. https://github.blog/changelog/2025-06-20-upcoming-deprecation-of-o1-gpt-4-5-o3-mini-and-gpt-4o/ Accessed 2026-05-25. ↩
Wikipedia contributors, "OpenAI o3", en.wikipedia.org, 2025-2026 (drawing on the OpenAI o3-mini launch post for the cited AIME high-effort figure). https://en.wikipedia.org/wiki/OpenAI_o3 Accessed 2026-05-25. ↩
DeepLearning.AI, "The Batch, Issue 287: o3-mini Puts Reasoning in High Gear", deeplearning.ai, 2025-02. https://www.deeplearning.ai/the-batch/issue-287 Accessed 2026-05-25. ↩
Mehul Gupta, "OpenAI o3-mini vs DeepSeek-R1", Data Science in Your Pocket / Medium, 2025-01-31. https://medium.com/data-science-in-your-pocket/openai-o3-mini-vs-deepseek-r1-23326fa36e4b Accessed 2026-05-25. ↩
OpenAI, "OpenAI o3-mini System Card", cdn.openai.com, 2025-01-31 (revised 2025-02-10). https://cdn.openai.com/o3-mini-system-card-feb10.pdf Accessed 2026-05-25. ↩
Melody Y. Guan et al., "Deliberative Alignment: Reasoning Enables Safer Language Models", arXiv:2412.16339 and openai.com, 2024-12-22. https://arxiv.org/abs/2412.16339 Accessed 2026-05-25. ↩
Nate Nelson, "Researcher Outsmarts, Jailbreaks OpenAI's New o3-mini", Dark Reading, 2025-02. https://www.darkreading.com/application-security/researcher-jailbreaks-openai-o3-mini Accessed 2026-05-25. ↩
OpenAI, "OpenAI API Pricing", openai.com/api/pricing/, 2024-2025. https://openai.com/api/pricing/ Accessed 2026-05-25. ↩
Carl Franzen, "It's here: OpenAI's o3-mini advanced reasoning model arrives to counter DeepSeek's rise", VentureBeat, 2025-01-31. https://venturebeat.com/ai/its-here-openais-o3-mini-advanced-reasoning-model-arrives-to-counter-deepseeks-rise Accessed 2026-05-25. ↩
Joe Rossignol, "OpenAI Launches o3-mini, a Cost-Efficient Reasoning Model Rivaling DeepSeek", MacRumors, 2025-01-31. https://www.macrumors.com/2025/01/31/openai-o3-mini-launch/ Accessed 2026-05-25. ↩
Artificial Analysis, "o3-mini: Intelligence, Performance & Price Analysis", artificialanalysis.ai, 2025. https://artificialanalysis.ai/models/o3-mini Accessed 2026-05-25. ↩
OpenAI, "Learning to reason with LLMs", openai.com, 2024-09-12. https://openai.com/index/learning-to-reason-with-llms/ Accessed 2026-05-25. ↩
TechTarget WhatIs, "OpenAI o3 and o4 explained: Everything you need to know", techtarget.com, 2025-2026. https://www.techtarget.com/whatis/feature/OpenAI-o3-explained-Everything-you-need-to-know Accessed 2026-05-25. ↩
OpenAI, "OpenAI o3-mini" (official launch announcement, capabilities, developer features, and pricing), openai.com, 2025-01-31. https://openai.com/index/openai-o3-mini/ Accessed 2026-06-23. ↩
Will Douglas Heaven, "OpenAI releases its new o3-mini reasoning model for free", MIT Technology Review, 2025-01-31. https://www.technologyreview.com/2025/01/31/1110757/openai-makes-its-reasoning-model-for-free/ Accessed 2026-06-23. ↩

Improve this article

Add missing citations, update stale details, or suggest a clearer explanation. Every suggestion is reviewed for sourcing before it goes live.

2 revisions by 1 contributors · full history

Suggest edit

What links here

Aider Polyglot BIG-Bench Extra Hard GPT Model Timeline (GPT-1 to GPT-5.x)GSO OpenAI o1 OpenAI o1-mini PaperBench Phi-4 Reasoning Q* OpenAI Sleep-time compute Structured output SuperGPQA

Infobox

When was OpenAI o3-mini announced and released?

What can OpenAI o3-mini do?

What does the reasoning_effort parameter do?

Does o3-mini support function calling and structured outputs?

Which modalities does o3-mini support?

Does o3-mini support web search?

What are the context window and output limits?

How does OpenAI o3-mini perform on benchmarks?

Mathematics

Science

Coding

Knowledge and general capabilities

Comparative scoreboard

How does o3-mini compare on human preference and speed?

How was OpenAI o3-mini trained?

How safe is OpenAI o3-mini?

How much does OpenAI o3-mini cost, and who could use it?

What was the o3-mini API price?

Which API tiers could access o3-mini?

Was o3-mini available on the free ChatGPT tier?

How was OpenAI o3-mini received?

How does OpenAI o3-mini compare to other reasoning models?

How does o3-mini differ from OpenAI o1 and o1-mini?

How does o3-mini differ from o4-mini?

How does o3-mini differ from DeepSeek-R1?

How does o3-mini compare to Claude 3.5 Sonnet and Claude 3.7 Sonnet?

Where does o3-mini sit in the OpenAI model lineup?

API model identifiers and configuration

When was OpenAI o3-mini deprecated?

See also

References

Improve this article

Related Articles

OpenAI o1

OpenAI o3

OpenAI o-series

o4-mini

OpenAI o3-pro

GPT-5 Pro

What links here

Related Articles

OpenAI o1

OpenAI o3

OpenAI o-series

o4-mini

OpenAI o3-pro

GPT-5 Pro

What links here