GPT-4.1 mini

GPT-4.1 mini is a large language model developed by OpenAI and released on April 14, 2025 as the mid-size member of the GPT-4.1 family. Positioned between the full GPT-4.1 model and the smaller GPT-4.1 nano, it was designed to deliver intelligence competitive with GPT-4o while substantially reducing latency and cost. OpenAI reported that GPT-4.1 mini matches or exceeds GPT-4o on many benchmark evaluations while cutting latency by roughly half and lowering cost by about 83 percent. ^[1]^[2]

Overview

GPT-4.1 mini was introduced alongside GPT-4.1 and GPT-4.1 nano on April 14, 2025, in a launch that was initially available only through the OpenAI API rather than in ChatGPT. ^[1]^[3] All three models share a context window of up to roughly 1,000,000 tokens and a knowledge cutoff of June 2024, and the family emphasizes improvements in coding, instruction following, and long-context comprehension over GPT-4o. ^[1]^[2]

The "mini" tier is intended for developers who need most of GPT-4.1's capability at a lower price point and faster response time. OpenAI described GPT-4.1 mini as a model that "delivers performance competitive with GPT-4o at substantially lower latency and cost," making it suitable for high-volume production workloads where cost and speed matter as much as raw capability. ^[1]^[4] It supports both text and image inputs and produces text output; audio and video inputs are not supported. ^[5]

The release reflected a broader pattern in OpenAI's product strategy of offering tiered model families, in which a flagship model is accompanied by progressively smaller and cheaper variants tuned for different workloads. Within that structure GPT-4.1 mini occupies the middle, trading a modest amount of capability relative to the full GPT-4.1 model for roughly a fifth of its input price, while remaining markedly more capable than the entry-level nano variant. ^[1]

The GPT-4.1 family

The GPT-4.1 family consists of three models released together on April 14, 2025:

GPT-4.1, the flagship model, aimed at the most demanding coding and reasoning tasks.
GPT-4.1 mini, the mid-size model balancing capability, latency, and cost.
GPT-4.1 nano, the smallest and fastest model, intended for lightweight tasks such as classification and autocompletion.

All three were API-only at launch. OpenAI positioned the family as a successor to GPT-4o for developers, citing major gains on the SWE-bench Verified coding benchmark and on Scale AI's MultiChallenge instruction-following benchmark, along with improved ability to use very long contexts. ^[1] At the same livestream, OpenAI announced that the GPT-4.5 Preview model (sometimes referred to by the codename Orion) would be deprecated and turned off in the API on July 14, 2025, giving developers three months to migrate. ^[1]^[6] OpenAI noted that GPT-4.1 offered "improved or similar performance" to GPT-4.5 on many key capabilities at much lower cost and latency. ^[1]

Although the family launched in the API only, OpenAI later brought GPT-4.1 to ChatGPT. On May 14, 2025, GPT-4.1 became available to ChatGPT Plus, Pro, and Team subscribers, while GPT-4.1 mini replaced GPT-4o mini as the lightweight fallback model for all ChatGPT users, including those on the free tier. ^[3]

Capabilities

GPT-4.1 mini was built to retain the core strengths of the GPT-4.1 family, namely coding, instruction following, and long-context handling, while operating faster and more cheaply than GPT-4o. OpenAI stated that the model "matches or exceeds GPT-4o" on many intelligence evaluations despite its smaller size and lower price. ^[1]^[2]

Key capability claims from OpenAI include:

Latency: GPT-4.1 mini reduces response latency by approximately 50 percent compared with GPT-4o. ^[2]^[4]
Cost: It lowers cost by about 83 percent relative to GPT-4o. ^[1]^[2]
Long context: Like the rest of the family, it accepts inputs of up to roughly 1,000,000 tokens, with improvements in retrieving and reasoning over information spread across very long documents. ^[1]
Multimodal input: It can process images as well as text, and performs strongly on visual reasoning benchmarks. ^[1]^[5]

OpenAI noted that GPT-4.1 mini is also more capable than GPT-4o mini, although that gain comes with a small increase in latency relative to the older, smaller model. ^[4] On some image and reasoning benchmarks, GPT-4.1 mini performs close to the full GPT-4.1 model. ^[4]

The family as a whole was tuned to follow instructions more literally and reliably than GPT-4o, which OpenAI cited as one of its most-requested developer improvements. GPT-4.1 and its smaller siblings were trained to adhere more closely to formatting requirements, ordering of steps, and explicit constraints in prompts, behavior that benefits agentic and tool-using applications. These instruction-following gains carry over to GPT-4.1 mini, which scored 84.1 percent on the public IFEval benchmark and 35.8 percent on Scale AI's MultiChallenge measure of multi-turn instruction adherence. ^[1]^[7]

OpenAI also documented a known limitation that applies across the family: reasoning accuracy degrades as the input grows toward the maximum context length. In OpenAI's own long-context evaluation, accuracy fell from roughly 84 percent at 8,000 tokens to about 50 percent at 1,000,000 tokens, indicating that the headline million-token window does not guarantee uniform performance across an entire maximally sized input. ^[3]

Benchmarks

OpenAI published benchmark results for the GPT-4.1 family at launch. For GPT-4.1 mini specifically, reported scores include the following. ^[1]^[7]

Benchmark	What it measures	GPT-4.1 mini
MathVista	Visual mathematical reasoning	73.1%
MultiChallenge (Scale AI)	Multi-turn instruction following	35.8%
IFEval	Instruction following	84.1%
Hard instruction-following eval (OpenAI internal)	Difficult instruction adherence	45.1%
Aider polyglot (diff format)	Code editing across languages	31.6%

On MathVista, GPT-4.1 mini slightly outscored the full GPT-4.1 model, illustrating how close the mid-size model can come to the flagship on certain tasks. ^[4] OpenAI emphasized that, taken together, these results show GPT-4.1 mini meeting or beating GPT-4o across a broad set of intelligence evaluations. ^[1]^[2]

It is important not to confuse GPT-4.1 mini's scores with those of GPT-4.1 nano. The widely cited figures of 80.1 percent on MMLU and 50.3 percent on GPQA belong to GPT-4.1 nano, the smallest model in the family, not to GPT-4.1 mini. ^[1]

Pricing and context window

GPT-4.1 mini uses standard per-token pricing, with a discounted rate for cached input tokens. OpenAI listed the following prices at launch. ^[5]^[7]

Model	Input (per 1M tokens)	Cached input (per 1M tokens)	Output (per 1M tokens)	Context window
GPT-4.1	$2.00	$0.50	$8.00	~1,000,000
GPT-4.1 mini	$0.40	$0.10	$1.60	~1,000,000
GPT-4.1 nano	$0.10	$0.025	$0.40	~1,000,000
GPT-4o	$2.50	$1.25	$10.00	128,000
GPT-4o mini	$0.15	$0.075	$0.60	128,000

The roughly 83 percent cost reduction OpenAI cited for GPT-4.1 mini reflects the gap between its $0.40 input and $1.60 output prices and GPT-4o's $2.50 input and $10.00 output prices. ^[1]^[2]^[8] The exact context window for GPT-4.1 mini is 1,047,576 tokens, with a maximum output of 32,768 tokens, and its knowledge cutoff is June 2024. ^[5]

Availability

At launch on April 14, 2025, GPT-4.1 mini was available exclusively through the OpenAI API, exposed under the model identifiers gpt-4.1-mini and the dated snapshot gpt-4.1-mini-2025-04-14. ^[3]^[5] It was not initially offered in ChatGPT, where OpenAI said improvements were being folded into GPT-4o on a separate track. ^[1]

On May 14, 2025, OpenAI added GPT-4.1 to ChatGPT for paying subscribers, and GPT-4.1 mini became the lightweight model available to all ChatGPT users, taking the place previously held by GPT-4o mini. ^[3] The GPT-4.1 mini model is also accessible through Microsoft's Azure OpenAI Service and third-party API aggregators that mirror OpenAI's published pricing and benchmarks. ^[4]

References

OpenAI, "Introducing GPT-4.1 in the API," April 14, 2025. https://openai.com/index/gpt-4-1/
TechTarget, "GPT-4.1 explained: Everything you need to know." https://www.techtarget.com/whatis/feature/GPT-41-explained-Everything-you-need-to-know
TechCrunch, "OpenAI's new GPT-4.1 models focus on coding," April 14, 2025. https://techcrunch.com/2025/04/14/openais-new-gpt-4-1-models-focus-on-coding/
DataCamp, "GPT-4.1: Features, Access, GPT-4o Comparison, and More." https://www.datacamp.com/blog/gpt-4-1
OpenAI, "GPT-4.1 mini Model" (API documentation). https://developers.openai.com/api/docs/models/gpt-4.1-mini
InfoQ, "OpenAI Introduces GPT-4.1 Family with Enhanced Performance and Long-Context Support," May 2025. https://www.infoq.com/news/2025/05/openai-gpt-4-1/
OpenRouter, "GPT-4.1 Mini - API Pricing & Benchmarks." https://openrouter.ai/openai/gpt-4.1-mini
OpenAI, "API Pricing." https://openai.com/api/pricing/

GPT-4.1 mini

Overview

The GPT-4.1 family

Capabilities

Benchmarks

Pricing and context window

Availability

References

Improve this article

What links here

Overview

The GPT-4.1 family

Capabilities

Benchmarks

Pricing and context window

Availability

References

What links here