# GPT-4.1 mini

> Source: https://aiwiki.ai/wiki/gpt_4_1_mini
> Updated: 2026-06-03
> Categories: AI Models, Large Language Models, OpenAI
> From AI Wiki (https://aiwiki.ai), a free encyclopedia of artificial intelligence. Quote with attribution.

**GPT-4.1 mini** is a large language model developed by [OpenAI](/wiki/openai) and released on April 14, 2025 as the mid-size member of the [GPT-4.1](/wiki/gpt-4.1) family. Positioned between the full GPT-4.1 model and the smaller [GPT-4.1 nano](/wiki/gpt_4_1_nano), it was designed to deliver intelligence competitive with [GPT-4o](/wiki/gpt_4o) while substantially reducing latency and cost. OpenAI reported that GPT-4.1 mini matches or exceeds GPT-4o on many benchmark evaluations while cutting latency by roughly half and lowering cost by about 83 percent. [1][2]

## Overview

GPT-4.1 mini was introduced alongside GPT-4.1 and GPT-4.1 nano on April 14, 2025, in a launch that was initially available only through the [OpenAI API](/wiki/openai_api) rather than in ChatGPT. [1][3] All three models share a context window of up to roughly 1,000,000 tokens and a knowledge cutoff of June 2024, and the family emphasizes improvements in coding, instruction following, and long-context comprehension over GPT-4o. [1][2]

The "mini" tier is intended for developers who need most of GPT-4.1's capability at a lower price point and faster response time. OpenAI described GPT-4.1 mini as a model that "delivers performance competitive with GPT-4o at substantially lower latency and cost," making it suitable for high-volume production workloads where cost and speed matter as much as raw capability. [1][4] It supports both text and image inputs and produces text output; audio and video inputs are not supported. [5]

The release reflected a broader pattern in OpenAI's product strategy of offering tiered model families, in which a flagship model is accompanied by progressively smaller and cheaper variants tuned for different workloads. Within that structure GPT-4.1 mini occupies the middle, trading a modest amount of capability relative to the full GPT-4.1 model for roughly a fifth of its input price, while remaining markedly more capable than the entry-level nano variant. [1]

## The GPT-4.1 family

The GPT-4.1 family consists of three models released together on April 14, 2025:

- **GPT-4.1**, the flagship model, aimed at the most demanding coding and reasoning tasks.
- **GPT-4.1 mini**, the mid-size model balancing capability, latency, and cost.
- **GPT-4.1 nano**, the smallest and fastest model, intended for lightweight tasks such as classification and autocompletion.

All three were API-only at launch. OpenAI positioned the family as a successor to GPT-4o for developers, citing major gains on the SWE-bench Verified coding benchmark and on Scale AI's MultiChallenge instruction-following benchmark, along with improved ability to use very long contexts. [1] At the same livestream, OpenAI announced that the GPT-4.5 Preview model (sometimes referred to by the codename Orion) would be deprecated and turned off in the API on July 14, 2025, giving developers three months to migrate. [1][6] OpenAI noted that GPT-4.1 offered "improved or similar performance" to GPT-4.5 on many key capabilities at much lower cost and latency. [1]

Although the family launched in the API only, OpenAI later brought GPT-4.1 to ChatGPT. On May 14, 2025, GPT-4.1 became available to ChatGPT Plus, Pro, and Team subscribers, while GPT-4.1 mini replaced [GPT-4o mini](/wiki/gpt_4o) as the lightweight fallback model for all ChatGPT users, including those on the free tier. [3]

## Capabilities

GPT-4.1 mini was built to retain the core strengths of the GPT-4.1 family, namely coding, instruction following, and long-context handling, while operating faster and more cheaply than GPT-4o. OpenAI stated that the model "matches or exceeds GPT-4o" on many intelligence evaluations despite its smaller size and lower price. [1][2]

Key capability claims from OpenAI include:

- **Latency**: GPT-4.1 mini reduces response latency by approximately 50 percent compared with GPT-4o. [2][4]
- **Cost**: It lowers cost by about 83 percent relative to GPT-4o. [1][2]
- **Long context**: Like the rest of the family, it accepts inputs of up to roughly 1,000,000 tokens, with improvements in retrieving and reasoning over information spread across very long documents. [1]
- **Multimodal input**: It can process images as well as text, and performs strongly on visual reasoning benchmarks. [1][5]

OpenAI noted that GPT-4.1 mini is also more capable than GPT-4o mini, although that gain comes with a small increase in latency relative to the older, smaller model. [4] On some image and reasoning benchmarks, GPT-4.1 mini performs close to the full GPT-4.1 model. [4]

The family as a whole was tuned to follow instructions more literally and reliably than GPT-4o, which OpenAI cited as one of its most-requested developer improvements. GPT-4.1 and its smaller siblings were trained to adhere more closely to formatting requirements, ordering of steps, and explicit constraints in prompts, behavior that benefits agentic and tool-using applications. These instruction-following gains carry over to GPT-4.1 mini, which scored 84.1 percent on the public IFEval benchmark and 35.8 percent on Scale AI's MultiChallenge measure of multi-turn instruction adherence. [1][7]

OpenAI also documented a known limitation that applies across the family: reasoning accuracy degrades as the input grows toward the maximum context length. In OpenAI's own long-context evaluation, accuracy fell from roughly 84 percent at 8,000 tokens to about 50 percent at 1,000,000 tokens, indicating that the headline million-token window does not guarantee uniform performance across an entire maximally sized input. [3]

## Benchmarks

OpenAI published benchmark results for the GPT-4.1 family at launch. For GPT-4.1 mini specifically, reported scores include the following. [1][7]

| Benchmark | What it measures | GPT-4.1 mini |
|---|---|---|
| MathVista | Visual mathematical reasoning | 73.1% |
| MultiChallenge (Scale AI) | Multi-turn instruction following | 35.8% |
| IFEval | Instruction following | 84.1% |
| Hard instruction-following eval (OpenAI internal) | Difficult instruction adherence | 45.1% |
| Aider polyglot (diff format) | Code editing across languages | 31.6% |

On MathVista, GPT-4.1 mini slightly outscored the full GPT-4.1 model, illustrating how close the mid-size model can come to the flagship on certain tasks. [4] OpenAI emphasized that, taken together, these results show GPT-4.1 mini meeting or beating GPT-4o across a broad set of intelligence evaluations. [1][2]

It is important not to confuse GPT-4.1 mini's scores with those of GPT-4.1 nano. The widely cited figures of 80.1 percent on MMLU and 50.3 percent on GPQA belong to GPT-4.1 nano, the smallest model in the family, not to GPT-4.1 mini. [1]

## Pricing and context window

GPT-4.1 mini uses standard per-token pricing, with a discounted rate for cached input tokens. OpenAI listed the following prices at launch. [5][7]

| Model | Input (per 1M tokens) | Cached input (per 1M tokens) | Output (per 1M tokens) | Context window |
|---|---|---|---|---|
| GPT-4.1 | $2.00 | $0.50 | $8.00 | ~1,000,000 |
| **GPT-4.1 mini** | **$0.40** | **$0.10** | **$1.60** | **~1,000,000** |
| GPT-4.1 nano | $0.10 | $0.025 | $0.40 | ~1,000,000 |
| GPT-4o | $2.50 | $1.25 | $10.00 | 128,000 |
| GPT-4o mini | $0.15 | $0.075 | $0.60 | 128,000 |

The roughly 83 percent cost reduction OpenAI cited for GPT-4.1 mini reflects the gap between its $0.40 input and $1.60 output prices and GPT-4o's $2.50 input and $10.00 output prices. [1][2][8] The exact context window for GPT-4.1 mini is 1,047,576 tokens, with a maximum output of 32,768 tokens, and its knowledge cutoff is June 2024. [5]

## Availability

At launch on April 14, 2025, GPT-4.1 mini was available exclusively through the OpenAI API, exposed under the model identifiers `gpt-4.1-mini` and the dated snapshot `gpt-4.1-mini-2025-04-14`. [3][5] It was not initially offered in ChatGPT, where OpenAI said improvements were being folded into GPT-4o on a separate track. [1]

On May 14, 2025, OpenAI added GPT-4.1 to ChatGPT for paying subscribers, and GPT-4.1 mini became the lightweight model available to all ChatGPT users, taking the place previously held by GPT-4o mini. [3] The GPT-4.1 mini model is also accessible through Microsoft's Azure OpenAI Service and third-party API aggregators that mirror OpenAI's published pricing and benchmarks. [4]

## References

[1] OpenAI, "Introducing GPT-4.1 in the API," April 14, 2025. https://openai.com/index/gpt-4-1/

[2] TechTarget, "GPT-4.1 explained: Everything you need to know." https://www.techtarget.com/whatis/feature/GPT-41-explained-Everything-you-need-to-know

[3] TechCrunch, "OpenAI's new GPT-4.1 models focus on coding," April 14, 2025. https://techcrunch.com/2025/04/14/openais-new-gpt-4-1-models-focus-on-coding/

[4] DataCamp, "GPT-4.1: Features, Access, GPT-4o Comparison, and More." https://www.datacamp.com/blog/gpt-4-1

[5] OpenAI, "GPT-4.1 mini Model" (API documentation). https://developers.openai.com/api/docs/models/gpt-4.1-mini

[6] InfoQ, "OpenAI Introduces GPT-4.1 Family with Enhanced Performance and Long-Context Support," May 2025. https://www.infoq.com/news/2025/05/openai-gpt-4-1/

[7] OpenRouter, "GPT-4.1 Mini - API Pricing & Benchmarks." https://openrouter.ai/openai/gpt-4.1-mini

[8] OpenAI, "API Pricing." https://openai.com/api/pricing/

