GPT-4.1 nano
Last reviewed
Jun 3, 2026
Sources
9 citations
Review status
Source-backed
Revision
v1 · 1,270 words
Improve this article
Add missing citations, update stale details, or suggest a clearer explanation.
Last reviewed
Jun 3, 2026
Sources
9 citations
Review status
Source-backed
Revision
v1 · 1,270 words
Add missing citations, update stale details, or suggest a clearer explanation.
GPT-4.1 nano is the smallest, fastest, and least expensive model in OpenAI's GPT-4.1 family of large language models, released on April 14, 2025. Offered exclusively through the OpenAI API at launch, GPT-4.1 nano was described by OpenAI as its fastest and cheapest model available at that time, optimized for low-latency tasks such as text classification and autocompletion. Despite its small size, it supports the same approximately one-million-token context window as the larger members of the family. [1][2]
GPT-4.1 nano was introduced alongside GPT-4.1 and GPT-4.1 mini on April 14, 2025, as the entry point of a three-model series aimed at developers. OpenAI positioned the family around three priorities: stronger coding ability, more reliable instruction following, and improved handling of very long inputs. Within that lineup, nano occupies the role of a high-throughput, cost-sensitive option intended for workloads where speed and price matter more than maximum reasoning depth. [1][3]
The model is multimodal on the input side, accepting both text and images while producing text output. It supports function calling, structured outputs, and streaming, and unlike OpenAI's "o-series" reasoning models it does not perform an explicit reasoning step before answering, which contributes to its low latency. [2]
GPT-4.1 nano is distinct from GPT-4o mini, the small model from the earlier GPT-4o generation that it was designed to outperform on standard benchmarks. It is also separate from the full GPT-4.1 and GPT-4.1 mini models, which trade higher cost for greater capability. Press coverage at launch framed the entire family, and nano in particular, as part of a broader competitive push to win over enterprise and developer customers with cheaper models, with several outlets noting that nano undercut OpenAI's own previous small models on price. [1][3]
The GPT-4.1 series consists of three models released simultaneously: GPT-4.1, the flagship; GPT-4.1 mini, a mid-sized model that OpenAI said matches or exceeds GPT-4o on many benchmarks at substantially lower latency and cost; and GPT-4.1 nano, the smallest and fastest of the three. All three were launched as API-only products and were not initially available inside ChatGPT. [1][3]
Every model in the family shares a context window of up to roughly one million tokens (1,047,576 tokens) and a knowledge cutoff of June 2024, a notable expansion over the 128,000-token windows of earlier GPT-4-class models. OpenAI framed the release around real-world developer needs, citing gains on software-engineering and instruction-following evaluations for the larger models. The flagship GPT-4.1, for example, scored 54.6% on SWE-bench Verified, a measure of coding ability, a substantial improvement over GPT-4o. [1][3]
The launch of GPT-4.1 coincided with OpenAI's decision to retire GPT-4.5 Preview from the API. OpenAI announced on April 14, 2025, that GPT-4.5 Preview would be deprecated, and the model was removed from the API on July 14, 2025, with OpenAI noting that GPT-4.1 offered similar or improved performance on many capabilities at much lower cost and latency. [4][5]
OpenAI reported that GPT-4.1 nano scores higher than GPT-4o mini across several standard evaluations despite its smaller footprint. The headline figures published for nano include 80.1% on MMLU (a broad knowledge and reasoning benchmark), 50.3% on GPQA Diamond (graduate-level science questions), and 9.8% on the Aider polyglot coding benchmark. On multimodal tests it scored 55.4% on MMMU and 56.2% on MathVista (image-based mathematical reasoning). [1][6]
| Benchmark | GPT-4.1 nano score |
|---|---|
| MMLU | 80.1% |
| GPQA Diamond | 50.3% |
| Aider polyglot (coding) | 9.8% |
| MMMU (multimodal) | 55.4% |
| MathVista (image math) | 56.2% |
The low Aider polyglot score reflects nano's positioning: it is not intended as a frontier coding model, a role reserved for the full GPT-4.1, which scored far higher on software-engineering tasks. Instead, nano's strengths lie in fast, inexpensive inference for narrower tasks. OpenAI specifically highlighted classification and autocompletion as well-suited use cases, where the combination of a large context window and minimal latency is more valuable than top-tier reasoning. Independent comparisons of low-cost language models likewise treated nano primarily as a latency-and-cost play rather than a reasoning leader, while acknowledging that its benchmark gains over GPT-4o mini made it an attractive default for high-volume, lightweight workloads. [1][6]
Because nano shares the family's expanded context window, it can also be applied to tasks that involve reading large documents end to end, although OpenAI's published long-context results emphasized the larger models. The headline benchmark figures for nano should be read as those of a deliberately small, distilled model: competitive with prior small models on knowledge and science questions, but not a substitute for the flagship on complex coding or multi-step reasoning. [1]
GPT-4.1 nano is priced as the cheapest model in the family. At launch it cost $0.10 per million input tokens and $0.40 per million output tokens, with cached input billed at a reduced rate of $0.025 per million tokens (a 75% discount on the input price). [1][2]
| Model | Input (per 1M tokens) | Cached input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|---|
| GPT-4.1 | $2.00 | (discounted) | $8.00 |
| GPT-4.1 mini | $0.40 | (discounted) | $1.60 |
| GPT-4.1 nano | $0.10 | $0.025 | $0.40 |
The model's context window is officially listed as 1,047,576 tokens, with a maximum output of 32,768 tokens per response. OpenAI's documentation lists June 1, 2024, as the knowledge cutoff. The original dated snapshot, identified as gpt-4.1-nano-2025-04-14, was the version released on launch day; OpenAI later marked that specific snapshot as deprecated while keeping the gpt-4.1-nano alias active. [2]
GPT-4.1 nano is aimed at applications that issue large volumes of requests and need fast, cheap responses. OpenAI cited classification and autocompletion as representative examples, and the broad context window also makes it suitable for tasks such as document re-ranking, lightweight extraction, routing, and processing long inputs where a more expensive model would be hard to justify economically. [1][2]
Because nano supports function calling and structured outputs, it can serve as a low-cost component inside larger agentic or pipeline-based systems, handling routine sub-tasks while more capable models such as GPT-4.1 or GPT-4.1 mini handle steps that demand stronger reasoning. Its image-input capability further allows it to be used for simple visual classification or extraction at scale. [2]
GPT-4.1 nano launched as an API-only model on April 14, 2025, and was not made directly selectable in the ChatGPT application. By contrast, the larger GPT-4.1 model was added to ChatGPT for Plus, Pro, and Team subscribers in May 2025, and GPT-4.1 mini replaced GPT-4o mini as the small model available to all ChatGPT users, including those on the free tier. The nano variant, however, remained an API offering rather than a consumer-facing ChatGPT model. [7][8]
Beyond OpenAI's own API, GPT-4.1 nano was also made available through partner platforms. Microsoft announced the GPT-4.1 model series, including nano, for Azure AI Foundry, and GPT-4.1 mini and nano became generally available in GitHub Models on the same day as the OpenAI launch. [9]