Claude 3 Haiku
Last reviewed
Jun 3, 2026
Sources
10 citations
Review status
Source-backed
Revision
v1 · 1,263 words
Improve this article
Add missing citations, update stale details, or suggest a clearer explanation.
Last reviewed
Jun 3, 2026
Sources
10 citations
Review status
Source-backed
Revision
v1 · 1,263 words
Add missing citations, update stale details, or suggest a clearer explanation.
Claude 3 Haiku is the fastest and most compact model in the Claude 3 family, a generation of large language models released by Anthropic in March 2024. It was positioned as the company's lowest-cost option, built for high-volume, latency-sensitive work such as customer support, content moderation, and information extraction, and it sat below Claude 3 Sonnet and Claude 3 Opus in the lineup's three-tier structure of capability and price. [1][2]
Anthropic announced the Claude 3 family on March 4, 2024, presenting three models named, in ascending order of capability, Haiku, Sonnet, and Opus. Opus and Sonnet became available immediately through the company's API and on the Claude consumer interface, while Haiku was described at launch as arriving later. [3][2] Haiku reached general availability on March 13, 2024, offered through the Claude API and to Claude Pro subscribers on Claude.ai, with availability on Amazon Bedrock and, subsequently, Google Cloud's Vertex AI. [1] The API model identifier carries the snapshot date March 7, 2024, written as claude-3-haiku-20240307. [4]
The naming convention drew on the syllabic structure of haiku poetry to signal the smallest and quickest member of the family. The same generation introduced the size-tier scheme that Anthropic carried into later releases, with Haiku denoting the low-latency, low-cost tier. The Claude 3 models followed the company's earlier Claude Instant and Claude 2 systems, and Haiku was the direct successor to the role Claude Instant had played as the fast, inexpensive choice. [2]
Claude 3 Haiku is a multimodal model that accepts both text and image input and returns text output. Anthropic described it as having strong vision capabilities for its class, able to read photographs, charts, graphs, and technical diagrams. [1][2] The model has a context window of 200,000 tokens and a maximum output of 4,096 tokens, with a training knowledge cutoff of August 2023. [4][5]
Speed was the model's defining characteristic. Anthropic said Haiku was three times faster than peer models for the majority of workloads, processing 21,000 tokens, roughly 30 pages, per second for prompts under 32,000 tokens. The company noted that prompts longer than 32,000 tokens could ingest 30 to 60 percent more slowly. As an illustration of throughput, Anthropic stated that Haiku could read a data-dense research paper from arXiv of about 10,000 tokens, including its charts and graphs, in under three seconds. [1]
The pairing of speed with low cost was the basis for Anthropic's enterprise pitch. The company gave the example that Haiku could process and analyze 400 Supreme Court cases or 2,500 images for one US dollar, framing the model as suited to tasks where volume and responsiveness mattered more than the deepest reasoning. [1]
In the Claude 3 model card, Anthropic reported Haiku's results on standard evaluations. The model scored well for its size class on knowledge and reasoning tests and on code generation, though it trailed the larger Sonnet and Opus models in the same family. The figures below are drawn from Anthropic's reported results. [6]
| Benchmark | Setup | Claude 3 Haiku |
|---|---|---|
| MMLU (general knowledge) | 5-shot | 75.2% |
| GSM8K (grade-school math) | 0-shot chain of thought | 88.9% |
| HumanEval (code) | 0-shot | 75.9% |
| MATH (competition math) | 0-shot | 38.9% |
Anthropic presented these scores alongside the higher results for Sonnet and Opus to show the trade-off across the family: each step up the tier raised accuracy at the cost of speed and price. [6] Independent benchmarking services later ranked Claude 3 Haiku at the lower end of contemporary models as newer and stronger small models appeared, which is consistent with its early-2024 release and its position as the most compact option of its generation. [7]
At launch Anthropic priced Claude 3 Haiku well below the rest of the family, and the rate held throughout the model's life. The 1:5 ratio between input and output pricing was typical of the Claude 3 generation. [1][2]
| Attribute | Value |
|---|---|
| Input price | $0.25 per million tokens |
| Output price | $1.25 per million tokens |
| Context window | 200,000 tokens |
| Maximum output | 4,096 tokens |
| Input modalities | Text, image |
| Output modality | Text |
| Knowledge cutoff | August 2023 |
| General availability | March 13, 2024 |
Beyond the first-party API, the model was distributed through Amazon Bedrock and Google Cloud's Vertex AI, where it appeared under the identifier claude-3-haiku-20240307. [4] The low price made it the cheapest route to Claude-grade image understanding for a period, a point that remained relevant after later Haiku releases (see below). [8]
Anthropic aimed Claude 3 Haiku at applications that send large numbers of requests and need quick replies. The company named customer-facing interactions, content moderation, inventory management, data extraction from unstructured documents, and rapid translation as representative workloads. [1] Its combination of a large context window with low per-token cost also suited bulk classification, tagging, and summarization pipelines, where a single document or transcript could be processed cheaply at scale. The image support extended these uses to visual material such as scanned documents, receipts, and diagrams. [1][2]
Because Haiku was the inexpensive tier, developers frequently used it as the default model for routine traffic and reserved Sonnet or Opus for harder queries, an arrangement sometimes called model routing. The pricing example of analyzing thousands of images per dollar reflected this intended pattern of high-volume, low-margin use. [1]
Anthropic announced Claude 3.5 Haiku on October 22, 2024, with general availability in early November 2024. The company said the new model matched the speed of Claude 3 Haiku while improving across skills, and that it surpassed Claude 3 Opus, the prior flagship, on several benchmarks. [9][8] Claude 3.5 Haiku launched as a text-only model; image input was added on February 25, 2025, which for a time left Claude 3 Haiku as Anthropic's least expensive model that accepted images. [8]
Pricing for the 3.5 model started higher than its predecessor at $1 per million input tokens and $5 per million output tokens, four times the Claude 3 Haiku rate. Anthropic reduced that to $0.80 input and $4 output per million tokens on December 5, 2024. [8] The naming shifted to a size-then-version style with later models such as Claude Haiku 4.5 (see Claude Haiku 4.5), the recommended successor once the earlier Haiku models were retired. The broader 3.5 generation also included an upgraded Claude 3.5 Sonnet. [9]
Anthropic placed the original Claude 3 Haiku on its deprecation schedule and retired the claude-3-haiku-20240307 endpoint in April 2026, after which calls to that identifier returned errors rather than routing to a replacement. The company directed affected developers to its newer Haiku models. [10]