Claude Haiku 4.5 is a large language model developed by Anthropic and released on October 15, 2025. It is the third generation of the Claude Haiku product line and the lightweight, high-speed member of the Claude 4.5 model family.[1] Anthropic describes it as offering "near-frontier intelligence" at the lowest price point in the Claude lineup, targeting developers and enterprises that need fast, cost-efficient inference at scale.[1][2]
With a price of $1.00 per million input tokens and $5.00 per million output tokens, Claude Haiku 4.5 costs one-third as much as Claude Sonnet 4.5 and one-fifth as much as Claude Opus 4.5. At the same time, it scores 73.3% on SWE-bench Verified, matching Claude Sonnet 4 and coming within five percentage points of Claude Sonnet 4.5.[1] The model is notable for being the first in the Haiku line to support extended thinking and computer use, capabilities previously reserved for larger and more expensive models.[1][9]
Claude Haiku 4.5 is available through the Anthropic API, Amazon Bedrock, and Google Cloud Vertex AI, with day-one integrations from GitHub Copilot, Warp, Augment, Zencoder, and Anthropic's own Claude Code terminal agent.[1][12][16] On the same day as the launch, Anthropic also made Haiku 4.5 the default model for free-tier users on claude.ai.[20]
The model fits inside the broader Claude 4 family, though there is no Claude Haiku 4 without the .5 suffix: Anthropic skipped the Haiku tier at the May 2025 family launch and shipped the first Claude 4-generation Haiku as Haiku 4.5 in October 2025.[10] Subsequent Anthropic releases have continued to position Haiku 4.5 as the current Haiku-tier offering through May 2026, even after the Sonnet and Opus tiers moved to one-million-token context windows in February 2026.
The Haiku name within Anthropic's model lineup designates the fastest and most economical tier of each Claude generation. While Claude Opus 4.5 and Claude Sonnet 4.5 are positioned for complex multi-step reasoning and balanced performance, the Haiku models are optimized for low latency, high throughput, and cost efficiency at the expense of maximum reasoning depth.
The Haiku lineage began with Claude 3 Haiku, which Anthropic released on March 4, 2024, as part of the Claude 3 family. Claude 3 Haiku offered a 200,000-token context window and was priced at just $0.25 per million input tokens and $1.25 per million output tokens, establishing Haiku as Anthropic's most accessible model tier.[10] It supported text and image inputs and was positioned for real-time tasks requiring fast responses.
Claude 3.5 Haiku followed on October 22, 2024, raising the performance bar substantially. At launch, it was priced at $0.80 per million input tokens and $4.00 per million output tokens, reflecting a significant capability upgrade over the Claude 3 version. Claude 3.5 Haiku maintained the 200,000-token context window and added stronger coding ability, with an 88.1% score on the HumanEval benchmark.[7] However, it did not yet include extended thinking or computer use, which remained features of the larger Sonnet and Opus models.
Claude Haiku 4.5, released in October 2025, represents the first Haiku generation to close the capability gap with the upper tiers in a meaningful way. It inherits the 200,000-token context window but introduces extended thinking, computer use, and context awareness for the first time in the Haiku line. Its SWE-bench Verified score of 73.3% surpasses what Claude Sonnet 4 achieved at launch (72.7%), demonstrating how Anthropic progressively pushes frontier capabilities down into smaller, cheaper models with each new generation.[1][7]
The progression of Haiku-tier models is summarized in the table below.
| Haiku model | Release date | Context window | Max output | Input ($/MTok) | Output ($/MTok) | Extended thinking | Computer use | Vision | SWE-bench Verified |
|---|---|---|---|---|---|---|---|---|---|
| Claude 3 Haiku | March 4, 2024 | 200K | 4,096 | $0.25 | $1.25 | No | No | Yes | n/a |
| Claude 3.5 Haiku | October 22, 2024 | 200K | 8,192 | $0.80 | $4.00 | No | No | No (text only) | 40.6% |
| Claude Haiku 4.5 | October 15, 2025 | 200K | 64,000 | $1.00 | $5.00 | Yes | Yes | Yes | 73.3% |
The Claude 4.5 family at launch comprised three tiers: Opus, Sonnet, and Haiku, following Anthropic's standard naming convention. Claude Sonnet 4.5 serves as the mid-tier model, offering a balance of performance and cost with a 200,000-token context window and pricing of $3.00 input / $15.00 output per million tokens. Claude Opus 4.5 is the flagship, commanding $5.00 input / $25.00 output per million tokens with the deepest reasoning capabilities.
Claude Haiku 4.5 fills the role of the high-volume, latency-sensitive workhorse. Anthropic explicitly frames it as running more than twice as fast as Sonnet 4 and four to five times faster than Sonnet 4.5 in end-to-end application latency, while delivering approximately 90% of Sonnet 4.5's coding performance at one-third the cost.[1] This positioning makes it well suited for multi-agent architectures, where Sonnet 4.5 or another orchestrator model plans and delegates tasks, while multiple Haiku 4.5 instances execute subtasks in parallel.
In its launch post, Anthropic framed the release with the line, "What was recently at the frontier is now cheaper and faster," pointing to the fact that Sonnet 4 was the company's flagship coding model only five months earlier and that Haiku 4.5 now matches or exceeds it on most published benchmarks.[1] Anthropic Chief Product Officer Mike Krieger described Haiku as enabling "entirely new categories of what's possible with AI in production environments," and Zencoder Chief Executive Officer Andrew Filev said the model was "unlocking an entirely new set of use cases."[12]
The table below summarizes the core technical parameters of Claude Haiku 4.5 as documented in Anthropic's official API documentation.
| Parameter | Value |
|---|---|
| Model ID (Anthropic API) | claude-haiku-4-5-20251001 |
| Model alias | claude-haiku-4-5 |
| AWS Bedrock ID | anthropic.claude-haiku-4-5-20251001-v1:0 |
| GCP Vertex AI ID | claude-haiku-4-5@20251001 |
| Release date | October 15, 2025 |
| Context window | 200,000 tokens |
| Max output tokens | 64,000 tokens |
| Input modalities | Text, images |
| Output modalities | Text |
| Extended thinking | Supported (up to 128K thinking budget) |
| Adaptive thinking | Not supported |
| Computer use | Supported |
| Tool use / function calling | Supported |
| MCP support | Supported |
| Reliable knowledge cutoff | February 2025 |
| Training data cutoff | July 2025 |
| AI Safety Level | ASL-2 |
| Prompt caching (min tokens) | 4,096 tokens |
| Prompt caching (max checkpoints) | 4 per request |
The model ID snapshot date 20251001 reflects the specific training snapshot and guarantees consistent behavior across all deployment platforms. Models with the same snapshot date are identical whether accessed through the Anthropic API, Amazon Bedrock, or Google Cloud Vertex AI.[2]
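A minimal request using the official anthropic Python SDK illustrates these IDs in practice; the prompt and output handling below are illustrative, not taken from Anthropic's documentation:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-haiku-4-5",  # alias; pin claude-haiku-4-5-20251001 for reproducibility
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize this support ticket in one sentence: ..."}],
)
print(message.content[0].text)
```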
The 200,000-token context window translates to approximately 150,000 words or 680,000 Unicode characters. The maximum output of 64,000 tokens is the same as Claude Sonnet 4.5, a substantial increase over the earlier Claude 3.5 Haiku, which was capped at 8,192 output tokens in standard configuration.[7]
Extended thinking, when enabled, allows the model to reason through complex problems before returning a final answer. Anthropic ran several launch-day evaluations with thinking budgets of up to 128,000 tokens, illustrating that Haiku 4.5 can in principle dedicate significantly more compute to reasoning than its 64,000-token output cap suggests.[1] In this mode, thinking tokens are billed as output tokens at the standard output rate of $5.00 per million. Adaptive thinking, which automatically decides whether to invoke extended reasoning, is available in Sonnet and Opus models in the Claude 4.5 and Claude 4.6 generations but not in Haiku 4.5.[2][9]
Unlike Claude Opus 4.7, which shipped with a new tokenizer in April 2026, Haiku 4.5 uses the same tokenizer as the rest of the Claude 4 and Claude 4.5 family. Per-token cost comparisons against Sonnet 4 or Opus 4.5 therefore translate cleanly into per-task cost comparisons without the up to 35% token-count inflation that affects Opus 4.7 migrations.
Claude Haiku 4.5 uses Anthropic's standard per-token billing model with no subscription requirement for API access.
| Tier | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| Standard | $1.00 | $5.00 |
| Batch API (50% discount) | $0.50 | $2.50 |
| Prompt cache writes (5-minute TTL) | $1.25 | n/a |
| Prompt cache writes (1-hour TTL) | $2.00 | n/a |
| Prompt cache reads | $0.10 | n/a |
Prompt caching allows developers to store frequently reused portions of their context, such as long system prompts or reference documents, and pay only the cache read rate ($0.10 per million tokens) on subsequent requests.[2] Cache writes cost slightly more than standard input at $1.25 per million tokens for the five-minute tier, but any workload that reuses the same context more than once typically saves money.
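A back-of-the-envelope calculation, using the rates from the table above and a hypothetical 50,000-token system prompt, shows the break-even point arriving at the second request:

```python
# Break-even sketch for the 5-minute cache tier, using the rates in the table above:
# standard input $1.00/MTok, cache write $1.25/MTok, cache read $0.10/MTok.
PROMPT_MTOK = 0.05  # hypothetical 50,000-token system prompt, in millions of tokens

def uncached(requests: int) -> float:
    return requests * PROMPT_MTOK * 1.00  # full input price on every request

def cached(requests: int) -> float:
    return PROMPT_MTOK * 1.25 + (requests - 1) * PROMPT_MTOK * 0.10  # one write, then reads

for n in (1, 2, 10):
    print(n, round(uncached(n), 4), round(cached(n), 4))
# n=1: caching costs slightly more ($0.0625 vs $0.05)
# n=2: $0.0675 cached vs $0.10 uncached -- caching already wins on the second request
```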
Batch processing through the Message Batches API provides a 50% discount on both input and output tokens in exchange for non-real-time processing with up to 24-hour turnaround. This is well suited for offline data extraction, classification, and summarization pipelines.
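A sketch of a batch submission through the Message Batches API in the anthropic Python SDK; the custom_id values and prompts are placeholders:

```python
import anthropic

client = anthropic.Anthropic()

# Submit several summarization jobs as one discounted batch.
batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": f"doc-{i}",
            "params": {
                "model": "claude-haiku-4-5",
                "max_tokens": 1024,
                "messages": [{"role": "user", "content": f"Summarize document {i}."}],
            },
        }
        for i in range(2)
    ]
)
print(batch.id, batch.processing_status)  # poll until processing_status == "ended"
```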
Third-party platform pricing through Amazon Bedrock and Google Cloud Vertex AI may differ from direct Anthropic API pricing. For the most current platform-specific rates, consult the respective cloud providers' pricing pages.
The table below compares Haiku 4.5 against the rest of the Haiku line and the other 4.5-generation tiers on a per-token basis.
| Model | Input ($/MTok) | Output ($/MTok) | Ratio vs Haiku 4.5 input | Ratio vs Haiku 4.5 output |
|---|---|---|---|---|
| Claude Haiku 4.5 | $1.00 | $5.00 | 1.0x | 1.0x |
| Claude 3.5 Haiku | $0.80 | $4.00 | 0.80x | 0.80x |
| Claude 3 Haiku | $0.25 | $1.25 | 0.25x | 0.25x |
| Claude Sonnet 4.5 | $3.00 | $15.00 | 3.0x | 3.0x |
| Claude Opus 4.5 | $5.00 | $25.00 | 5.0x | 5.0x |
Haiku 4.5 is priced 25% above Haiku 3.5 on a per-token basis but offers a substantially expanded output token limit (eight times more than Haiku 3.5), extended thinking, computer use, and significantly higher performance across all benchmarks.[1][7]
For a workload of 100,000 monthly customer-service sessions, Haiku 4.5 costs roughly $2,250 versus approximately $6,750 for the same workload run on Sonnet 4.5; because Sonnet 4.5 costs exactly three times as much on both input and output, the 3:1 ratio holds regardless of token mix. The difference is frequently cited in launch coverage as the practical lever that makes large-scale agentic deployments commercially viable.[8]
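The cited totals are reproducible under an assumed per-session mix of about 10,000 input and 2,500 output tokens; the source does not state the mix, and these numbers are chosen only to match the published figures:

```python
# Hypothetical token mix chosen to reproduce the cited totals (not from the source).
SESSIONS = 100_000
IN_TOK, OUT_TOK = 10_000, 2_500  # assumed input/output tokens per session

def monthly_cost(in_rate: float, out_rate: float) -> float:
    """Monthly cost in dollars, given per-million-token rates."""
    return SESSIONS * (IN_TOK * in_rate + OUT_TOK * out_rate) / 1_000_000

print(monthly_cost(1.00, 5.00))   # Haiku 4.5  -> 2250.0
print(monthly_cost(3.00, 15.00))  # Sonnet 4.5 -> 6750.0
```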
The table below presents benchmark scores for Claude Haiku 4.5 alongside Claude Sonnet 4 (the previous generation's flagship-tier Sonnet) and Claude 3.5 Haiku (the immediate predecessor in the Haiku line). All Haiku 4.5 numbers come from Anthropic's launch announcement and from third-party comparisons that reproduce the launch-day figures.
| Benchmark | Claude Haiku 4.5 | Claude Sonnet 4 | Claude 3.5 Haiku | Source |
|---|---|---|---|---|
| SWE-bench Verified | 73.3% | 72.7% | 40.6% | Anthropic launch post[1][7] |
| OSWorld (computer use) | 50.7% | 42.2% | n/a | Anthropic launch post[1] |
| GPQA Diamond | 73.0% | 75.4% | 41.6% | Anthropic launch post[1][7] |
| MMMLU (multilingual) | 83.0% | 86.5% | n/a | Anthropic launch post[1][7] |
| AIME 2025 | 80.7% | 70.5% | n/a | Anthropic launch post[1][7] |
| MMMU | 73.2% | 74.4% | n/a | Anthropic launch post[1][8] |
| Tau2-bench Retail | 83.2% | 80.5% | n/a | Anthropic launch post[1][7] |
| Tau2-bench Telecom | 83.0% | n/a | n/a | Anthropic launch post[1][7] |
| Terminal-Bench | ~41% | 35.5% | n/a | Anthropic launch post[1][8] |
| HumanEval | n/a | n/a | 88.1% | LLM-Stats[7] |
| MATH | n/a | n/a | 69.4% | LLM-Stats[7] |
Notes: SWE-bench Verified scores for Haiku 4.5 are averaged over 50 trials. Terminal-Bench scores were reported across 11 runs at roughly 40% to 42%, with and without extended thinking enabled. AIME 2025 scores are averaged over ten runs with a 128K thinking budget. MMMLU is averaged across ten runs in 14 non-English languages. n/a indicates the benchmark was not reported by Anthropic for that model at launch or was not included in the comparison set.[1][7]
On Anthropic's own published comparisons, Haiku 4.5 ties or exceeds Sonnet 4 on five of the eight benchmarks where both were reported (SWE-bench Verified, OSWorld, AIME 2025, Tau2 Retail, and Terminal-Bench), and trails Sonnet 4 by between 1 and 4 percentage points on the remaining three (GPQA Diamond, MMMLU, MMMU).[1] Against Haiku 3.5, the gains are uniformly large: SWE-bench Verified rises from 40.6% to 73.3% (a 32.7-point gain), and GPQA Diamond rises from 41.6% to 73.0% (a 31.4-point gain).[7]
SWE-bench Verified is a benchmark of real-world GitHub issue resolution, requiring a model to identify the root cause of a software bug or missing feature and write a patch that passes the repository's test suite. Claude Haiku 4.5's 73.3% score, measured over 50 trials to reduce variance, is a substantial jump from Claude 3.5 Haiku's 40.6% and places it within five percentage points of Claude Sonnet 4.5 at 77.2%. It also exceeds the score that Claude Sonnet 4 achieved at launch (72.7%), illustrating how agentic coding capability flows down to smaller models over successive generations.[1]
OSWorld is a benchmark for evaluating a model's ability to navigate graphical user interfaces, control a computer cursor, fill forms, and complete multi-step computer tasks autonomously. Claude Haiku 4.5 achieves a 50.7% success rate on OSWorld, the highest any Haiku model has achieved on that benchmark and a meaningful improvement over Claude Sonnet 4's 42.2% score.[1] Anthropic notes that computer use at this accuracy level requires human oversight for production deployments and is not yet reliable enough for fully autonomous operation.
GPQA Diamond is a benchmark of difficult graduate-level questions in biology, chemistry, and physics designed to challenge expert human knowledge. Claude Haiku 4.5's 73.0% score represents a substantial improvement over Claude 3.5 Haiku's 41.6%, a 31.4-percentage-point gain that reflects the broader reasoning improvements in the Claude 4.5 generation. Haiku 4.5 trails Sonnet 4 (75.4%) by 2.4 points on this benchmark, which is one of three areas where the Haiku model does not match the previous Sonnet flagship.[1][7]
On MMMLU, a multilingual extension of the MMLU benchmark, Haiku 4.5 scores 83.0%, indicating strong performance across academic domains in multiple languages. The score is averaged over ten runs across 14 non-English languages, a methodology consistent with Anthropic's reporting on the rest of the 4.5 family. Sonnet 4 scored 86.5% on the same benchmark.[1][7]
Haiku 4.5's AIME 2025 score of 80.7% on mathematics competition problems, enabled in part by extended thinking with a 128,000-token thinking budget, demonstrates meaningful quantitative reasoning capability when the model is given time to deliberate. The 80.7% score represents a 10.2-percentage-point improvement over Sonnet 4's 70.5% on the same benchmark, marking one of the cleaner cases where a smaller model with newer training surpasses a larger predecessor through better reasoning rather than greater capacity.[1][7]
Tau2-bench is a benchmark for tool-using agents in customer-service settings. Haiku 4.5 reaches 83.2% on the Tau2 Retail track and 83.0% on Tau2 Telecom, placing it ahead of Sonnet 4's reported 80.5% on Tau2 Retail. The benchmark explicitly tests the kinds of multi-step, tool-calling workflows that Anthropic targets with the Haiku tier, and the strong scores were prominent in launch-day marketing.[1]
MMMU is a multimodal benchmark that tests image understanding across scientific, engineering, and humanistic domains. Haiku 4.5's 73.2% score is within 1.2 percentage points of Sonnet 4 (74.4%), confirming that the model carries the family's vision capability into a smaller package without significant degradation.[1][8]
In independent evaluations by Artificial Analysis, Claude Haiku 4.5 generates approximately 88.6 tokens per second with a time-to-first-token of about 0.73 seconds, placing it among the fastest non-reasoning models in its class.[6] Anthropic describes the model as running more than twice as fast as Sonnet 4 and four to five times faster than Sonnet 4.5 in end-to-end application latency.[1]
Artificial Analysis assigns Haiku 4.5 an Intelligence Index score of 31, ranking it 25th overall, while noting that it sits above the median for non-reasoning models on a price-adjusted basis.[6]
The following table compares Claude Haiku 4.5 to current and legacy Claude models across the key dimensions relevant to deployment decisions.
| Model | Input ($/MTok) | Output ($/MTok) | Context window | Max output | Extended thinking | Adaptive thinking | Latency tier |
|---|---|---|---|---|---|---|---|
| Claude Haiku 4.5 | $1.00 | $5.00 | 200K | 64K | Yes | No | Fastest |
| Claude Sonnet 4.5 | $3.00 | $15.00 | 200K | 64K | Yes | No | Fast |
| Claude Opus 4.5 | $5.00 | $25.00 | 200K | 64K | Yes | Yes | Moderate |
| Claude Sonnet 4.6 | $3.00 | $15.00 | 1M | 64K | Yes | Yes | Fast |
| Claude Opus 4.6 | $5.00 | $25.00 | 1M | 128K | Yes | Yes | Moderate |
| Claude Opus 4.7 | $5.00 | $25.00 | 1M | 128K | No (adaptive only) | Yes | Moderate |
| Claude 3.5 Haiku | $0.80 | $4.00 | 200K | 8,192 | No | No | Fast |
| Claude 3 Haiku | $0.25 | $1.25 | 200K | 4,096 | No | No | Fast |
As the table shows, Claude Haiku 4.5 costs 25% more per token than Claude 3.5 Haiku, but in exchange raises the output limit from 8,192 to 64,000 tokens and adds extended thinking, computer use, and significantly higher scores across all benchmarks.
Compared to Sonnet 4.5, Haiku 4.5 costs one-third as much for both input and output while offering the same 200,000-token context window and the same 64,000-token output limit. The practical performance gap between the two models is most visible in tasks requiring complex multi-step reasoning, where Sonnet 4.5 has the larger advantage, while for coding and agentic tasks the gap narrows considerably.
Compared to the newer Sonnet 4.6 and Opus 4.7 models, Haiku 4.5 has a smaller context window (200K versus 1M tokens) and lower maximum output (64K versus 128K for Opus 4.7). Sonnet 4.6 and Opus 4.7 represent later generation releases with updated knowledge cutoffs and adaptive thinking, whereas Haiku 4.5 remains the current Haiku-tier offering as of May 2026.
Claude Haiku 4.5 is the first Haiku model to support extended thinking, a capability that allows the model to work through a chain of reasoning before returning a final response.[1][9] When extended thinking is enabled via the API, the model generates an internal thought process that can optionally be surfaced to users or used by the application for transparency and debugging.
Extended thinking is particularly valuable for math problems, multi-step planning, and tasks where the model benefits from exploring multiple possible approaches before committing to an answer. Anthropic's launch evaluations on AIME 2025 used a 128,000-token thinking budget averaged over ten runs, demonstrating the upper end of the model's reasoning depth.[1] Thinking tokens are billed at the output rate of $5.00 per million tokens and count against the model's output limit.
Developers control how many tokens the model allocates to its reasoning process by setting a thinking budget (exposed as the budget_tokens field of the thinking parameter in the Messages API), balancing latency and cost against the depth of deliberation. Anthropic recommends starting with a budget of a few hundred to a few thousand tokens for most applications, scaling up for problems that genuinely benefit from deeper reasoning.
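A minimal sketch of an extended-thinking request in the anthropic Python SDK; the budget value and prompt are illustrative:

```python
import anthropic

client = anthropic.Anthropic()

# Explicit extended thinking: max_tokens must exceed budget_tokens.
response = client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 8000},
    messages=[{"role": "user", "content": "How many primes are there below 1,000?"}],
)
for block in response.content:
    if block.type == "thinking":
        print("[thinking]", block.thinking[:200])  # reasoning trace, billed as output tokens
    elif block.type == "text":
        print(block.text)
```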
Claude Haiku 4.5 supports computer use, enabling it to interact with graphical user interfaces by taking screenshots, moving cursors, clicking buttons, typing text, and executing keyboard shortcuts. This capability allows the model to complete tasks that traditionally required custom automation scripts or human operators, such as filling out web forms, navigating desktop applications, and performing multi-step UI workflows.
The model achieves a 50.7% success rate on OSWorld benchmarks, outperforming Claude Sonnet 4's 42.2% score and setting a new high-water mark for the Haiku tier.[1] Anthropic cautions that at this accuracy level, computer use tasks require human review and should not be deployed in fully autonomous configurations without appropriate safeguards.
For production use of computer use, Anthropic recommends pairing Haiku 4.5 with human-in-the-loop oversight for actions with significant consequences such as form submissions, file deletions, or financial transactions. The capability is available via the standard messages API with the computer use beta header.
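A sketch of the request shape with the computer-use tool attached, assuming the Claude-4-era tool type and beta strings from Anthropic's documentation; the display dimensions and task are placeholders, and the strings should be checked against current docs:

```python
import anthropic

client = anthropic.Anthropic()

response = client.beta.messages.create(
    model="claude-haiku-4-5",
    max_tokens=2048,
    tools=[{
        "type": "computer_20250124",  # built-in computer-use tool schema
        "name": "computer",
        "display_width_px": 1280,
        "display_height_px": 800,
    }],
    messages=[{"role": "user", "content": "Open the settings page and enable dark mode."}],
    betas=["computer-use-2025-01-24"],
)
# The model replies with tool_use blocks (screenshot, click, type, ...) that the calling
# application executes and answers with tool_result blocks in a loop.
```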
Claude Haiku 4.5 processes image inputs alongside text, supporting document analysis, chart interpretation, screenshot understanding, and visual question answering. Images can be passed as base64-encoded data or as URLs. The model achieves a 73.2% score on the MMMU benchmark, which tests understanding of images across scientific, engineering, and humanistic domains.[1]
Common vision use cases include extracting structured data from scanned documents, interpreting data visualizations in business intelligence applications, and describing visual content for accessibility workflows. The model can process multiple images within a single request, subject to overall token limits.
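A minimal sketch of an image request via the Messages API; the file name and extraction prompt are hypothetical:

```python
import base64

import anthropic

client = anthropic.Anthropic()

with open("invoice.png", "rb") as f:  # hypothetical local file
    image_data = base64.standard_b64encode(f.read()).decode()

message = client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {"type": "image",
             "source": {"type": "base64", "media_type": "image/png", "data": image_data}},
            {"type": "text", "text": "Extract the invoice number, date, and total as JSON."},
        ],
    }],
)
print(message.content[0].text)
```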
Claude Haiku 4.5 demonstrates strong multilingual capability with an 83.0% score on MMMLU, which spans academic subjects in multiple languages. The model supports generation and comprehension in major world languages including English, Spanish, French, German, Portuguese, Japanese, Korean, and Chinese.
Claude Haiku 4.5 supports structured tool use, allowing developers to define external functions and data sources that the model can invoke within a conversation. The model determines when to call tools based on the user's request and the provided tool definitions, parses arguments from natural language, and integrates tool outputs into its responses.
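A minimal sketch of a tool definition and the first turn of the resulting call loop, assuming a hypothetical get_weather function:

```python
import anthropic

client = anthropic.Anthropic()

# Hypothetical weather tool; only this JSON schema is sent to the model.
tools = [{
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

response = client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
)

if response.stop_reason == "tool_use":
    call = next(b for b in response.content if b.type == "tool_use")
    print(call.name, call.input)  # the app runs the tool, then replies with a tool_result block
```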
The model also supports the Model Context Protocol, Anthropic's open standard for connecting models to external tools and data sources. MCP support gives Haiku 4.5 access to the broader ecosystem of tools that already work with Sonnet and Opus models, including code execution servers, database connectors, and file-system tools, without requiring any model-specific adapters.
Tool use is fundamental to agentic workflows, enabling Haiku 4.5 instances to search databases, call APIs, execute code, and interact with external services as part of multi-step task completion. The model's instruction-following reliability makes it well suited for handling tool schemas consistently across large numbers of requests, which is the main reason Anthropic positions it as a sub-agent in multi-agent architectures.
Context awareness is a capability introduced in the Claude 4.5 family that allows the model to understand how much of its context window has been consumed during a conversation. This enables application developers to build more sophisticated prompt patterns where the model itself monitors its token budget and can compress, summarize, or delegate portions of its context when approaching limits. For long-running agentic sessions with a 200,000-token context, context awareness reduces the risk of unexpected truncation without requiring the application to implement external token counting logic.
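Applications that still prefer an explicit external check can use the API's token-counting endpoint alongside or instead of the model's own context awareness; a minimal sketch:

```python
import anthropic

client = anthropic.Anthropic()

count = client.messages.count_tokens(
    model="claude-haiku-4-5",
    messages=[{"role": "user", "content": "A long conversation history..."}],
)
print(count.input_tokens)  # compare against the 200K window before sending
```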
Claude Haiku 4.5 supports prompt caching with a minimum cache checkpoint size of 4,096 tokens and up to four cache checkpoints per request. Cached content can include system prompts, tool definitions, conversation history, and long reference documents. Cache entries have a time-to-live of either five minutes or one hour depending on the selected tier.
For applications that process many requests against the same large system prompt or reference corpus, prompt caching can reduce costs by 90% on the cached portion of each request (from $1.00 to $0.10 per million tokens for reads).
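A sketch of a cached request in the anthropic Python SDK, marking a long reference document (at or above the 4,096-token minimum noted above) as a cache checkpoint; the document and prompts are placeholders:

```python
import anthropic

client = anthropic.Anthropic()

LONG_REFERENCE = "..."  # stand-in for a reference document of at least 4,096 tokens

message = client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=1024,
    system=[
        {"type": "text", "text": "You answer questions about the reference document."},
        {"type": "text", "text": LONG_REFERENCE,
         "cache_control": {"type": "ephemeral"}},  # 5-minute tier; checkpoint set here
    ],
    messages=[{"role": "user", "content": "What does section 3 cover?"}],
)
# Usage fields report cache activity on each call.
print(message.usage.cache_creation_input_tokens, message.usage.cache_read_input_tokens)
```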
Claude Haiku 4.5's combination of low latency and low cost per token makes it well suited for customer-facing applications that require real-time responsiveness at scale. Customer support chatbots, FAQ automation, intent classification, and ticket routing all benefit from the model's ability to return answers quickly while handling thousands of concurrent requests within a reasonable compute budget.
Compared to larger models, the per-conversation cost of using Haiku 4.5 for a typical customer service session is a fraction of what Sonnet or Opus would cost, making it practical to offer AI-powered support in free or freemium product tiers without incurring unsustainable infrastructure expenses.
One of the most prominent use cases highlighted by Anthropic at launch is multi-agent orchestration, where a more capable model like Claude Sonnet 4.5 acts as a planner or orchestrator, breaking down complex tasks into parallel subtasks that are delegated to multiple Haiku 4.5 agent instances running concurrently.[1] This architecture allows the orchestrator to leverage frontier planning capabilities while distributing execution across cost-efficient agents.
Examples include software development workflows where Sonnet 4.5 designs an architecture and Haiku 4.5 instances write individual modules; research pipelines where Haiku 4.5 agents extract information from dozens of documents simultaneously; and customer data processing where independent Haiku agents analyze different customer segments in parallel.
The model's instruction-following consistency, speed, and self-correction capability in handling complex workflows make it a reliable sub-agent that can be orchestrated without requiring the orchestrator to constantly monitor and correct its outputs. Anthropic's framing in the launch post explicitly recommends this Sonnet-plans, Haiku-executes pattern as the default architecture for cost-sensitive agentic applications, and several third-party platforms including Caylent and DataCamp have published reference architectures based on it.[1][8][9]
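A minimal sketch of the execution half of this pattern, fanning subtasks out to concurrent Haiku 4.5 calls with the SDK's async client; in a full system the subtask list would come from a Sonnet 4.5 planning call, and is hard-coded here:

```python
import asyncio

import anthropic

client = anthropic.AsyncAnthropic()

async def run_subtask(task: str) -> str:
    """One Haiku 4.5 worker executing a single delegated subtask."""
    msg = await client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=2048,
        messages=[{"role": "user", "content": task}],
    )
    return msg.content[0].text

async def main() -> None:
    subtasks = [  # placeholder plan; normally produced by an orchestrator model
        "Summarize customer feedback batch 1.",
        "Summarize customer feedback batch 2.",
        "Summarize customer feedback batch 3.",
    ]
    results = await asyncio.gather(*(run_subtask(t) for t in subtasks))
    print("\n---\n".join(results))

asyncio.run(main())
```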
Claude Haiku 4.5's 73.3% SWE-bench Verified score places it at the level of the previous generation's flagship for software engineering tasks. Developers using Claude Code, Anthropic's AI coding tool, and similar agentic coding environments can route simpler code edits, test generation, documentation writing, and routine bug fixes to Haiku 4.5, reserving the larger models for architectural decisions and novel problem-solving.
Claude Code shipped Haiku 4.5 support on the day of the launch, and Anthropic positioned the model as an effective sub-agent for the Claude Code orchestrator. Augment Code published its own internal evaluation showing that Haiku 4.5 completed coding tasks roughly 34% faster than its prior default while reaching about 90% of Sonnet 4.5's quality, with testers preferring Sonnet 4.5 outputs to Haiku 4.5 outputs in 51.4% of head-to-head comparisons.[16] Warp terminal and Zencoder also integrated Haiku 4.5 at launch.[1][12]
GitHub Copilot made Haiku 4.5 available in public preview on October 15, 2025, across Pro, Pro+, Business, and Enterprise tiers. The integration spans Visual Studio Code (chat, ask, edit, and agent modes), github.com, and GitHub Mobile for iOS and Android. Enterprise and Business administrators were required to enable the Haiku 4.5 policy in Copilot settings before team members could select the model.[18]
Developers building new applications often iterate rapidly on prompts, features, and system designs. The low per-token cost of Haiku 4.5 makes it economical to run hundreds of test queries during development without incurring significant expenses, allowing teams to prototype and evaluate AI features before deciding whether a more powerful model is necessary for production.
For interactive developer tools that respond to code as it is written, response latency directly affects the user experience. Claude Haiku 4.5's generation speed of approximately 88.6 tokens per second makes it well suited for code completion, inline suggestions, error explanation, and documentation generation tools that need to respond within a second or two of the developer's action.
Enterprise document processing applications, such as contract review, invoice extraction, or regulatory filing analysis, often require processing large numbers of documents with consistent formatting. Claude Haiku 4.5's vision capabilities, large output window, and low cost per token make it economical for batch document workflows, either in real-time pipelines or through the Message Batches API at a further 50% discount.
For product teams building AI-powered features in free or freemium products, Haiku 4.5 makes the economics of offering AI at scale more viable. A product offering free AI chat or writing assistance to millions of users can absorb the cost of Haiku 4.5 where the same feature at Sonnet pricing would be prohibitive. Anthropic itself made Haiku 4.5 the default model on claude.ai for free-tier users at launch, replacing prior fallbacks and giving every visitor access to a model with extended thinking and computer use rather than a stripped-down small variant.[20]
Reception of Claude Haiku 4.5 was generally positive, with most launch coverage focusing on three threads: the price-to-performance ratio compared with Sonnet 4 from five months earlier, the introduction of extended thinking and computer use to the Haiku tier, and the strategic implication of pairing Haiku 4.5 with Sonnet 4.5 in multi-agent architectures.
TechCrunch's coverage led with the framing that Anthropic was offering "similar performance to Sonnet 4 at one-third the cost and more than twice the speed" and reported the model as immediately available on free Anthropic plans.[12] The New Stack and AI Business framed the release as Anthropic broadening the cost frontier rather than chasing a new capability ceiling.[14][15] SiliconANGLE described the model as Anthropic's "entry-level" hybrid reasoning model and emphasized the multi-agent positioning.[20]
Developer-focused outlets including DataCamp, Caylent, and Augment Code published deep dives in the days following the launch. DataCamp characterized Haiku 4.5 as offering "balanced reasoning, coding, and agentic capability with vision, tool use, and a 200K context window at competitive pricing."[8] Caylent's analysis emphasized the multi-agent opportunity and walked through a sample customer-service workflow where pairing Haiku 4.5 with Sonnet 4.5 cut monthly costs by approximately two thirds compared to a Sonnet-only deployment.[9] Augment Code's internal benchmarks reported a 34% speed improvement on average and a quality score of approximately 90% of Sonnet 4.5, while noting that more complex multi-file refactors still benefit from Sonnet or GPT-5.[16]
Third-party benchmark aggregators reproduced Anthropic's launch numbers without significant disagreement. Artificial Analysis ranked Haiku 4.5 25th on its Intelligence Index but noted that the model is "above average in intelligence and reasonably priced when comparing to other non-reasoning models," with output speed in the top tier of comparable models.[6] LLM-Stats published direct head-to-head comparisons against Sonnet 4 and Haiku 3.5 that reproduced the Anthropic launch figures across SWE-bench, GPQA, AIME, MMMLU, and the Tau2 tracks.[7]
Developer commentary on forums including Hacker News and Reddit was broadly positive, particularly regarding the speed gain over Sonnet 4.5 and the option to drop Haiku 4.5 in as a near-equivalent for many Sonnet 4 workloads. Some users noted that GPQA and MMMLU regressions versus Sonnet 4 are real and that Haiku 4.5 should not be assumed to dominate Sonnet 4 across the board, even though Anthropic's own marketing centered on coding and tool-use parity. Others reported that the inclusion of computer use opened new low-cost automation use cases that had been impractical at the previous Haiku price-capability point.
Anthropic classifies Claude Haiku 4.5 under AI Safety Level 2 (ASL-2) in its Responsible Scaling Policy. The classification is one level below Sonnet 4.5 and Opus 4.5, both of which launched at ASL-3, reflecting Anthropic's view that Haiku 4.5 has more limited capability to provide meaningful uplift in the creation of chemical, biological, radiological, or nuclear (CBRN) weapons or to support autonomous self-replicating activity.[1][3]
The system card for Haiku 4.5 reports that the model stayed below the ASL-3 capability thresholds in Anthropic's pre-launch evaluations across the biology and autonomy domains (a "rule-out" determination), supporting the choice of ASL-2 rather than the more restrictive ASL-3 category. Anthropic's internal misalignment evaluation also found that Haiku 4.5 showed statistically significantly lower rates of misaligned behavior than both Claude Sonnet 4.5 and Claude Opus 4.1, making it, by that specific metric, Anthropic's safest model at the time of release. Improvements over Claude 3.5 Haiku were described as large and clear, while differences between Sonnet 4.5 and Haiku 4.5 on the same metric were small.[3]
One notable wrinkle in the safety profile is evaluation awareness, the same phenomenon flagged in Sonnet 4.5's system card. Haiku 4.5 verbalized awareness that it was operating in an evaluation environment in roughly 9% of test transcripts, even after Anthropic applied filters to remove the most obvious cases. This is a double-edged result: it can complicate safety assessments because the model may behave differently when it suspects it is being tested, but it also means the model is more likely to flag suspicious instructions during evaluation.[3]
The model's safety profile benefits from the same Constitutional AI training methodology and reinforcement learning from human feedback applied across the Claude 4.5 family, including refinements to reduce sycophancy, improve honesty, and follow refusal guidelines consistently. Anthropic publishes a system card for the model on its website and provides detailed safety information through its Transparency Hub.[3] More broadly, the AI safety framing of the release was that the company shipped a smaller, lower-classification model with measurably better alignment than several of its larger predecessors, undercutting the assumption that capability and alignment trade off cleanly.
Claude Haiku 4.5 is available through three primary channels.
Anthropic API provides direct access via the Messages API and Batches API using the model ID claude-haiku-4-5-20251001 or the alias claude-haiku-4-5. Developers access the API using API keys generated through the Anthropic Console.
Amazon Bedrock offers Claude Haiku 4.5 through the bedrock-runtime endpoint with the model ID anthropic.claude-haiku-4-5-20251001-v1:0. AWS made the model available through global cross-region inference at launch, supporting both standard and reserved throughput service tiers. The Bedrock listing emphasizes vision, computer use, and coding parity with Sonnet 4 as the practical advantages over the prior Haiku model.[5]
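A sketch of a Bedrock invocation through the Converse API in boto3; the region and prompt are placeholders:

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")  # assumed region

response = bedrock.converse(
    modelId="anthropic.claude-haiku-4-5-20251001-v1:0",
    messages=[{"role": "user", "content": [{"text": "Classify this support ticket: ..."}]}],
    inferenceConfig={"maxTokens": 1024},
)
print(response["output"]["message"]["content"][0]["text"])
```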
Google Cloud Vertex AI makes the model available under the ID claude-haiku-4-5@20251001. As with other Claude models on Vertex AI, access is subject to Google Cloud region availability and quota policies.
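A sketch using the AnthropicVertex client from the anthropic Python SDK; the project and region values are placeholders:

```python
from anthropic import AnthropicVertex

# Hypothetical project and region; Vertex AI availability varies by region.
client = AnthropicVertex(project_id="my-gcp-project", region="us-east5")

message = client.messages.create(
    model="claude-haiku-4-5@20251001",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello from Vertex AI."}],
)
print(message.content[0].text)
```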
Haiku 4.5 is also available through the consumer-facing claude.ai web and mobile applications, where it became the default model for free-tier users on the day of release. Pro, Max, Team, and Enterprise users continue to default to Sonnet- or Opus-tier models depending on their plan, with Haiku 4.5 selectable from the model picker.[20]
GitHub Copilot's integration in public preview spans Visual Studio Code, github.com, GitHub Mobile, and the GitHub CLI for Pro, Pro+, Business, and Enterprise users. Enterprise and Business administrators must enable the Haiku 4.5 policy in Copilot settings before team members can select the model. Visual Studio Code 1.105 or higher is recommended for full feature support.[18]
Third-party developer platforms with day-one Haiku 4.5 support include Warp terminal, Augment Code, Zencoder, and Gamma, in addition to Anthropic's own Claude Code agent.[1][12][16] Model end-of-life on Amazon Bedrock is no sooner than October 1, 2026, in line with Anthropic's general one-year deprecation window.
The table below lists the public snapshots of Claude Haiku 4.5 published by Anthropic to date.
| Snapshot | API model ID | Release date | Status (May 2026) |
|---|---|---|---|
| Initial release | claude-haiku-4-5-20251001 | October 15, 2025 | Active |
| Alias | claude-haiku-4-5 | October 15, 2025 | Tracks initial snapshot |
As of May 2026, no second snapshot of Claude Haiku 4.5 has been released, and Anthropic has not announced a Claude Haiku 4.6 or Claude Haiku 4.7. The October 2025 snapshot remains the only Haiku-tier model in the Claude 4 generation.[2]
Claude Haiku 4.5 supports a 200,000-token context window, which is substantial but smaller than the 1,000,000-token windows available in Claude Sonnet 4.6 and Claude Opus 4.7. Applications that require processing very long documents, extended conversation histories, or large codebases in a single context may need to use a later-generation Sonnet or Opus model with the expanded window, or implement chunking and retrieval strategies to work within the 200,000-token limit.
The model's 50.7% success rate on OSWorld computer-use benchmarks means that approximately half of complex UI automation tasks require human review or will fail without intervention. Production deployments of computer use with Claude Haiku 4.5 should implement human-in-the-loop checkpoints for consequential actions. The capability is better suited to controlled environments or low-stakes tasks where occasional errors are acceptable than to fully autonomous workflows with significant real-world consequences.
Adaptive thinking, which allows a model to automatically decide whether extended reasoning is warranted for a given query, is available in Claude Sonnet 4.5, Claude Sonnet 4.6, Claude Opus 4.5, and Claude Opus 4.6 but not in Claude Haiku 4.5. Developers using extended thinking with Haiku 4.5 must explicitly enable it and set a thinking budget, rather than allowing the model to decide autonomously. This requires more careful prompt engineering to avoid unnecessary costs from always-on extended thinking on simple queries.
The model's reliable knowledge cutoff is February 2025, meaning its internal knowledge of events, research, and developments after that date may be incomplete or absent. Applications requiring up-to-date information about current events, recently released software libraries, or ongoing scientific developments should supplement the model with retrieval-augmented generation or tool use to access current sources.
While Claude Haiku 4.5 delivers near-frontier performance on coding and many reasoning tasks, it does trail Claude Sonnet 4.5 and Claude Opus 4.5 on the most demanding reasoning and domain-knowledge benchmarks. Tasks requiring deep domain expertise, nuanced judgment, or extended multi-step planning tend to benefit from the larger models. Augment Code's internal evaluations, for instance, showed that for complex multi-file refactors testers preferred Sonnet 4.5 or GPT-5 outputs to Haiku 4.5 outputs more often than for simple, scoped edits.[16]
Despite Anthropic's general framing of Haiku 4.5 as a Sonnet 4 replacement, the model trails Sonnet 4 by 2.4 points on GPQA Diamond, 3.5 points on MMMLU, and 1.2 points on MMMU. For knowledge-heavy or graduate-level reasoning tasks where Sonnet 4 was used in production before October 2025, Haiku 4.5 is not always a drop-in upgrade and may require head-to-head testing on the specific workload before substitution.[1][7]
Claude Opus 4.7 and Claude Sonnet 4.6 support context windows of 1,000,000 tokens and, in the case of Opus 4.7, up to 128,000 output tokens. Claude Haiku 4.5 is limited to 200,000 tokens of context and 64,000 tokens of output, which may be insufficient for certain use cases such as processing entire large codebases in a single pass or generating very long-form content.