Grok Code Fast
Last reviewed
May 24, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v1 · 3,447 words
Improve this article
Add missing citations, update stale details, or suggest a clearer explanation.
Last reviewed
May 24, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v1 · 3,447 words
Add missing citations, update stale details, or suggest a clearer explanation.
Grok Code Fast is a family of coding-specialized large language models from xai, the artificial intelligence company founded by elon musk. The first model in the family, grok-code-fast-1, launched in public preview on August 26, 2025 and was formally introduced on August 28, 2025 as "a speedy and economical reasoning model that excels at agentic coding."[^1][^2][^3] The model uses a new architecture purpose-built for low-latency, tool-heavy coding loops, ships with a 256,000-token context window, and is priced at $0.20 per million input tokens and $1.50 per million output tokens through the xAI API.[^1][^4] It was released through partnerships with github copilot, cursor, cline, Kilo Code, roo code, opencode, and windsurf, where it was offered free during the launch window.[^1][^5]
grok-code-fast-1 is xAI's first model marketed specifically at the agentic workflow coding niche dominated by claude sonnet 4 5 and gpt 5 codex, and it was preceded inside Cursor by a stealth release under the codename "sonic" in mid-August 2025.[^6] xAI reported a 70.8% score on swe bench verified using an internal harness,[^1] while independent evaluator Vals.ai measured 57.6% on the same benchmark, a gap that drew commentary on harness differences.[^7] The model was scheduled for deprecation on May 15, 2026 and retirement on August 15, 2026 as xAI pushed users toward the broader Grok 4 Fast and Grok 4.1 Fast families.[^8]
| Developer | xai |
| Model ID | grok-code-fast-1 |
| Launch (public preview) | August 26, 2025[^2] |
| Launch (official blog) | August 28, 2025[^1] |
| Stealth codename | "sonic" (Cursor)[^6] |
| Context window | 256,000 tokens[^1][^4] |
| Throughput | ~92 tokens/sec[^9] |
| Input price | $0.20 / 1M tokens[^1] |
| Output price | $1.50 / 1M tokens[^1] |
| Cached input price | $0.02 / 1M tokens[^1] |
| Modalities | Text in, text out[^8] |
| Tool use | Function calling, structured outputs[^8] |
| Reasoning | Visible thinking traces[^1] |
| Deprecation | May 15, 2026[^8] |
| Retirement | August 15, 2026[^8] |
xAI was founded by Elon Musk in March 2023 with offices in the San Francisco Bay Area and the Memphis data center campus that houses its Colossus training cluster.[^10] The Grok family of conversational and reasoning models grew rapidly through 2024 and 2025. grok 3 launched on February 17, 2025, trained on roughly 200,000 GPUs at Colossus with reported compute ten times larger than its predecessor.[^11] grok 4 followed on July 9, 2025, with a Heavy variant and a $300-per-month SuperGrok Heavy subscription tier, and Musk publicly announced during the Grok 4 livestream that a "specialized coding model" would arrive in August.[^12]
That commitment produced grok-code-fast-1. Internally the project was framed as a counter to claude code, github copilot backends, and OpenAI's gpt 5 codex, all of which had made the coding agent a primary battleground for foundation-model providers in 2025. Coding had also been the workload where Grok generalist models lagged most visibly in livecodebench and swe bench leaderboards, so xAI's decision to ship a separate, dedicated checkpoint rather than tune the flagship Grok 4 was a deliberate market positioning move.[^13]
In mid-August 2025, an unnamed model labeled "sonic" appeared in the model picker inside Cursor for users on the company's Pro tier. Cursor described it only as a fast, low-cost coding model partnered with an unnamed lab. AI Leaks and News, a community account on X, identified Sonic as an xAI model on August 25, 2025.[^6] During this stealth phase xAI iterated on multiple checkpoints in response to community feedback, a deliberate testing strategy aimed at avoiding the launch-day regressions that had affected prior xAI releases.[^9]
GitHub published its public-preview rollout for grok-code-fast-1 in Copilot on August 26, 2025, making the model available to Copilot Pro, Pro+, Business, and Enterprise subscribers and offering complimentary access through September 10, 2025.[^2] xAI's own announcement post at x.ai/news/grok-code-fast-1 followed on August 28, 2025, with simultaneous availability via the xAI API and partner integrations.[^1] Reuters and other outlets covered the launch on August 28.[^14]
GitHub graduated grok-code-fast-1 from public preview to generally available status across Copilot Chat on github.com, GitHub Mobile, Visual Studio Code, Visual Studio, JetBrains IDEs, Xcode, and Eclipse on October 16, 2025. The general-availability rollout required Copilot Business and Enterprise administrators to opt in by enabling a new policy in Copilot settings, while individual subscribers could activate the model directly from the picker.[^15]
On March 4, 2026, GitHub added grok-code-fast-1 to the model pool used by Copilot Free's automatic model selector, exposing the model to users on the free tier without requiring manual selection.[^16] This was the largest expansion of the model's reach during its lifecycle.
Oracle's OCI Generative AI documentation, which mirrors xAI's lifecycle policy for partner clouds, lists grok-code-fast-1 as deprecated on May 15, 2026 and scheduled for retirement on August 15, 2026, after which migration to a supported model is required. The deprecation aligned with broader availability of Grok 4 Fast and Grok 4.1 Fast as xAI consolidated its fast tier under the Grok 4 generation.[^8]
xAI described grok-code-fast-1 as built from scratch with a "brand new model architecture" rather than as a fine-tune of Grok 3 or Grok 4.[^1] The company has not published an official parameter count, system card, or peer-reviewed technical report for the model. Coverage in InfoQ and other outlets characterized the architecture as a mixture of experts design at roughly 314 billion total parameters, citing community estimates rather than disclosed figures; xAI itself has not confirmed those numbers.[^9][^17]
What xAI did disclose is the training-data shape. The pre-training corpus is described as "rich with programming-related content," and the post-training stage used datasets reflecting real-world pull requests and coding tasks. The post-training emphasis is on tool-use behavior, including the specific tool primitives used by agentic IDEs: file reads and writes, grep-style search, terminal command execution, and patch application.[^1] The system is text-only and does not accept images, audio, or PDF inputs.[^8]
The 256,000-token context window is large enough to ingest medium-to-large repositories in a single session and to maintain conversation history across hundreds of tool calls during agentic loops.[^4] xAI placed unusual emphasis on prompt caching for the launch model: cached input tokens are priced at $0.02 per million, a 10x reduction versus uncached input, and the company stated that partner integrations regularly achieve cache hit rates above 90%, suggesting heavy reuse of system prompts and codebase context between turns.[^1]
grok-code-fast-1 exposes summarized thinking traces alongside its final outputs. xAI promoted this as a steerability feature: developers can inspect the model's chain of thought before tool execution and revise prompts when reasoning drifts.[^1] Oracle's documentation, which integrated the model into OCI Generative AI in late 2025, exposes the reasoning trace via the reasoning_content field on streaming chunks, mirroring conventions used by reasoning models from OpenAI and Anthropic.[^8]
Independent measurements reported by PromptLayer and other reviewers place sustained output throughput around 92 tokens per second, though numbers ranging from 90 to 190 tokens per second appear in different harnesses and partner stacks.[^9][^18] Grok's own X account quoted 190 tokens per second at one point, likely reflecting the headline throughput claim used for partner promotions.[^18] Microsoft's Azure AI Foundry catalog cites "up to 160 tokens/second" as a deployment-level figure.[^4]
The model is described as "particularly adept" at TypeScript, Python, Java, Rust, C++, and Go, the six languages xAI singled out as primary training targets.[^1][^8] In Azure AI Foundry's deployment notes the model is positioned for "agentic coding tasks including bug fixes, rapid prototyping, and codebase navigation."[^4]
Specific capabilities advertised on launch:
The model does not natively support image inputs, file uploads, or web browsing inside the xAI API. Web search is handled by partner integrations rather than the base model itself.[^8][^17]
xAI reported 70.8% on the full swe bench verified subset using its own internal harness on the day of launch.[^1] The independent evaluator Vals.ai measured 57.6% on the same benchmark using a different test harness, prompting discussion on Hacker News about harness sensitivity and the lack of public detail on xAI's internal evaluation pipeline.[^7] xAI's reported figure placed grok-code-fast-1 in the same band as Claude Sonnet 4 and below claude opus 4 1 (74.5%) and gpt-5 (74.9%) at launch time.[^14]
On livecodebench, independent aggregators reported that grok-code-fast-1 trailed Gemini 2.5 Pro and GPT-5 on pass@1, ranking in roughly the same band as Claude Sonnet 4 without extended thinking. Reviewers concluded that "competitive-programming-style questions aren't its sharpest edge."[^19] The Vals.ai LiveCodeBench listing scored the model at 0.0% under default harness parameters, an outlier compared with developer experience and likely reflecting harness-specific behavior rather than model capability.[^7]
xAI did not publish a score for the aider polyglot benchmark at launch, and no independent third party has reported a vetted Aider score that survived community scrutiny. Reviewers noted that grok-code-fast-1 is optimized for the kind of multi-turn agentic editing that the Aider workflow exercises, but pass-at-2 numbers comparable to those reported for claude sonnet 4 6 and claude opus 4 7 have not been verified.[^19]
The Artificial Analysis Intelligence Index placed grok-code-fast-1 at a composite score of 29 (rank 20 of 217 evaluated models at the time of the listing), described as "well above average."[^20] On the Artificial Analysis Agentic Index, which measures Terminal-Bench Hard and τ²-Bench Telecom, reviewers reported that grok-code-fast-1 outperformed Grok 4 Fast, despite the latter scoring higher on general coding indexes, an inversion reviewers attributed to the model's tool-loop tuning.[^19]
| Benchmark | Score | Source | Harness |
|---|---|---|---|
| SWE-bench Verified | 70.8% | xAI[^1] | Internal |
| SWE-bench Verified | 57.6% | Vals.ai[^7] | Independent |
| Artificial Analysis Intelligence Index | 29 | Artificial Analysis[^20] | Public |
The launch pricing on the xAI API was set at $0.20 per million input tokens, $1.50 per million output tokens, and $0.02 per million cached input tokens.[^1] The cached-input rate is 10x cheaper than uncached input, a deliberate incentive for agentic-IDE partners to structure their requests around large reusable system prompts.
Compared with the prevailing list prices for coding models at launch time, grok-code-fast-1 undercut claude sonnet 4 5 (which was priced at $3 input / $15 output per million tokens) by roughly an order of magnitude on both axes, and undercut OpenAI's gpt 5 codex tier as well. The trade-off was a smaller context window than competitors offering 1 million or 2 million tokens (Gemini 2.5 Pro, later Grok 4 Fast) and a lower position on the SWE-bench leaderboard.[^9][^17]
Beyond the xAI API, the model was made available free of charge during the launch window through github copilot (free until September 10, 2025), cursor, windsurf (free for Pro and Teams users), cline, Kilo Code, roo code, and opencode.[^1][^2][^5] After September 2 to 10, standard Copilot and partner pricing multipliers applied.[^2]
grok-code-fast-1 is exposed via the xAI Chat Completions API at the grok-code-fast-1 model name. The API supports OpenAI-compatible request and response formats, including function calling, structured outputs, and streaming. Cached input handling is automatic; xAI's runtime computes prefix matches against recent requests on the same API key.[^1]
The launch partner list at the time of public release covered the most-used third-party coding assistants:[^1][^5]
Microsoft made grok-code-fast-1 available through Azure AI Foundry's "Sold Directly by Azure" tier shortly after launch, deployable via serverless or provisioned-throughput options.[^4] Oracle added it to OCI Generative AI with model name xai.grok-code-fast-1.[^8] The model was also routed through openrouter as x-ai/grok-code-fast-1.[^21]
Microsoft's safety evaluation for Azure AI Foundry concluded that grok-code-fast-1 was "less safe than other models" available through the catalog, citing higher rates of harmful content generation and jailbreak vulnerability than other Foundry-hosted models. Microsoft advised customers to conduct their own evaluations and apply mitigations before production deployment, and explicitly did not recommend the model for high-risk use cases involving healthcare, legal advice, or systems used by minors.[^4]
Reception in agentic-IDE communities centered on speed. Cursor's vice president of developer experience Lee Robinson said the model was "seriously fast" during the Sonic stealth phase, and developer reviews repeatedly used "ridiculously fast" or "nearly instantaneous" to describe interactive use.[^14][^18] PromptLayer quoted one early user describing how grok-code-fast-1 "changed how I work" because the latency reduction shifted the model from intermittent assistant to in-flight collaborator.[^18]
GitHub's chief product officer Mario Rodriguez praised grok-code-fast-1's "speed and quality in agentic coding tasks" in the Reuters launch coverage.[^14]
Reception was not uniformly positive. The same PromptLayer review captured a user who said "I do not trust it at all anymore without oversight" after observing the model "mess up" simple tasks, arguing it should be treated as a co-pilot rather than an autonomous coder.[^18] The Hacker News thread on the launch surfaced skepticism about the 70.8% SWE-bench Verified number relative to Vals.ai's 57.6% third-party result, with several commenters noting that xAI had not published its internal harness for reproducibility.[^7]
Microsoft's Azure safety advisory drew additional press attention, particularly because Grok-branded models had previously generated public-relations problems for xAI; the Azure catalog note that grok-code-fast-1 was "less safe than other models" was widely cited.[^4]
Adoption was driven primarily by the free launch window in partner IDEs rather than direct API revenue. GitHub's later expansion of grok-code-fast-1 into Copilot Free's auto-selection pool on March 4, 2026 placed the model in front of a much larger free-tier user base than the original Pro and Business rollout had reached.[^16] The model's general availability inside Copilot on October 16, 2025 was the moment most coverage cited as confirming its place in xAI's product mix.[^15]
grok-code-fast-1 was positioned at launch as a low-latency, low-cost alternative to two specific competitors: claude sonnet 4 5 from Anthropic, which had become the default model in claude code and many third-party IDEs in 2025, and gpt 5 codex from openai, a coding-specialized variant of gpt-5 in openai codex.
| Model | Launch | SWE-bench Verified | Input $ / 1M | Output $ / 1M | Context | Notes |
|---|---|---|---|---|---|---|
grok-code-fast-1 | Aug 2025 | 70.8% (xAI)[^1] | $0.20 | $1.50 | 256K[^1] | Text only, agentic |
| claude sonnet 4 5 | Sep 2025 | High band[^22] | ~$3.00 | ~$15.00 | 200K | Strong agentic |
| claude opus 4 7 | 2026 | High band[^22] | premium | premium | 200K | Frontier coding |
| gpt 5 codex | 2025 | High band[^23] | code-tier | code-tier | 400K | OpenAI coding |
| codestral | 2024 | mid band | open weights | open weights | 32K | Mistral |
xAI's pricing was the lowest in the cohort by roughly an order of magnitude on input, and the cached-input rate of $0.02 per million tokens was unmatched at launch. The trade-off was a lower benchmark ceiling than Claude Opus and GPT-5 Codex and a smaller context window than the long-context offerings that emerged later in 2025 and into 2026.[^9][^17]
The model also competed with codestral from Mistral and with open-weights coding models such as Qwen Coder and DeepSeek Coder, though those occupied the open-weights niche rather than the hosted-API tier that grok-code-fast-1 targeted.[^17]
The Grok release timeline through late 2025 and early 2026 places grok-code-fast-1 between Grok 4 and the Grok 4 Fast generation:[^11][^12][^8]
grok-code-fast-1 (August 26-28, 2025): The first xAI model marketed specifically for coding, distinct from the generalist Grok 4 line.[^1][^2]The deprecation of grok-code-fast-1 on May 15, 2026 effectively folded its mission into the Grok 4 Fast and Grok 4.1 Fast lineages.[^8]
xAI itself acknowledged in launch coverage that benchmarks "don't fully reflect the nuances of real-world software engineering," signaling that the model's positioning prioritized iterative usefulness over leaderboard rank.[^14] The principal documented limitations of grok-code-fast-1 are: