MAI-Thinking-1
Last reviewed
Jun 3, 2026
Sources
10 citations
Review status
Source-backed
Revision
v1 · 1,622 words
Improve this article
Add missing citations, update stale details, or suggest a clearer explanation.
Last reviewed
Jun 3, 2026
Sources
10 citations
Review status
Source-backed
Revision
v1 · 1,622 words
Add missing citations, update stale details, or suggest a clearer explanation.
MAI-Thinking-1 is a reasoning model developed by Microsoft AI, unveiled at Microsoft Build 2026 on June 2, 2026 as the company's first in-house flagship reasoning system. [1][2] Built by Microsoft's "MAI Superintelligence Team" under Mustafa Suleyman, the model is a 35-billion-active-parameter mixture-of-experts network with roughly one trillion total parameters and a 256,000-token context window. [1][3] Microsoft says it was trained from scratch on commercially licensed data with no distillation from other companies' models, a deliberate contrast with the way many rivals bootstrap new systems. [1][2] At launch it was offered in private preview on Microsoft Foundry, and Microsoft positioned it as central to a broader push to reduce its dependence on OpenAI. [2][4]
MAI-Thinking-1 was announced as the headline entry in a family of seven new first-party models released at Build 2026, alongside coding, image, voice, and transcription systems. [2][4] It is the first Microsoft model explicitly built for step-by-step chain-of-thought reasoning rather than the instruction-following and consumer Copilot scenarios that defined the company's earlier in-house releases. [1][5] Microsoft describes the target use cases as complex multi-step instructions, long-context reasoning, mathematical problem solving, and code generation. [1][3]
The model fits into a strategy Suleyman, the CEO of Microsoft AI, summarized at the event as "long term self-sufficiency for Microsoft and our partners." [4] For most of the past decade Microsoft built its AI products on top of OpenAI's models, later adding Anthropic as a second supplier. MAI-Thinking-1 is the clearest sign yet that the company intends to supply at least part of that demand from its own labs. [2][4] Microsoft renegotiated its OpenAI partnership in April 2026, ending Microsoft's exclusive license to OpenAI intellectual property, which removed one barrier to shipping competing first-party frontier models. [6]
MAI-Thinking-1 comes from the same Microsoft AI group that began publishing in-house models in 2025. In August 2025 the team released its first two: MAI-Voice-1, an efficient speech-generation model used in features such as Copilot Daily, and MAI-1-preview, a text foundation model aimed at instruction following and consumer Copilot interactions. [7][8] MAI-1-preview was trained on roughly 15,000 Nvidia H100 GPUs, a smaller training footprint than the headline figures cited by some competitors, and it was submitted for public benchmarking on platforms such as LMArena. [7]
The group was reorganized around an explicit goal of "humanist superintelligence," and the unit behind the Build 2026 releases is referred to as the MAI Superintelligence Team. [1][5] Suleyman, who co-founded DeepMind and later Inflection AI before joining Microsoft in 2024 to lead its consumer AI work, framed the team's output as state-of-the-art capability "explicitly designed to serve people and organizations, and not to replace them." [5][9] MAI-Thinking-1 represents the team's move up the capability ladder, from a voice model and a mid-tier consumer text model to a frontier-class reasoning system.
Microsoft describes MAI-Thinking-1 as a sparse mixture-of-experts model with 35 billion active parameters and approximately one trillion total parameters, meaning only a fraction of the network is engaged for any given token. [1][3] That design gives it a smaller inference footprint than dense models of comparable quality, which underpins Microsoft's repeated emphasis on cost efficiency. [3][10] The context window is 256,000 tokens, enough to process roughly a 600-page document in a single pass. [1]
| Attribute | Detail |
|---|---|
| Developer | Microsoft AI (MAI Superintelligence Team) |
| Type | Reasoning model (chain-of-thought) |
| Architecture | Sparse mixture-of-experts |
| Active parameters | 35 billion |
| Total parameters | ~1 trillion |
| Context window | 256,000 tokens |
| Distillation | None (trained from scratch) |
| API | Chat Completions compatible; function calling; developer instructions |
| Availability | Private preview on Microsoft Foundry; MAI Playground |
| Announced | June 2, 2026 (Microsoft Build) |
Source: Microsoft AI announcement and model page. [1][3]
On the developer side the model is compatible with the widely used Chat Completions API, and it supports function calling and multi-layered developer instructions. [1][3] Microsoft also said that, for the first time, developers would be able to tune the weights of one of its in-house models themselves, a capability aimed at enterprises that want a customized system without handing their workflows to a third party. [3][10]
The most heavily promoted aspect of MAI-Thinking-1 is how it was trained. Microsoft says the model was built "from the ground up on enterprise grade, clean and commercially licensed data, without distillation from third-party models," and that AI-generated content was excluded from pre-training. [1] Suleyman stressed that there was no distillation from other companies' models, including OpenAI's GPT series, a pitch aimed squarely at enterprises that care about clean data lineage and the legal provenance of the systems they deploy. [2][4]
Microsoft framed this as a question of trust and control rather than only quality. In the keynote the company argued that customers who train and tune on MAI models keep the benefits of their own workflows and control the resulting model, in contrast with building on a provider whose interests may diverge from theirs. [5] Microsoft also said the model "climbed entirely from the bottom, without specifically targeting benchmarks, and with zero distillation," a claim meant to distinguish genuine capability from benchmark optimization. [5] These are Microsoft's characterizations; as of the announcement the company had published technical materials, including a model card and a paper, but the results had not been independently reproduced by outside labs. [1][2]
Microsoft reported strong results on mathematics and competitive coding evaluations relative to the model's size. On the American Invitational Mathematics Examination, MAI-Thinking-1 scored 97.0 percent on AIME 2025 and 94.5 percent on AIME 2026. [1][3] On SWE-Bench Pro, a software-engineering benchmark, Microsoft reported a score around 53 percent and said the model matches Anthropic's Claude Opus 4.6 on coding tasks. [1][5] In blind side-by-side human evaluations run by Surge, an independent rating partner, raters preferred MAI-Thinking-1 over Claude Sonnet 4.6 for overall quality across single-turn and multi-turn tasks. [1][2]
| Benchmark | MAI-Thinking-1 | Note |
|---|---|---|
| AIME 2025 | 97.0% | Competition mathematics |
| AIME 2026 | 94.5% | Competition mathematics |
| SWE-Bench Pro | ~53% | Microsoft says it matches Claude Opus 4.6 |
| Human preference (Surge) | Preferred over Claude Sonnet 4.6 | Blind side-by-side, single and multi-turn |
Sources: Microsoft AI announcement, model page, and Build 2026 keynote. [1][3][5]
Microsoft presented the human-preference result as evidence that benchmark scores were translating into practical usefulness rather than narrow test performance. [2] The company also pointed to a customer-specific result: after tuning the model for the consulting firm McKinsey, Suleyman said it outperformed OpenAI's GPT-5.5 with roughly ten times better cost efficiency. [4] As with the other figures, these are vendor-reported numbers and should be read with the usual caution that applies to launch-day benchmarks.
The release matters less for any single benchmark than for what it signals about Microsoft's direction. Microsoft has spent years and many billions of dollars building products on OpenAI's models, and more recently it added Anthropic models, including Claude Opus 4.8, to its Foundry catalog. [2] MAI-Thinking-1 shows Microsoft trying to build a credible in-house alternative so it can rely less on suppliers whose commercial interests do not always align with its own. [2][4]
Cost is the other half of the pitch. Microsoft repeatedly described MAI-Thinking-1 as a high-efficiency, low-token-cost model and positioned it as the most cost-efficient frontier-class option in its tier, an argument aimed at budget-conscious enterprise buyers weighing per-token bills across providers. [3][10] The company tied the efficiency story to its own silicon, citing co-design with its Maia accelerators as part of a full-stack ownership strategy spanning chips, models, and tools. [5] If the model holds up under independent testing, it gives Microsoft leverage in pricing negotiations and a fallback should its relationships with frontier labs sour.
There is real skepticism to weigh against the launch claims. MAI-Thinking-1 is Microsoft's first reasoning model, its benchmark results have not been independently verified, and a single flagship release does not by itself replace the breadth of capabilities Microsoft sources from OpenAI and Anthropic. What is clear is the intent: Microsoft wants to own more of its AI stack, and a frontier reasoning model trained without anyone else's data is the most direct statement of that ambition it has made so far.
At announcement, MAI-Thinking-1 was available in private preview on Microsoft Foundry, the company's platform for integrating models into applications, with access by invitation. [1][3] Microsoft also said the model would be accessible through a MAI Playground for public preview, and that its MAI models would be distributed through third-party platforms including Fireworks AI, Baseten, and OpenRouter. [3][5] Microsoft did not publish standard per-token pricing for MAI-Thinking-1 at launch, though it emphasized low token cost relative to comparable frontier models. [3][10]