Microsoft Azure Maia 100

The Microsoft Azure Maia 100, often referred to as Maia 100, is a custom artificial intelligence accelerator designed by Microsoft for large-scale generative AI workloads running inside its Azure cloud. It was unveiled on November 15, 2023, during the opening day of Microsoft Ignite in Seattle, alongside an Arm-based server CPU called Cobalt 100.[1][2][3] The chip was Microsoft's first internally designed AI accelerator and the public-facing product of a multi-year effort codenamed "Project Athena" that had been under way inside the company since roughly 2019.[4][5]

Maia 100 was positioned as a vertically integrated piece of silicon, co-designed with OpenAI and built specifically to run workloads such as Microsoft Copilot, the Azure OpenAI Service, GitHub Copilot and Bing's generative features.[1][2][6] Manufactured on TSMC's 5-nanometer (N5) process with CoWoS-S advanced packaging, the chip packs 105 billion transistors on a reticle-sized die of roughly 820 mm², making it one of the largest production silicon devices on N5 at the time of its announcement.[1][7][8] It is paired with 64 GB of HBM2E memory delivering 1.8 TB/s of bandwidth, a design point that observers noted was conservative compared to the HBM3-equipped accelerators shipping from NVIDIA and AMD at the same time.[7][9][10]

Although Microsoft initially indicated that Maia 100 would begin rolling out in early 2024 to power Copilot, Azure OpenAI Service and other internal workloads, subsequent reporting in 2024 and 2025 indicated that the chip was never deployed at the volumes Microsoft had originally targeted. Multiple sources, including The Information and Tom's Hardware, reported that Maia 100 ended up being used primarily for internal staff training and validation rather than as a production AI accelerator for OpenAI's flagship models, and that the follow-on inference part, codenamed "Braga" and later released as the Maia 200, was delayed into 2026.[11][12][13] The Maia 200 was formally unveiled by Microsoft on January 26, 2026, with deployments beginning in the company's US Central region near Des Moines, Iowa.[14][15][16]

Background

Microsoft's silicon strategy and Project Athena

Microsoft's drive toward custom silicon dates back to the mid-2010s, when the company moved from buying turnkey datacenter equipment to designing its own servers, racks, and networking hardware in collaboration with the Open Compute Project.[2][17] In 2017 Microsoft began openly publishing custom server designs through OCP, and by 2019 it had begun work on what would become its first AI accelerator. The Information first reported the existence of the project, internally codenamed "Athena," in April 2023, several months before the public Ignite 2023 unveiling.[5][18]

The motivation for Athena was both economic and strategic. By 2023 Microsoft had become the world's largest external customer of NVIDIA H100 GPUs through its partnership with OpenAI, and the global H100 backlog stretched into 2024 with TSMC's CEO publicly warning that the GPU shortage could persist through 2025.[4][18] Microsoft was simultaneously paying for billions of dollars of capacity at neoclouds like CoreWeave and offering refunds for unused GPU reservations.[4] Owning a portion of the inference and training stack would, in principle, give Microsoft leverage over pricing, supply timing, and ultimately the per-token economics of its AI services.[9][11]

The chip was developed by Microsoft's Azure Hardware Systems and Infrastructure (AHSI) group under corporate vice president Rani Borkar, a former Intel executive who joined Microsoft in 2019 to lead its silicon teams.[2][5] Microsoft technical fellow Brian Harry led the engineering organization that designed the chip, and Pat Stemen served as a partner program manager for the broader AHSI effort.[2] By 2023 SemiAnalysis reported that Microsoft had spent on the order of two billion dollars on its silicon initiatives and employed close to a thousand engineers across its various chip teams, including former Apple chip architect Mike Filippo.[5]

Why "Maia"?

Microsoft has not published a formal etymology for the name "Maia." In Greco-Roman mythology, Maia is the eldest of the seven Pleiades (daughters of Atlas and Pleione) and the mother of Hermes; the word also means "midwife" or "mother" in archaic Greek, and the month of May takes its name from her.[19] The Maia 100 is the first product in a numbered series that continues with Maia 200 (codename "Braga"), Maia 200-Refresh ("Braga-R") and a planned 2027 successor known internally as "Clea."[13][16] The companion Arm CPU line is named for the cobalt-blue stars in the Pleiades cluster, reinforcing the celestial-naming theme.[2]

Position in Microsoft's chip portfolio

When Maia 100 launched in November 2023 it was one half of a coordinated two-chip announcement at Ignite. The companion product, the Azure Cobalt 100, was a 128-core Arm-based server CPU built on Arm's Neoverse Compute Subsystems (CSS) N2 platform, combining two 64-core Neoverse N2 tiles on a single TSMC 5-nm die running at 3.4 GHz.[20][21][22] Cobalt 100 entered preview in May 2024 and reached general availability in October 2024, by which time it had been deployed in 32 Azure regions to host workloads such as Microsoft Teams and Azure SQL Database.[21][22] At Ignite 2024 Microsoft expanded the family by announcing the Azure Boost DPU, its first in-house data processing unit, derived from technology acquired in the December 2022 acquisition of Fungible for approximately $190 million, and the Azure Integrated HSM, an in-house hardware security module.[23][24] Together, Maia, Cobalt, Boost and the HSM make up what Microsoft calls a "trifecta" of cloud silicon spanning AI compute, general-purpose compute, network/storage offload and confidential computing.[23]

Architecture

The most authoritative public technical description of Maia 100 came at the Hot Chips 2024 conference in August 2024, where partner SoC architect Sherry Xu and engineering lead Sairam Ramakrishnan presented a 17-slide deck titled "Inside Maia 100."[6][7][25] Microsoft followed up with a companion technical post on its Azure Infrastructure blog the same month.[6]

Process, packaging and die

Maia 100 is fabricated on TSMC's N5 (5-nanometer) process and uses TSMC's CoWoS-S 2.5D interposer packaging to integrate the main system-on-chip die with four stacks of HBM2E memory.[6][7][8] The SoC die is reticle-sized at approximately 820 mm², and the device contains 105 billion transistors, which SemiAnalysis described as "the highest transistor count monolithic die that has ever been disclosed publicly" at the time of unveiling.[5][7][8] The chip is provisioned at 500 watts in production but is designed to support a thermal envelope of up to 700 watts.[7][8][25]

Compute hierarchy

Maia 100's SoC is organized into 16 clusters of four "tiles" each, for a total of 64 tiles per chip.[6][7] Every cluster includes a Cluster Control Processor (CCP) and a Cluster Data Movement Engine (CDMA) that manage access to the L2 cache.[7] Each tile contains:

  • a high-speed tensor unit configured as a 16×R×16 matrix engine, supporting a range of low-precision data types including the OCP MX (Microscaling) numerical formats with 9-bit (MX9) and 6-bit (MX6) variants in addition to BF16;[6][7][25]
  • a vector processor, described by Microsoft as a "loosely coupled superscalar engine" with a custom instruction-set architecture supporting BF16 and FP32 for higher-precision math;[6][7]
  • a direct-memory-access (DMA) engine that natively supports a variety of tensor-sharding schemes for distributed inference and training;[6]
  • hardware semaphores that allow asynchronous programming and overlapping of communication with compute.[6]

In aggregate the chip exposes 500 MB of software-managed L1/L2 scratchpad SRAM, among the largest on-die SRAM footprints disclosed for a contemporary AI accelerator.[7][9] Microsoft says the scratchpad is software-managed rather than hardware-cached, allowing the compiler to schedule data movement explicitly to avoid cache pollution during long-running matmuls.[6][7]
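The difference between a software-managed scratchpad and a hardware cache can be illustrated with a small blocked matrix multiply in which every operand block is staged explicitly before it is used. This is a minimal sketch, not Maia SDK code: `tiled_matmul` and `TILE` are hypothetical names, the block size is a toy value, and the local copies merely stand in for DMA transfers into scratchpad SRAM.

```python
"""Illustrative only: explicit operand staging, as a software-managed scratchpad requires."""

CLUSTERS = 16           # from the text
TILES_PER_CLUSTER = 4   # from the text
TOTAL_TILES = CLUSTERS * TILES_PER_CLUSTER

TILE = 2  # toy block edge (assumption); real hardware would use far larger blocks


def tiled_matmul(a, b, tile=TILE):
    """Multiply two matrices (lists of rows) block by block.

    Each block of `a` and `b` is copied into a local buffer before use,
    standing in for an explicit DMA into scratchpad SRAM.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for k0 in range(0, k, tile):
                # "DMA in": stage the operand blocks explicitly.
                a_blk = [row[k0:k0 + tile] for row in a[i0:i0 + tile]]
                b_blk = [row[j0:j0 + tile] for row in b[k0:k0 + tile]]
                # Compute touches only the staged blocks.
                for i, a_row in enumerate(a_blk):
                    for kk, a_val in enumerate(a_row):
                        for j, b_val in enumerate(b_blk[kk]):
                            c[i0 + i][j0 + j] += a_val * b_val
    return c
```

Because the staging step is written out rather than left to a cache controller, a compiler can decide exactly when each block is fetched and evicted, which is the property the scratchpad design is meant to provide.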

Performance and data types

Microsoft has been deliberately reticent about Maia 100's headline FLOPS figures. The Hot Chips presentation listed peak dense-tensor throughput of 3 petaoperations per second (POPS) at 6-bit precision, 1.5 POPS at 9-bit, and approximately 0.8 POPS at BF16.[25] Third-party reporters noted the chip's emphasis on the OCP Microscaling formats, which were co-developed under the Open Compute Project by Microsoft, Meta, NVIDIA, AMD, Arm, Intel and Qualcomm, and on a separate data-compression engine designed to reduce the bandwidth and capacity pressure that often bottlenecks large-language-model inference.[6][9][10][25] The presentation also disclosed an onboard image-decoder block and confidential-compute features for tenant isolation.[25]
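The general idea behind microscaling formats, in which a block of values shares a single scale factor while each element keeps only a narrow integer, can be sketched as follows. This is a simplified illustration and does not reproduce the MX9 or MX6 bit layouts; the function names, the 8-bit element width and the block contents are assumptions chosen for readability.

```python
import math


def quantize_block(values, element_bits=8):
    """Quantize a block to signed integers that share one power-of-two scale.

    Returns (shared_exponent, integers). Simplified sketch, not an MX encoder.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return 0, [0] * len(values)
    qmax = 2 ** (element_bits - 1) - 1  # largest representable magnitude
    # Smallest power-of-two exponent such that max_abs / 2**exp fits in qmax.
    exp = math.ceil(math.log2(max_abs / qmax))
    scale = 2.0 ** exp
    ints = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return exp, ints


def dequantize_block(exp, ints):
    """Reconstruct approximate values from the shared exponent and integers."""
    scale = 2.0 ** exp
    return [i * scale for i in ints]


if __name__ == "__main__":
    block = [0.5, -1.0, 0.25, 3.0]
    exp, ints = quantize_block(block)
    print(exp, ints, dequantize_block(exp, ints))
```

Storing one exponent per block rather than per element is what lets such formats shrink the per-value footprint, at the cost of precision for elements much smaller than the block maximum.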

Memory subsystem

The accelerator pairs the SoC with four HBM2E stacks, providing 64 GB of high-bandwidth memory at an aggregate 1.8 TB/s.[1][7][8] Microsoft's decision to use HBM2E rather than the HBM3 or HBM3e that NVIDIA, AMD and Google were adopting at the same time has been widely commented on; observers describe it as a deliberately conservative choice intended to control cost and de-risk supply, but one that left Maia 100 capacity- and bandwidth-constrained relative to competing accelerators by the time it was deployed.[9][10][11] SemiAnalysis later wrote that "as expected of first generation silicon, Maia 100 was not manufactured in high volume or deployed for production workloads. The chip was architected before the Gen AI boom, leaving it short on memory bandwidth suitable for inference."[11]
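One way to see why observers called the part bandwidth-constrained is a roofline-style estimate that divides the quoted peak throughput by the quoted HBM bandwidth. The sketch below uses only figures from this article; the roofline model is a generic simplification, not Microsoft's own analysis, and the function names are illustrative.

```python
# Figures quoted in the article; the roofline model itself is a generic simplification.
HBM_BANDWIDTH_BYTES_PER_S = 1.8e12  # 1.8 TB/s
PEAK_OPS_PER_S = {
    "6-bit": 3.0e15,
    "9-bit": 1.5e15,
    "BF16": 0.8e15,  # "approximately 0.8 POPS"
}


def machine_balance(peak_ops_per_s, bandwidth=HBM_BANDWIDTH_BYTES_PER_S):
    """Operations per HBM byte needed before compute, not memory, is the limit."""
    return peak_ops_per_s / bandwidth


def attainable_ops_per_s(intensity, peak_ops_per_s, bandwidth=HBM_BANDWIDTH_BYTES_PER_S):
    """Roofline bound: the lower of the compute peak and bandwidth * intensity."""
    return min(peak_ops_per_s, intensity * bandwidth)


if __name__ == "__main__":
    for fmt, peak in PEAK_OPS_PER_S.items():
        print(f"{fmt}: balance ~ {machine_balance(peak):.0f} ops/byte")
```

Under these assumptions a BF16 kernel would need roughly 444 operations per byte fetched from HBM to reach the compute peak; workloads with lower arithmetic intensity, which is typical of token-by-token LLM decoding, would be limited by memory bandwidth instead.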

Network fabric

Maia 100's most distinctive system-level feature is its networking. Rather than adopt NVIDIA's proprietary NVLink/InfiniBand stack, Microsoft built an Ethernet-based scale-up and scale-out fabric.[6][7][25] Each Maia 100 chip exposes 12 ports of 400 Gigabit Ethernet, yielding 4.8 Tb/s of aggregate per-chip bandwidth.[1][7][25] Microsoft uses a custom RoCE-like (RDMA over Converged Ethernet) protocol with AES-GCM encryption, supports confidential-compute traffic over the back-end fabric, and reports collective-operation bandwidths of up to 4,800 Gbps for all-gather/reduce-scatter and 1,200 Gbps for all-to-all on a single accelerator.[6][7][25] Three of the twelve Ethernet ports are typically used for intra-node communication between Maia chips on the same server tray, and the remaining nine connect to top-of-rack and aggregation switches in a multi-plane topology.[7]

This Ethernet-first design, paired with Microsoft's contributions to the Open Compute Project and its membership in the Ultra Ethernet Consortium, marked a clear strategic divergence from NVIDIA's vertically integrated NVLink + InfiniBand stack, and aligned with similar bets by Meta and Google.[7][9][25]
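The per-chip bandwidth figures follow directly from the port counts given above. The short sketch below reproduces that arithmetic and the three-versus-nine port split; the constant and function names are illustrative, and the byte conversion ignores protocol overhead.

```python
PORTS = 12             # 400 GbE ports per chip, from the text
PORT_GBPS = 400
INTRA_NODE_PORTS = 3   # ports typically used within a server tray, from the text


def aggregate_gbps(ports, gbps_per_port=PORT_GBPS):
    """Raw aggregate line rate in gigabits per second."""
    return ports * gbps_per_port


def gbps_to_gigabytes_per_s(gbps):
    """Convert gigabits per second to gigabytes per second (no overhead modeled)."""
    return gbps / 8


total = aggregate_gbps(PORTS)                          # 4,800 Gb/s = 4.8 Tb/s
scale_up = aggregate_gbps(INTRA_NODE_PORTS)            # 1,200 Gb/s
scale_out = aggregate_gbps(PORTS - INTRA_NODE_PORTS)   # 3,600 Gb/s

if __name__ == "__main__":
    print(total, scale_up, scale_out, gbps_to_gigabytes_per_s(total))
```

Expressed in bytes, the 4.8 Tb/s aggregate corresponds to a raw 600 GB/s per chip before any encryption or protocol overhead.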

Software stack

Maia 100 ships with a dedicated software development kit, often called the Maia SDK, designed to make AI workloads portable from existing GPU code-bases.[6] The SDK exposes two programming models:

  • Triton, the open-source domain-specific language originally developed at OpenAI and now widely supported by AI accelerators, is positioned as the recommended path for agile, portable kernels;[6][26]
  • Maia API, a chip-specific custom programming model designed for kernel authors who need maximum control and performance.[6]

The SDK includes a first-class PyTorch backend that supports both eager and graph execution modes; integrations with ONNX Runtime; a kernel debugger, profiler and visualizer; model-quantization tooling; an inter-Maia communications library that maps NCCL-style collectives onto the Ethernet fabric; and an administration utility called maia-smi modeled on NVIDIA's nvidia-smi.[6] Microsoft says that simple PyTorch models can be retargeted to Maia with a single line of code.[6] Models destined for Azure OpenAI Service are typically traced into ONNX or compiled directly through the Triton path before being loaded onto Maia hardware.[6]
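To illustrate what an NCCL-style collective does, the following pure-Python simulation runs a textbook ring all-reduce, built from a reduce-scatter phase followed by an all-gather phase. It is a generic sketch: the source does not state which algorithms the Maia communications library uses, and `ring_all_reduce` is a hypothetical name rather than an SDK function.

```python
def ring_all_reduce(per_rank_data):
    """Simulate a ring all-reduce (reduce-scatter, then all-gather).

    `per_rank_data[r]` is rank r's vector, pre-split into one chunk per rank;
    each chunk is a single number here to keep the sketch short.
    """
    n = len(per_rank_data)
    data = [list(v) for v in per_rank_data]

    # Reduce-scatter: after n-1 steps, rank r holds the full sum of chunk (r + 1) % n.
    for step in range(n - 1):
        # Snapshot all sends first so every rank "transmits" simultaneously.
        sends = [(r, (r - step) % n, data[r][(r - step) % n]) for r in range(n)]
        for r, chunk, value in sends:
            data[(r + 1) % n][chunk] += value

    # All-gather: circulate the finished chunks around the ring.
    for step in range(n - 1):
        sends = [(r, (r + 1 - step) % n, data[r][(r + 1 - step) % n]) for r in range(n)]
        for r, chunk, value in sends:
            data[(r + 1) % n][chunk] = value
    return data


if __name__ == "__main__":
    print(ring_all_reduce([[1, 2, 3], [10, 20, 30], [100, 200, 300]]))
```

In each step every rank sends one chunk to its neighbor, so all links are busy at once; this is the reason ring-style collectives are a common fit for point-to-point fabrics such as Ethernet.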

Form factor, rack and cooling

Maia 100 is not delivered as a drop-in PCIe card. Microsoft built an entirely new server and rack assembly to host the part because none of the company's existing datacenter racks could accommodate the chip's power, networking and cooling profile.[2][17][27]

The custom rack

The Maia rack is substantially wider than the standard 19-inch racks Microsoft uses elsewhere in Azure, in order to accommodate the dense networking cable harness and high-current power distribution that Maia 100's 500 W operating point and 12×400 GbE per chip require.[2][17][27] Each rack hosts custom-designed server boards containing multiple Maia 100 accelerators along with general-purpose CPU heads. Microsoft has declined to publish exact accelerator counts per rack for the Maia 100 generation, but in subsequent reporting on the follow-on Maia 200 system The Next Platform indicated that a coherent Maia 100 cluster domain consisted of 576 nodes hosting a total of 2,304 compute engines, implying four accelerators per node.[16]
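The reported cluster-domain figures can be combined with the per-chip specifications to give rough aggregates. The sketch below is back-of-envelope arithmetic on numbers quoted in this article; the power total covers accelerators at their provisioned 500 W only and excludes host CPUs, networking and cooling, which the source does not quantify.

```python
NODES = 576                # reported cluster-domain size
COMPUTE_ENGINES = 2304     # reported accelerator count
HBM_GB_PER_CHIP = 64
PROVISIONED_W_PER_CHIP = 500

accelerators_per_node = COMPUTE_ENGINES // NODES
total_hbm_gb = COMPUTE_ENGINES * HBM_GB_PER_CHIP
# Accelerators only; host CPUs, switches and cooling are not included.
accelerator_power_mw = COMPUTE_ENGINES * PROVISIONED_W_PER_CHIP / 1e6

if __name__ == "__main__":
    print(accelerators_per_node, total_hbm_gb, accelerator_power_mw)
```

On these figures a full domain would hold about 147 TB of HBM and draw a little over 1.15 MW for the accelerators alone.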

The "sidekick"

Because Maia 100 is a liquid-cooled part operating at 500-700 watts and Microsoft's existing Azure datacenters were not built with the large facility-level chilled-liquid plant required for traditional direct-to-chip cooling, Microsoft engineered an in-aisle companion called the sidekick.[2][17][27] The sidekick is a self-contained cooling unit that sits adjacent to each Maia rack and operates analogously to an automotive radiator: cold liquid flows from the sidekick to cold plates affixed to each Maia 100 chip, the plates' microchannels absorb heat from the silicon, and the warmed liquid returns to the sidekick where it is recooled and recirculated.[2][17][27]

This closed-loop, rack-adjacent design allowed Microsoft to deploy Maia hardware into existing datacenters that lacked facility-scale liquid plant, a real constraint given the company's global footprint of pre-AI-era halls, and was later described by Microsoft as the basis for its broader move toward direct-to-chip and microfluidic cooling for future generations of silicon.[27][28]
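The scale of the cooling task can be estimated with the standard sensible-heat relation, heat = mass flow × specific heat × temperature rise. The sketch below is a rough illustration only: Microsoft has not disclosed the sidekick's coolant or operating temperatures, so the water-like specific heat and the 10 K temperature rise are assumptions, and the function name is hypothetical.

```python
WATER_SPECIFIC_HEAT_J_PER_KG_K = 4186.0  # assumption: water-like coolant


def required_flow_kg_per_s(heat_w, delta_t_k,
                           specific_heat=WATER_SPECIFIC_HEAT_J_PER_KG_K):
    """Coolant mass flow needed to carry `heat_w` watts at a `delta_t_k` rise."""
    return heat_w / (specific_heat * delta_t_k)


if __name__ == "__main__":
    # 500 W provisioned and 700 W maximum come from the text; 10 K is assumed.
    for watts in (500, 700):
        flow = required_flow_kg_per_s(watts, delta_t_k=10.0)
        print(f"{watts} W -> {flow * 60:.2f} kg/min per chip")
```

Under these assumptions each chip needs on the order of one kilogram of coolant per minute, which gives a sense of why a rack-adjacent unit can serve many accelerators without a facility-scale plant.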

Deployment and workloads

Stated intent at launch

When Microsoft unveiled Maia 100 on November 15, 2023, the company said the chip would begin "rolling out early next year" to Azure datacenters, where it would initially run Microsoft Copilot, Azure OpenAI Service, GitHub Copilot, Bing search and other internal AI workloads.[1][2][4] In a Bloomberg interview around the announcement, Rani Borkar said Microsoft was already testing Maia 100 against the Bing chatbot and GitHub Copilot, and against GPT-3.5-Turbo running on OpenAI's behalf, notably not GPT-4.[29]

OpenAI's involvement was central to Microsoft's framing of the chip. In the announcement materials, OpenAI CEO Sam Altman said: "Since first partnering with Microsoft, we've collaborated to co-design Azure's AI infrastructure at every layer for our models and unprecedented training needs… Azure's end-to-end AI architecture, now optimized down to the silicon with Maia, paves the way for training more capable models and making those models cheaper for our customers."[2] Microsoft executive vice president of Cloud and AI Scott Guthrie was quoted in parallel: "At scale, it's important for us to optimize and integrate every layer of the infrastructure stack to maximize performance, diversify our supply chain and give customers infrastructure choice."[1][2]

Actual deployment

Public information about Maia 100's production deployment is sparse, and by 2024 it had become clear that the chip was not deployed at the scale Microsoft had implied. Several reports converged on a picture in which Maia 100 served primarily as a learning vehicle for Microsoft's silicon and software teams rather than as a major production accelerator:

  • Tom's Hardware, citing reporting by The Information, wrote in June 2025 that Maia 100 "was designed for image processing rather than generative AI, and isn't powering any of the company's AI services, and has instead just been used internally for staff training purposes."[12]
  • DCD and Tom's Hardware separately reported that Microsoft had pushed the mass production of the next-generation Maia chip, codenamed Braga and intended to become the Maia 200, into 2026, at least six months behind the original 2025 target. The reports cited unanticipated design changes, including new features requested by OpenAI that destabilized the chip in simulation, plus high staff turnover on the chip-design teams.[12][13]
  • SemiAnalysis assessed that "as expected of first generation silicon, Maia 100 was not manufactured in high volume or deployed for production workloads," and noted that the chip was "architected before the Gen AI boom, leaving it short on memory bandwidth suitable for inference."[11]
  • Industry analysts at IT Pro and others noted that Maia 100 was "currently limited to internal Microsoft workloads rather than broad enterprise availability," with no Azure VM SKU offered to external customers.[9]

Microsoft itself has continued to point to Maia 100 as a deployed accelerator powering Copilot and Azure OpenAI workloads, but the company has not disclosed concrete deployment volumes, performance numbers against NVIDIA H100 or H200, or customer-facing Maia VM products, and external Maia 100 instances have not been generally available on Azure.[1][6][9][11]

Subsequent chips

In the wake of Maia 100's limited rollout, Microsoft built a multi-generation roadmap of inference-oriented successors:

  • Maia 200 (codename Braga): Microsoft's second-generation AI accelerator, built on TSMC's N3P (3-nanometer performance variant) with approximately 144 billion transistors on an 836 mm² die, 216 GB of HBM3e at 7 TB/s, 272 MB of on-die SRAM, native FP8 and FP4 tensor cores, and a 750 W thermal envelope. It was formally announced on January 26, 2026, and is being deployed initially in Microsoft's US Central region near Des Moines, Iowa, with US West 3 near Phoenix, Arizona to follow.[14][15][16] Microsoft claims 10.15 PFLOPS of FP4 throughput, 5.07 PFLOPS of FP8, 1.27 PFLOPS of BF16 on the vector engines, and "30% better performance per dollar than the latest generation hardware in [Microsoft's] fleet."[14][16] Notably, where Maia 100 supported both training and inference with the MX9/MX6 microscaling formats, Maia 200's tensor units support only FP4 and FP8, reflecting Microsoft's pivot to an inference-first focus.[16]
  • Braga-R: a planned refresh of Maia 200 targeted at 2026 deployment per a roadmap reported by DCD.[13]
  • Clea: a third-generation accelerator targeted at 2027 deployment in the same roadmap.[13]
  • Cobalt 200: a successor to the Cobalt 100 server CPU, expected to deliver roughly 50 percent more performance than Cobalt 100 and to act as the host processor in Maia 200 server blades.[16]
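The generational change between Maia 100 and Maia 200 can be summarized as ratios of the headline specifications quoted in this article. The sketch below is simple arithmetic on those figures; the dictionary keys and function name are illustrative, and the transistor count for Maia 200 is the approximate value given above.

```python
MAIA_100 = {"transistors_b": 105, "hbm_gb": 64, "hbm_tb_s": 1.8, "sram_mb": 500}
MAIA_200 = {"transistors_b": 144, "hbm_gb": 216, "hbm_tb_s": 7.0, "sram_mb": 272}


def generational_ratio(key):
    """Maia 200 value divided by the Maia 100 value for one specification."""
    return MAIA_200[key] / MAIA_100[key]


if __name__ == "__main__":
    for key in MAIA_100:
        print(f"{key}: {generational_ratio(key):.2f}x")
```

The ratios show memory capacity and bandwidth growing by more than three times while on-die SRAM shrinks, consistent with the shift toward inference described above.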

The Maia 200 announcement omitted any external Azure VM availability and was framed entirely around Microsoft's internal AI services such as OpenAI API inference, Microsoft Foundry, and the Office 365 family of copilots, a continuation of the internal-only deployment pattern set by Maia 100.[16]

Reception and commercial reality

Industry context

Maia 100 arrived in a market that had already produced multiple generations of hyperscaler-designed accelerators. Google's TPU line had been in production since 2015; AWS had been offering Trainium since 2022, with its second-generation Trainium 2 reaching general availability in 2024; and Meta had begun publicly discussing its MTIA accelerator in 2023.[10][30] Maia 100 made Microsoft the last of the four major US hyperscalers (Amazon, Google, Meta, and Microsoft) to unveil a public-facing in-house AI accelerator.[5]

Reaction within the technical press was mixed. ServeTheHome, Tom's Hardware and The Next Platform praised the chip's Ethernet-first scale-up fabric, large software-managed SRAM, and aggressive adoption of the OCP microscaling number formats.[7][25][31] Less flattering coverage centered on Maia 100's HBM2E choice. TechRadar wrote that Microsoft "deliberately chose to use old tech for its NVIDIA GPU rival," noting that the chip's HBM2E memory capped it at 64 GB and 1.8 TB/s at a time when NVIDIA H100 was shipping with 80 GB of HBM3 at 3.35 TB/s and NVIDIA H200 would shortly arrive with HBM3e.[10]

Comparison with peer chips

Against the leading merchant and hyperscaler accelerators in 2024, Maia 100 occupied a distinct niche. Approximate, publicly disclosed figures are summarized below:

  • NVIDIA H100 (Hopper, 2022/23): 80 GB HBM3, 3.35 TB/s, ~989 BF16 TFLOPS dense (1979 with sparsity), proprietary NVLink + InfiniBand.[10][30]
  • NVIDIA H200 (Hopper-refresh, 2024): 141 GB HBM3e, 4.8 TB/s.[10]
  • AMD Instinct MI300X (2023/24): 192 GB HBM3, 5.3 TB/s, Infinity Fabric.[10]
  • Google TPU v5p / Ironwood: Inter-chip interconnect (ICI) optical fabric, hundreds of GB HBM, large pods (Ironwood reached general availability in November 2025).[30]
  • AWS Trainium 2 (GA 2024): Claimed 30-40% better price-performance than H100-based p5e instances; integrated with the Neuron SDK.[30]
  • Microsoft Maia 100 (2023): 64 GB HBM2E, 1.8 TB/s, 500 MB on-die SRAM, 12 × 400 GbE Ethernet (4.8 Tb/s), MX9/MX6/BF16/FP32, 500 W (700 W max).[7][9][10][25]

Maia 100's defining differentiator is therefore not raw tensor throughput, where it lagged H100 and MI300X, but its very large on-chip SRAM, its Ethernet-based fabric, and Microsoft's tight vertical integration of compiler, runtime and rack hardware.[7][9][25][31]
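The memory gap described above can be made concrete by expressing each merchant GPU's HBM capacity and bandwidth as a multiple of Maia 100's. The sketch below uses only the approximate figures listed in this section; the names are illustrative, and accelerators for which the list gives no exact numbers are omitted.

```python
MAIA_100 = {"hbm_gb": 64, "bandwidth_tb_s": 1.8}
PEERS = {
    "NVIDIA H100": {"hbm_gb": 80, "bandwidth_tb_s": 3.35},
    "NVIDIA H200": {"hbm_gb": 141, "bandwidth_tb_s": 4.8},
    "AMD Instinct MI300X": {"hbm_gb": 192, "bandwidth_tb_s": 5.3},
}


def ratios_vs_maia(peer):
    """Return each peer specification as a multiple of the Maia 100 value."""
    return {key: peer[key] / MAIA_100[key] for key in MAIA_100}


if __name__ == "__main__":
    for name, spec in PEERS.items():
        r = ratios_vs_maia(spec)
        print(f"{name}: {r['hbm_gb']:.2f}x capacity, {r['bandwidth_tb_s']:.2f}x bandwidth")
```

On these figures the H100 offers about 1.9 times Maia 100's memory bandwidth and the MI300X about 2.9 times, with three times the capacity.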

Commercial outcome

By the time Microsoft announced Maia 200 in January 2026, public assessments of Maia 100's commercial significance were uniformly muted. Bloomberg, Tom's Hardware and DCD wrote that Microsoft had effectively used Maia 100 as a first iteration to debug its silicon, packaging, networking and software pipeline rather than as a major production accelerator, and was placing its real volume bet on Maia 200 (which DIGITIMES reported was expected to ship at more than ten times the volume of Maia 100).[11][12][13][15] In parallel, Microsoft expanded rather than contracted its commitments to NVIDIA and AMD: at Ignite 2023 the company simultaneously announced new NC H100 v5 and ND H200 v5 GPU instances and the ND MI300 v5 series based on the AMD Instinct MI300X, and it continued to be one of NVIDIA's largest customers through 2025 and 2026.[1][2][32]

The Maia 100 launch nevertheless had two lasting consequences. First, it established a durable Microsoft AI silicon team and product cadence: by 2026 the company had a multi-year roadmap covering Maia 200, Braga-R and Clea on the AI side and Cobalt 100 and Cobalt 200 on the CPU side, supplemented by the Azure Boost DPU and Azure Integrated HSM.[13][16][23] Second, it cemented Microsoft's strategic choice of an Ethernet-based, OCP-aligned, open-standards scale-up fabric in opposition to NVIDIA's vertically integrated NVLink + InfiniBand stack, a choice subsequently doubled down on in Maia 200's "AI Transport Layer" Ethernet design and in Microsoft's leadership role in the Ultra Ethernet Consortium.[6][7][16][25]

In that sense, even though the chip itself never powered OpenAI's flagship models at scale, Maia 100's most important legacy may lie not in the silicon it shipped but in the platform (racks, sidekicks, compilers, ONNX integration and Ethernet fabric) that it forced Microsoft to design and that succeeding generations of Azure AI silicon now inherit.

References

  1. Tom Warren and Jay Peters. "Microsoft is finally making custom chips, and they're all about AI." The Verge, November 15, 2023. https://www.theverge.com/2023/11/15/23960345/microsoft-cpu-gpu-ai-chips-azure-maia-cobalt-specifications-cloud-infrastructure
  2. John Roach. "With a systems approach to chips, Microsoft aims to tailor everything 'from silicon to service' to meet AI demand." Microsoft Source, November 15, 2023. https://news.microsoft.com/source/features/ai/in-house-chips-silicon-to-service-to-meet-ai-demand/
  3. Microsoft Azure. "Azure Maia for the era of AI: From silicon to software to systems." Microsoft Azure Blog, November 15, 2023. https://azure.microsoft.com/en-us/blog/azure-maia-for-the-era-of-ai-from-silicon-to-software-to-systems/
  4. Kyle Wiggers. "Microsoft looks to free itself from GPU shackles by designing custom AI chips." TechCrunch, November 15, 2023. https://techcrunch.com/2023/11/15/microsoft-looks-to-free-itself-from-gpu-shackles-by-designing-custom-ai-chips/
  5. Dylan Patel and Daniel Nishball. "Microsoft Infrastructure - AI & CPU Custom Silicon Maia 100, Athena, Cobalt 100." SemiAnalysis, November 15, 2023. https://newsletter.semianalysis.com/p/microsoft-infrastructure-ai-and-cpu
  6. Microsoft Azure Infrastructure Blog. "Inside Maia 100: Revolutionizing AI Workloads with Microsoft's Custom AI Accelerator." Microsoft Tech Community, August 2024. https://techcommunity.microsoft.com/blog/azureinfrastructureblog/inside-maia-100-revolutionizing-ai-workloads-with-microsofts-custom-ai-accelerat/4229118
  7. Patrick Kennedy. "Microsoft MAIA 100 AI Accelerator for Azure." ServeTheHome, August 2024. https://www.servethehome.com/microsoft-maia-100-ai-accelerator-for-azure/
  8. Anthony Garreffa. "Microsoft lifts the lid on its new AI chip, Maia 100, up to 700W TDP, built for large-scale AI." TweakTown, August 28, 2024. https://www.tweaktown.com/news/100264/microsoft-lifts-the-lid-on-its-new-ai-chip-maia-100-up-to-700w-tdp-built-for-large-scale/index.html
  9. Solomon Klappholz. "What is Microsoft Maia?" IT Pro, 2024-2025. https://www.itpro.com/infrastructure/what-is-microsoft-maia
  10. Wayne Williams. "Microsoft deliberately chose to use old tech for its Nvidia GPU rival: Maia 100 AI accelerator uses HBM2E memory and the mysterious ability to 'unlock new capabilities' via firmware update." TechRadar, 2024. https://www.techradar.com/pro/microsoft-deliberately-chose-to-use-old-tech-for-its-nvidia-gpu-rival-maia-100-ai-accelerator-uses-hbm2e-memory-and-the-mysterious-ability-to-unlock-new-capabilities-via-firmware-update
  11. Dylan Patel et al. "Microsoft's AI Strategy Deconstructed - from Energy to Tokens." SemiAnalysis, 2025. https://newsletter.semianalysis.com/p/microsofts-ai-strategy-deconstructed
  12. Anton Shilov. "Microsoft's own AI chip delayed six months in major setback: in-house chip now reportedly expected in 2026, but won't hold a candle to Nvidia Blackwell." Tom's Hardware, June 2025. https://www.tomshardware.com/tech-industry/semiconductors/microsofts-own-ai-chip-delayed-six-months-in-major-setback-in-house-chip-now-reportedly-expected-in-2026-but-wont-hold-a-candle-to-nvidia-blackwell
  13. Sebastian Moss. "Microsoft delays production of Maia AI chip to 2026 - report." DatacenterDynamics, June 30, 2025. https://www.datacenterdynamics.com/en/news/microsoft-delays-production-of-maia-100-ai-chip-to-2026-report/
  14. Microsoft. "Maia 200: The AI accelerator built for inference." Microsoft Blog, January 26, 2026. https://blogs.microsoft.com/blog/2026/01/26/maia-200-the-ai-accelerator-built-for-inference/
  15. Maria Deutscher. "Microsoft's next-gen Maia 200 chip promises massive performance boost for AI workloads." SiliconANGLE, January 26, 2026. https://siliconangle.com/2026/01/26/microsofts-next-gen-maia-200-chip-promises-massive-performance-boost-ai-workloads/
  16. Timothy Prickett Morgan. "Microsoft Takes On Other Clouds With 'Braga' Maia 200 AI Compute Engines." The Next Platform, January 28, 2026. https://www.nextplatform.com/ai/2026/01/28/microsoft-takes-on-other-clouds-with-braga-maia-200-ai-compute-engines/4092134
  17. Rich Miller. "Microsoft Unveils Custom-Designed Data Center AI Chips, Racks and Liquid Cooling." Data Center Frontier, November 15, 2023. https://www.datacenterfrontier.com/machine-learning/article/33015123/microsoft-unveils-custom-designed-data-center-ai-chips-racks-and-liquid-cooling
  18. Anton Shilov. "Microsoft Building Its Own AI Chip on TSMC's 5nm Process." Tom's Hardware. https://www.tomshardware.com/news/microsoft-athena-ai-chip-tsmc
  19. Wikipedia. "Maia." https://en.wikipedia.org/wiki/Maia
  20. Patrick Kennedy. "Microsoft Azure Cobalt 100 128 Core Arm Neoverse N2 CPU Launched." ServeTheHome, November 15, 2023. https://www.servethehome.com/microsoft-azure-cobalt-100-128-core-arm-neoverse-n2-cpu-launched/
  21. Tobias Mann. "Microsoft Arm-based Cobalt 100 CPU VMs in Azure preview." The Register, May 22, 2024. https://www.theregister.com/2024/05/22/microsofts_armbased_cobalt_100_cpu/
  22. Tobias Mann. "Microsoft's Arm-based Cobalt 100 CPU VMs go live in Azure." The Register, October 21, 2024. https://www.theregister.com/2024/10/21/microsoft_arm_cobalt_100_cpu/
  23. Sebastian Moss. "Microsoft announces in-house Arm CPU and AI accelerator chips, custom racks." DatacenterDynamics, November 15, 2023. https://www.datacenterdynamics.com/en/news/microsoft-announces-in-house-arm-cpu-and-ai-accelerator-chips-custom-racks/
  24. Chris Mellor. "Microsoft bolsters Azure infra with Fungible-derived DPU." Blocks & Files, November 20, 2024. https://blocksandfiles.com/2024/11/20/microsoft-boosts-azure-infrastructure-with-fungible-derived-dpu/
  25. Sherry Xu and Sairam Ramakrishnan. "Inside Maia 100." Hot Chips 2024 conference presentation, August 2024. https://hc2024.hotchips.org/assets/program/conference/day2/81_HC2024.Microsoft.Xu.Ramakrishnan.final.v2.pdf
  26. Sayan Sen. "Microsoft shares more details on Maia 100, its first custom AI chip." Neowin, 2024. https://www.neowin.net/news/microsoft-shares-more-details-on-maia-100-its-first-custom-ai-chip/
  27. Maginative. "Microsoft Unveils Azure Maia 100 and Cobalt 100 Chips: Custom Silicon for AI and Cloud Workloads." November 15, 2023. https://www.maginative.com/article/microsoft-unveils-azure-maia-100-and-cobalt-100-chips-custom-silicon-for-ai-and-cloud-workloads/
  28. Sebastian Moss. "Microsoft adopting direct-to-chip liquid cooling, exploring microfluidics." DatacenterDynamics, 2024. https://www.datacenterdynamics.com/en/news/microsoft-adopting-direct-to-chip-liquid-cooling-exploring-microfluidics/
  29. Dina Bass. "Microsoft Unveils Its First Custom-Designed AI, Cloud Chips." Bloomberg, November 15, 2023. https://www.bloomberg.com/news/articles/2023-11-15/microsoft-unveils-its-first-custom-designed-ai-cloud-chips
  30. Kif Leswing. "Nvidia Blackwell, Google TPUs, AWS Trainium: Comparing top AI chips." CNBC, November 21, 2025. https://www.cnbc.com/2025/11/21/nvidia-gpus-google-tpus-aws-trainium-comparing-the-top-ai-chips.html
  31. Paul Alcorn. "Microsoft Reveals Custom 128-Core Arm Datacenter CPU, Massive Maia 100 GPU Designed for AI." Tom's Hardware, November 15, 2023. https://www.tomshardware.com/news/microsoft-azure-maia-ai-accelerator-cobalt-cpu-custom
  32. Anton Shilov. "Microsoft introduces newest in-house AI chip: Maia 200 is faster than other bespoke Nvidia competitors, built on TSMC 3nm with 216GB of HBM3e." Tom's Hardware, January 2026. https://www.tomshardware.com/pc-components/cpus/microsoft-introduces-newest-in-house-ai-chip-maia-200-is-faster-than-other-bespoke-nvidia-competitors-built-on-tsmc-3nm-with-216gb-of-hbm3e
