Meta MTIA 300 series
Last reviewed: Jun 3, 2026
Sources: 9 citations
Review status: Source-backed
Revision: v1 · 1,817 words
The Meta MTIA 300 series is a set of four custom AI accelerator chips that Meta unveiled on March 11, 2026, as the next phase of its in-house Meta Training and Inference Accelerator program. The four chips, named MTIA 300, MTIA 400, MTIA 450, and MTIA 500, are designed to roll out over roughly two years on a release schedule Meta describes as "every six months or less." The MTIA 300 is already in production for the ranking and recommendation models that order content in Facebook and Instagram, while the 400, 450, and 500 extend the line toward generative-AI inference through 2027.[1][2] All four are built in partnership with Broadcom and are designed to drop into the same modular racks, which follow Open Compute Project standards.[2][3]
The announcement came a few weeks after Meta signed large multiyear deals to buy chips from Nvidia and AMD, and it fits a broader strategy in which Meta sources silicon from several outside vendors while keeping its own MTIA designs at the center of its AI infrastructure.[2][4]
MTIA is a family of application-specific integrated circuits that Meta designs to run its largest machine learning workloads in its own data centers. The program began publicly in 2023 with MTIA v1, an inference chip aimed at the deep learning recommendation models that decide which posts, reels, and ads appear in users' feeds, followed by a second-generation part in 2024.[5] These ranking and recommendation models, often shortened to R&R, run constantly at a scale of billions of decisions a day, and they lean heavily on memory access and data movement rather than pure arithmetic. For a workload that large, the numbers that matter most are performance per watt and total cost of ownership.[2][5]
Designing the chip and its compiler together lets Meta target the exact shape of its own models, wring more useful work out of each watt, and keep more control over its hardware roadmap. It also reduces how much the company depends on any single outside supplier. Meta has consistently framed MTIA as a complement to the GPUs it buys, not a wholesale replacement, and it continues to purchase very large quantities of merchant silicon for training and other jobs.[2][4]
Meta presented the 300 series as a single roadmap in which each chip evolves out of the one before it, reusing common building blocks rather than starting over. The company said this lineage is what lets it ship a new generation roughly every six months instead of the one-to-two-year cadence common across the industry.[1][2]
| Generation | Primary role | Status and timing |
|---|---|---|
| MTIA 300 | Ranking and recommendation training | In production as of March 2026 |
| MTIA 400 | GenAI plus continued R&R support | In lab testing, headed for data center deployment |
| MTIA 450 | GenAI inference (optimized) | Mass deployment targeted for early 2027 |
| MTIA 500 | GenAI inference (efficiency focused) | Deployment planned for 2027 |
Sources: Meta newsroom and engineering blogs, plus independent reporting.[1][2][6]
The MTIA 300 was optimized for R&R models and is the chip Meta runs in production today for R&R training. Meta describes it as the foundation for the rest of the line, with built-in network interface chiplets and dedicated message engines to move data between accelerators.[2][6] As demand for generative AI grew, Meta says the MTIA 300 design evolved into the MTIA 400, which keeps the R&R capability but adds support for GenAI models. The 400 introduces a 72-accelerator scale-up domain, meaning 72 chips can be wired together in a rack to act as one large pool, and Meta describes its raw performance as competitive with leading commercial products.[2][6]
The MTIA 450 and MTIA 500 are aimed squarely at inference, meaning the serving of already-trained generative-AI models in production. The 450 is the inference-optimized step, and the 500 follows as a more efficiency-focused part. Both are scheduled for mass deployment in 2027, with the 450 arriving first, early in the year.[2][6]
Meta described the gains mostly as relative jumps from one generation to the next rather than a full datasheet. Across the whole progression from the 300 to the 500, the company said memory bandwidth from high bandwidth memory (HBM) rises about 4.5 times and compute throughput in FLOPS rises about 25 times.[2][6]
Step by step, Meta said the MTIA 400 delivers roughly 400 percent more FP8 FLOPS and about 51 percent more HBM bandwidth than the 300. The MTIA 450 doubles HBM bandwidth again versus the 400 and offers about six times the MX4 FLOPS, while the MTIA 500 adds another 50 percent of HBM bandwidth, up to 80 percent more HBM capacity, and about 43 percent more MX4 FLOPS on top of the 450.[2][6] Meta claimed the 450's HBM bandwidth exceeds that of existing leading commercial accelerators, a comparison reporters read against Nvidia parts such as the H100 and H200.[6]
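The step-by-step HBM bandwidth figures can be checked against the whole-series figure by compounding them. The sketch below is only an illustration of that arithmetic: it assumes that "51 percent more" means a 1.51x multiplier, "doubles" means 2.0x, and "another 50 percent" means 1.5x, and the names `HBM_STEPS` and `compound` are made up for this example. The compute steps are not compounded the same way, because the text quotes them in different number formats (FP8 for the first step, MX4 afterward).

```python
# Step-wise HBM bandwidth multipliers as quoted in the text above.
# Assumption: "51 percent more" = 1.51x, "doubles" = 2.0x,
# "another 50 percent" = 1.5x.
HBM_STEPS = [
    ("MTIA 300 -> 400", 1.51),
    ("MTIA 400 -> 450", 2.00),
    ("MTIA 450 -> 500", 1.50),
]


def compound(steps):
    """Multiply per-generation ratios into one overall ratio."""
    total = 1.0
    for _label, ratio in steps:
        total *= ratio
    return total


hbm_total = compound(HBM_STEPS)
print(f"MTIA 300 -> 500 HBM bandwidth: about {hbm_total:.2f}x")
```

Under these assumptions the product comes to about 4.53, which agrees with the roughly 4.5 times increase Meta quoted for the series as a whole.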
Reporting by Tom's Hardware listed approximate per-chip figures that fill in the picture, though Meta did not publish a complete specification sheet. By that account the MTIA 300 draws about 800 watts with roughly 6.1 TB/s of HBM bandwidth and 216 GB of capacity, the 400 about 1,200 watts and 9.2 TB/s, the 450 about 1,400 watts and 18.4 TB/s, and the 500 about 1,700 watts and 27.6 TB/s with 384 to 512 GB of HBM.[6] These power and memory numbers come from press coverage rather than a Meta datasheet, so they are best read as reported rather than official. The chips are built from reusable chiplets: The Register described the 300 as using one compute chiplet with network chiplets, the 400 stepping up to two compute chiplets, and the 500 using a two-by-two compute chiplet arrangement plus a separate chiplet for PCIe connectivity.[7] Meta and Broadcom have not disclosed a manufacturing process node.[6][7]
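Because performance per watt is the metric the program is built around, the reported figures can be turned into a rough bandwidth-per-watt comparison. The following is a minimal sketch that uses only the press-reported numbers above, not official Meta data. It assumes decimal units (1 TB/s = 1,000 GB/s), and the names `REPORTED`, `bandwidth_per_watt`, and `domain_bandwidth_tbps` are hypothetical helpers for this example. The 72-accelerator total is a naive upper bound that assumes every chip in an MTIA 400 scale-up domain reaches its reported bandwidth.

```python
# Per-chip figures as reported by Tom's Hardware (not a Meta datasheet).
REPORTED = {
    "MTIA 300": {"watts": 800, "hbm_tbps": 6.1},
    "MTIA 400": {"watts": 1200, "hbm_tbps": 9.2},
    "MTIA 450": {"watts": 1400, "hbm_tbps": 18.4},
    "MTIA 500": {"watts": 1700, "hbm_tbps": 27.6},
}


def bandwidth_per_watt(chip):
    """Return reported HBM bandwidth in GB/s per watt for one chip."""
    spec = REPORTED[chip]
    # Assumption: decimal units, so 1 TB/s = 1000 GB/s.
    return spec["hbm_tbps"] * 1000 / spec["watts"]


def domain_bandwidth_tbps(chip, accelerators=72):
    """Naive aggregate HBM bandwidth (TB/s) if every chip hits its reported figure."""
    return REPORTED[chip]["hbm_tbps"] * accelerators


for chip in REPORTED:
    print(f"{chip}: {bandwidth_per_watt(chip):.1f} GB/s per watt")

print(f"72 x MTIA 400: {domain_bandwidth_tbps('MTIA 400'):.1f} TB/s aggregate")
```

On these figures, bandwidth per watt stays near 7.6 to 7.7 GB/s per watt for the 300 and 400, then rises to roughly 13.1 for the 450 and 16.2 for the 500. That pattern fits the description of the later chips as the inference-optimized and efficiency-focused parts, though it rests entirely on reported rather than official numbers.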
The headline of the announcement was less any single chip than the speed of the roadmap. Meta attributes the six-month cadence to reuse at every level of the design: the chiplets that make up each chip, and the chassis, racks, and network that house them.[1][2] Because each new generation reuses the same chassis, rack, and networking, a fresh chip can drop into infrastructure that is already deployed, which shortens the gap between taping out silicon and running it in production.[2]
That infrastructure is built to Open Compute Project standards, the open hardware effort Meta helped start. Meta said its MTIA system and rack solutions align with OCP so the chips can be deployed across its data centers without bespoke plumbing for each generation.[2][3] The software side likewise leans on widely used open tools rather than a proprietary stack: PyTorch, the vLLM inference engine, and the Triton kernel language.[2] Building on common ecosystems makes it easier to move models between MTIA and merchant GPUs inside the same fleet.[2][5]
The 300 series landed in the middle of a spending spree. In February 2026 Meta deepened a multiyear deal with Nvidia covering millions of chips, including Blackwell and the upcoming Rubin generation and, for the first time, standalone Grace and Vera CPUs aimed at inference. Days later it signed a roughly 6-gigawatt agreement with AMD spanning several generations of Instinct GPUs, starting with a custom part based on the MI450 architecture.[4][8] Meta has described this as a portfolio approach: It sources silicon from a range of vendors to scale capacity quickly while keeping its own MTIA chips at the core of the strategy.[2][4] CEO Mark Zuckerberg has said the AMD hardware will go mainly toward inference and what Meta calls personal superintelligence work.[8]
Read against that backdrop, the MTIA 300 series is Meta's bid to handle more of its own inference in-house, especially the steady, well-understood serving of R&R and generative models, rather than buying every accelerator from outside. Broadcom, the development partner, indicated Meta plans to install multiple gigawatts of MTIA chips in 2027 and beyond, which signals that Meta intends the custom line to carry a meaningful share of its fleet rather than a token amount.[7]
Meta is not alone in this. Google has shipped successive generations of its TPU since the mid-2010s, Amazon offers Trainium and Inferentia, and Microsoft introduced its Maia accelerator. The motivations rhyme across all of them: better efficiency on their own workloads, tighter control over the hardware and software, and less exposure to the price and supply of any one vendor. Tom's Hardware framed the MTIA lineup as part of a unified push by hyperscalers toward dedicated inference silicon.[9] What sets MTIA apart is how tightly it was first shaped around recommendation models and the memory-heavy access patterns they create, and how aggressively Meta is now iterating it toward generative AI.[2][5]
Much of what is public about the 300 series is comparative or reported rather than a full disclosure. Meta gave relative performance jumps and a deployment timeline but no complete datasheet, no process node, and few absolute figures, so the most concrete per-chip numbers come from press coverage and should be treated as such.[6][7] The 450 and 500 are also still in the future as production parts, with mass deployment targeted for 2027, which means their real-world performance and how much GPU buying they offset remain to be seen.[2][6] The claims that certain MTIA chips beat leading commercial accelerators are Meta's own and were not independently benchmarked at announcement.[6][7] Whether the six-month cadence holds across all four generations, and how much of Meta's inference the custom line ultimately absorbs, will determine how large a role the 300 series plays in the company's infrastructure.[2][9]