Fractile
Last reviewed: Jun 3, 2026
Sources: 16 citations
Review status: Source-backed
Revision: v1 · 1,627 words
Fractile is a British AI hardware startup, based in London, that is building in-memory computing chips to speed up and cut the cost of running large AI models. The company was founded in 2022 by Dr. Walter Goodwin, a robotics and machine-learning PhD from the University of Oxford. Its pitch is narrow and specific: the thing slowing AI down is no longer training the models but serving them, and the bottleneck in serving them is memory. Fractile designs accelerators that perform the arithmetic of AI inference inside the memory that holds a model's weights, rather than shuttling those weights back and forth to a separate chip. In May 2026 it raised a $220 million round, one of the larger European semiconductor financings of the year, to take its first chip through tape-out toward a planned 2027 launch. [1][2][3]
Modern transformer models are dominated by matrix multiplication, and during inference the chip has to read the model's billions of parameters from memory for every token it generates. On a GPU, those parameters live in high-bandwidth memory (HBM) sitting next to the processor, and moving them across that gap eats most of the time and most of the energy. Engineers call this the "memory wall": compute throughput has grown far faster than the bandwidth feeding it, so for single-stream inference the processor spends much of its time waiting on data rather than calculating. [4][5]
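The memory-wall argument can be made concrete with a back-of-envelope bound. The sketch below assumes that generating one token requires streaming every weight from memory once, so single-stream throughput cannot exceed memory bandwidth divided by model size. The function name, model size, weight precision, and bandwidth are illustrative assumptions, not figures from Fractile or the cited sources.

```python
def bandwidth_bound_tokens_per_second(num_params, bytes_per_param, bandwidth_bytes_per_s):
    """Upper bound on single-stream decode rate when each generated token
    requires reading every weight from memory exactly once."""
    model_bytes = num_params * bytes_per_param
    return bandwidth_bytes_per_s / model_bytes


# Illustrative assumptions only: a 70-billion-parameter model,
# 16-bit (2-byte) weights, and 3 TB/s of memory bandwidth.
rate = bandwidth_bound_tokens_per_second(
    num_params=70e9,
    bytes_per_param=2,
    bandwidth_bytes_per_s=3e12,
)
print(f"Bandwidth-bound ceiling: {rate:.1f} tokens/s")
```

Under these assumptions the ceiling is about 21 tokens per second no matter how much arithmetic the chip can perform. Batching many requests spreads each weight read across users, but it does not make any single stream faster.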
Goodwin has framed the consequence in business terms. "Inference is both the revenue engine of the AI industry and the rate-limiting factor on expanding it," he told reporters around the 2026 round. As models are pushed to tackle longer, harder tasks, a single answer can run to tens of millions of tokens. At a typical generation rate of roughly 40 tokens per second, Fractile points out, an output of around 100 million tokens would take about a month to finish. The company's argument is that pulling that down to something usable, on the order of a thousand tokens per second, is what makes ambitious agentic and research workloads economically viable at all. [5][6]
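The arithmetic behind the "about a month" figure is easy to check. The short sketch below uses the token count and generation rates quoted above and assumes a constant rate with no pauses; the function name is a hypothetical helper for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def generation_days(total_tokens, tokens_per_second):
    """Wall-clock days needed to emit total_tokens at a constant rate."""
    return total_tokens / tokens_per_second / SECONDS_PER_DAY


total_tokens = 100_000_000  # output size quoted in the text

slow_days = generation_days(total_tokens, 40)      # roughly 28.9 days
fast_days = generation_days(total_tokens, 1_000)   # roughly 1.2 days

print(f"At 40 tokens/s:    {slow_days:.1f} days")
print(f"At 1,000 tokens/s: {fast_days * 24:.1f} hours")
```

At 40 tokens per second the job takes just under 29 days; at a thousand tokens per second the same output finishes in a little over a day.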
Fractile's approach is a form of compute-in-memory, which it has branded "memory-compute fusion." Instead of separating storage and arithmetic, the design keeps a model's weights in on-chip SRAM and performs the matrix multiplications within the memory arrays that hold them, with the compute logic placed alongside the storage cells. Because the parameters never have to be fetched from off-chip DRAM, the company says, the design removes the energy and latency penalty that defines GPU inference. Fractile has been explicit that it does not rely on conventional high-bandwidth memory and that it co-locates compute with SRAM on the die. [1][6][7]
Reporting on the exact memory architecture has not been fully consistent. Most coverage, including Tom's Hardware and The Next Web, describes a DRAM-less, SRAM-based in-memory design, which matches Fractile's own framing; at least one account muddled this point. What the company has consistently claimed since exiting stealth is a circuit-level result: that its novel circuits can execute about 99.99% of the operations needed to run model inference, and that the architecture targets roughly 20 times the operations-per-watt of any competing hardware it is aware of. [1][7][8]
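One way to see why the coverage figure matters is Amdahl's law, which caps overall speedup by the share of work left unaccelerated. The sketch below is an illustration rather than Fractile's own analysis: it assumes every operation costs the same amount of time on the baseline, which the sources do not state, and the per-operation speedups are hypothetical.

```python
def overall_speedup(accelerated_fraction, accelerated_speedup):
    """Amdahl's law: the unaccelerated remainder runs at the original speed."""
    remainder = 1.0 - accelerated_fraction
    return 1.0 / (remainder + accelerated_fraction / accelerated_speedup)


coverage = 0.9999  # share of operations quoted in the text

# Hypothetical speedups for the covered operations.
for per_op_speedup in (10, 100, 1000):
    total = overall_speedup(coverage, per_op_speedup)
    print(f"{per_op_speedup:>5}x on covered ops -> {total:.1f}x overall")

# Limit as the covered operations become infinitely fast.
ceiling = 1.0 / (1.0 - coverage)
print(f"Ceiling from the uncovered remainder: about {ceiling:.0f}x")
```

With 99.99% coverage the uncovered remainder barely dents a 100-fold target; at 99% coverage the same 100-fold per-operation speedup would yield only about 50 times overall.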
The headline performance numbers should be read as engineering targets rather than benchmarked production results. At its 2024 stealth launch the company said it was aiming for inference that is 100 times faster and 10 times cheaper than current GPU setups. More recent investor-facing materials have framed the goal as roughly 25 times faster at one-tenth the cost. Fractile has disclosed simulation and small-silicon data rather than full benchmarks against deployed GPU clusters, and the technology is still in development. [7][8]
Fractile came out of stealth in July 2024 with a $15 million (about £12 million) seed round, co-led by Kindred Capital, the NATO Innovation Fund, and Oxford Science Enterprises, with participation from Cocoa and Inovia Capital. Its angel backers read like a roster of British chip history: Hermann Hauser, co-founder of Acorn Computers and the lineage that produced Arm; Stan Boland, a veteran of Acorn, Icera, NVIDIA, and the self-driving company Five AI; and Amar Shah, co-founder of Wayve. [8][9]
The much larger round came in May 2026. Most sources, including Verdict and TechMonitor, describe it as a $220 million Series B announced on 13 May 2026, led by Accel, Factorial Funds, and Founders Fund, with participation from Conviction, Gigascale, O1A, Felicis, Buckley Ventures, 8VC, and existing investors. Several outlets reported a post-money valuation of roughly $1 billion, though Fractile did not officially confirm a figure. Former Intel chief executive Pat Gelsinger joined as an angel investor and operating adviser. A handful of brief reports referred only to a "$220m round" without a series label, and earlier in 2026 the company had been reported to be raising around $200 million, so the final size came in above the initial target. [2][3][10][11]
| Round | Date | Amount | Lead investors |
|---|---|---|---|
| Seed | July 2024 | $15M (about £12M) | Kindred Capital, NATO Innovation Fund, Oxford Science Enterprises |
| Series B | May 2026 | $220M | Accel, Factorial Funds, Founders Fund |
Between the two equity rounds Fractile also announced, in February 2026, a £100 million (about $135 million) plan to expand its UK operations over three years. That money is going into its existing London and Bristol sites and a new hardware engineering centre in Bristol, where engineers will turn the chips into full AI systems and run a lab for testing software against future compute. At the time of that announcement the company employed about 80 people. The UK government promoted the expansion as evidence that homegrown firms could reduce Britain's dependence on foreign chipmakers. [12][13]
Fractile says it works across several layers of the stack at once: AI research, chip microarchitecture, and foundry process innovation. Its first commercial chip is not expected to be available until 2027, and management has said the $220 million is sized to carry the design through tape-out, building out the software stack, and early customer integration rather than a full production ramp. Alongside London and Bristol, the company has built out offices in San Francisco and Taipei, the latter putting it close to the foundry and packaging supply chain in Taiwan. Its engineering team includes people drawn from Graphcore, NVIDIA, and Imagination Technologies. [3][5][6]
A notable signal of demand has arrived before the chip even exists. In May 2026, reporting indicated that Anthropic was in early discussions to buy Fractile's accelerators once they ship. If that relationship formalises, Fractile would become Anthropic's fourth named compute supplier, alongside NVIDIA (GPUs), Google (TPUs), and Amazon (Trainium and Inferentia). The interest comes against a backdrop of high DRAM prices and supply constraints, which makes a DRAM-light architecture more attractive on cost grounds. [6][14]
Fractile is entering a crowded field of inference-focused challengers, all of them trying to beat NVIDIA at the specific job of generating tokens cheaply rather than at training. Groq and Cerebras pioneered SRAM-heavy designs that keep weights on-chip, and both have leaned on that approach for low-latency inference. The catch, which the broader industry has run into, is that frontier models eventually grow too large to fit in on-chip SRAM, so companies like Groq deploy many chips in parallel and others add tiers of external memory. d-Matrix, for instance, uses a second tier of LPDDR to reach hundreds of gigabytes per card. [15][16]
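The capacity problem described above reduces to a ceiling division. The sketch below counts how many chips are needed just to hold a model's weights in on-chip SRAM. The model size, weight precision, and per-chip SRAM capacity are hypothetical values chosen for illustration, not specifications of any vendor's part, and the estimate ignores activations, KV cache, and interconnect overhead.

```python
import math


def chips_to_hold_weights(num_params, bytes_per_param, sram_bytes_per_chip):
    """Minimum number of chips whose combined SRAM can store all weights."""
    model_bytes = num_params * bytes_per_param
    return math.ceil(model_bytes / sram_bytes_per_chip)


# Hypothetical example: 70 billion parameters at 8 bits (1 byte) each,
# spread over chips with 256 MB of on-chip SRAM apiece.
chips = chips_to_hold_weights(
    num_params=70_000_000_000,
    bytes_per_param=1,
    sram_bytes_per_chip=256 * 10**6,
)
print(f"Chips needed for weights alone: {chips}")
```

Under these assumptions the weights alone occupy 274 chips, which is why SRAM-only designs either scale out across many devices or add a slower external memory tier.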
Fractile's bet is that fusing compute and memory more tightly than these rivals, while staying off HBM, gives it a structural advantage on the energy and latency tax of inference. Whether that holds up depends on questions it has not yet answered in public: how it fits large models across many chips, how mature its compiler and software stack are, and whether the simulated numbers survive contact with real silicon and real production models. The 2027 timeline means NVIDIA, Groq, Cerebras, SambaNova, and transformer-specific entrants like Etched will all have moved on to newer hardware by the time Fractile's first chip is in customers' hands. For now it is a well-funded, technically distinctive moonshot with a credible founding team and a marquee potential customer, but no shipping product. [11][15][16]