# Untether AI

> Source: https://aiwiki.ai/wiki/untether_ai
> Updated: 2026-06-07
> Categories: AI Companies, AI Hardware
> From AI Wiki (https://aiwiki.ai), a free encyclopedia of artificial intelligence. Quote with attribution.

**Untether AI Corp.** was a Toronto-based fabless semiconductor startup that designed energy-efficient inference accelerators based on a proprietary "at-memory compute" architecture, in which large arrays of small processing elements are placed directly inside on-chip SRAM banks to minimize the energy cost of moving weights between memory and arithmetic units.[^1][^2] Founded in 2018 by Martin Snelgrove, Darrick Wiebe, and Raymond Chik with seed funding led by Intel Capital, the company built two generations of inference silicon: the runAI200 (TSMC 16 nm, INT8, shipping inside the tsunAImi tsn200 PCIe card) and the speedAI240 (TSMC 7 nm, FP8, codenamed "Boqueria"), the latter delivering roughly 2 PetaFLOPS of FP8 throughput and approximately 30 TFLOPS per watt at sample stage.[^3][^4] Untether AI raised approximately US$152 million in cumulative equity funding from investors including Intel Capital, GM Ventures, Tracker Capital Management, Canada Pension Plan Investment Board (CPPIB), and Radical Ventures.[^5][^6] In June 2025 the company shut down its hardware business: [Nvidia](/wiki/nvidia)'s competitor [AMD](/wiki/amd_instinct_mi300x) hired the engineering team and licensed selected intellectual property in an acquihire-style strategic transaction, while the speedAI processors and the imAIgine software development kit (SDK) were discontinued and Untether's separate corporate entity wound down sales and support.[^7][^8][^9] The shutdown was widely covered as one of the more visible failures of the 2018-2022 generation of AI-inference silicon startups, and the deal followed a pattern AMD had used about a week earlier with Brium and Enosemi.[^8][^10]

## Infobox

| Field | Value |
| --- | --- |
| Company | Untether AI Corp. |
| Headquarters | Toronto, Ontario, Canada |
| Founded | 2018 |
| Founders | Martin Snelgrove (CTO), Darrick Wiebe, Raymond Chik |
| CEOs | Arun Iyengar (2019-2024); Chris Walker (Jan 2024-May 2025) |
| Total funding | approximately US$152 million across five rounds[^5][^6] |
| Key investors | Intel Capital, GM Ventures, Tracker Capital Management, CPP Investments, Radical Ventures[^5][^6] |
| Generation 1 chip | runAI200 (TSMC 16 nm, 502 INT8 TOPS, 200+ MB SRAM)[^3][^11] |
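| Generation 1 card | tsunAImi tsn200 (single runAI200, 500 INT8 TOPS at approximately 40 W); the four-chip tsunAImi concept card disclosed in 2021 targeted 2 PetaOPS INT8[^11][^12] |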
| Generation 2 chip | speedAI240 / "Boqueria" (TSMC 7 nm, 2 PFLOPS FP8, 238 MB SRAM, 1,458 RISC-V cores)[^2][^4] |
| Generation 2 card | speedAI240 Slim (October 2024 broad availability)[^13] |
| Software | imAIgine SDK (compiler + simulator + runtime)[^14][^15] |
| Shutdown | June 5, 2025: engineering team acquihired by [AMD](/wiki/amd_instinct_mi300x); products discontinued[^7][^8][^9] |

## Background and founding

Untether AI was founded in early 2018 in Toronto by Martin Snelgrove, a long-time analog and digital circuits researcher and former professor at Carleton University and the University of Toronto; Darrick Wiebe, a software architect; and Raymond Chik, an analog and mixed-signal circuit designer.[^16][^17] The trio aimed to build a domain-specific processor for neural network inference whose key constraint was energy spent on data movement rather than peak arithmetic throughput. Snelgrove later articulated the motivation publicly at Hot Chips 2022: in a typical [graphics processing unit](/wiki/gpu) or general-purpose accelerator, the dominant share of energy in convolutional and transformer inference workloads is consumed not in multiplying weights and activations but in moving those weights from off-chip [datacenter](/wiki/ai_datacenter)-scale DRAM into on-chip caches and registers, with later analyses citing roughly 90% of total energy spent on data movement.[^4][^16]

The company emerged from stealth in April 2019 with an announcement that it had raised approximately C$13 million in seed financing led by Intel Capital, with participation from GM Ventures, Radical Ventures and angel investors.[^17][^18] In a separate add-on later in 2019, Radical Ventures and Intel Capital added a further US$7 million, lifting Untether's Series A to approximately US$20 million; the Series A round was officially announced on November 5, 2019, together with the appointment of Arun Iyengar (a former Altera, AMD, and Xilinx executive) as Untether's first CEO, with Snelgrove transitioning into the CTO role.[^16][^18][^19] At the time of the Series A announcement Untether described itself as having built a working prototype of its at-memory architecture in under five months following an initial seed investment.[^16][^18] The early board added Tomi Poutanen of Radical Ventures.[^16][^18]

Untether AI's next major funding milestone arrived on July 20, 2021, when the company announced an oversubscribed US$125 million Series B led by Tracker Capital Management and Intel Capital, with new investor CPP Investments and prior investor Radical Ventures participating. Untether had originally targeted US$100 million, so the final round was 25% oversubscribed. Dr. Shaygan Kheradpir of Tracker Capital joined the board as part of the round.[^5][^20] The Series B brought cumulative announced funding into the same range as several other independent inference-silicon startups of the period, including [Tenstorrent](/wiki/tenstorrent), [SambaNova Systems](/wiki/sambanova), and [Groq](/wiki/groq_hardware).

In September 2023, Untether announced that former Intel corporate vice president and general manager Chris Walker had joined as president; he had previously led Intel's Mobile Client Platforms organization through Project Athena and the Intel Evo platform.[^21][^22] In January 2024, the company elevated Walker to CEO, while Arun Iyengar transitioned out of the role after roughly four years. Dr. Amir Salek, a former head of silicon for Google Technical Infrastructure and Google Cloud, joined the board.[^21][^22] EE Times described the leadership turnover as occurring at "a transition point" in the company's growth from chip development to commercial deployment.[^22]

## The at-memory compute architecture

Untether AI's central architectural claim was that placing weight storage adjacent to compute elements inside SRAM cells (rather than streaming weights from external DRAM or from large shared on-chip caches) could reduce the energy cost of inference by an order of magnitude versus conventional [graphics processing units](/wiki/gpu) and [GPU-style accelerators](/wiki/gpu_computing).[^2][^16] The company labelled this approach "at-memory compute" to distinguish it from "near-memory" designs (which keep memory physically close to compute but still move operands), and from "in-memory compute" approaches that perform analog multiply-accumulate operations inside memory arrays themselves, an approach pursued by competitors such as Mythic and EnCharge AI.[^2][^4] By contrast, Untether stored weights in standard digital SRAM cells and placed digital arithmetic units immediately beside each bank of SRAM so that weights never needed to traverse a long bus to reach the multipliers.[^2][^4]

### Memory bank as fundamental unit

The core building block of Untether's architecture is a "memory bank." In the first-generation runAI200 design, each bank contained approximately 385 kilobytes of SRAM organized as a two-dimensional array of 512 small processing elements (PEs), each holding its own portion of weights and operating in a single-instruction multiple-data (SIMD) fashion under the control of a per-bank RISC-V processor. Each runAI200 chip integrated 511 such banks, yielding roughly 200 megabytes of total on-die SRAM and approximately 261,000 processing elements per chip.[^2][^11][^23] The processing elements were optimized for the low-precision integer (INT8) multiply-accumulate operations characteristic of [neural network](/wiki/neural_network) inference, and the per-bank RISC-V core handled control flow, layer fusion, and inter-bank movement.

For the second-generation speedAI240 "Boqueria" chip presented at Hot Chips 2022 on a TSMC 7 nm process, Untether retained the bank concept but doubled the per-bank RISC-V control density to two cores per bank, both clocked at approximately 1.35 GHz, yielding 1,458 RISC-V cores in total across the die. The chip carried 238 megabytes of SRAM at roughly 1 petabyte per second of aggregate on-chip memory bandwidth, four 1 MB scratchpads, and two 64-bit-wide LPDDR5 ports for up to 32 gigabytes of external DRAM. The die measured approximately 35 mm by 35 mm and was hosted on a PCIe Gen5 interface for connectivity to a CPU host and for chip-to-chip communication.[^2][^4][^24]
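The per-bank figures quoted in this section can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function name `chip_totals` is hypothetical, decimal units (1 MB = 1000 KB) are an assumption, and the speedAI240 bank count and per-bank SRAM are derived from the quoted totals rather than taken from the cited sources.

```python
# Back-of-envelope totals from the per-bank figures quoted in this section.
# Assumption: decimal units (1 MB = 1000 KB); the sources do not specify.

def chip_totals(banks: int, pes_per_bank: int, sram_kb_per_bank: float) -> dict:
    """Return total processing elements and total SRAM (in MB) for one chip."""
    return {
        "processing_elements": banks * pes_per_bank,
        "sram_mb": banks * sram_kb_per_bank / 1000,
    }

# runAI200: 511 banks, 512 PEs per bank, about 385 KB of SRAM per bank.
runai200 = chip_totals(banks=511, pes_per_bank=512, sram_kb_per_bank=385)

# speedAI240: 1,458 RISC-V cores at two cores per bank implies the bank count.
# This is a derived value, not a figure stated in the text.
speedai240_banks = 1458 // 2
speedai240_kb_per_bank = 238 * 1000 / speedai240_banks

print(runai200)
print(speedai240_banks, round(speedai240_kb_per_bank, 1))
```

Under these assumptions the runAI200 totals come out at 261,632 processing elements and about 197 MB of SRAM, which matches the "approximately 261,000" and "roughly 200 megabytes" figures in the text.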

### Numerics and FP8 formats

While runAI200 targeted INT8 inference, the speedAI240 generation added native support for low-precision floating-point types. Untether described two FP8 variants: FP8p ("precision"), with four mantissa bits and three exponent bits, tuned for accuracy, and FP8r ("range"), with three mantissa bits and four exponent bits, tuned for the dynamic range required by attention layers in [transformer](/wiki/transformer) networks. Untether reported that across a representative inference workload mix, FP8 mode lost less than one-tenth of one percent of accuracy versus BF16, while delivering roughly a four-fold improvement in throughput and energy efficiency over BF16.[^4][^24] The chip also retained BF16 capability for layers that required higher dynamic range, with reported peak performance of 2 PetaFLOPS FP8 and 1 PetaFLOPS BF16.[^4][^24]
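The precision-versus-range trade-off between the two formats can be illustrated numerically. The sketch below assumes an IEEE-754-style encoding (one sign bit, bias of 2^(e-1) - 1, subnormals, and the all-ones exponent code reserved); Untether's actual FP8p and FP8r encodings may differ, so the printed values are indicative rather than authoritative. The `MiniFloat` class is a hypothetical helper, not part of any Untether software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MiniFloat:
    """A small floating-point format: 1 sign bit plus exponent and mantissa bits."""
    name: str
    exponent_bits: int
    mantissa_bits: int

    @property
    def bias(self) -> int:
        # Assumption: IEEE-754-style exponent bias.
        return 2 ** (self.exponent_bits - 1) - 1

    @property
    def max_normal(self) -> float:
        # Assumption: the all-ones exponent code is reserved (as in IEEE-754).
        top_exponent = (2 ** self.exponent_bits - 2) - self.bias
        return (2 - 2.0 ** -self.mantissa_bits) * 2.0 ** top_exponent

    @property
    def min_subnormal(self) -> float:
        return 2.0 ** (1 - self.bias - self.mantissa_bits)

    @property
    def relative_step(self) -> float:
        # Spacing between adjacent normal values, relative to the value.
        return 2.0 ** -self.mantissa_bits


fp8p = MiniFloat("FP8p", exponent_bits=3, mantissa_bits=4)
fp8r = MiniFloat("FP8r", exponent_bits=4, mantissa_bits=3)

for fmt in (fp8p, fp8r):
    print(fmt.name, fmt.max_normal, fmt.min_subnormal, fmt.relative_step)
```

Under these assumptions the extra mantissa bit halves FP8p's relative step size, while the extra exponent bit gives FP8r a much wider span between its smallest and largest representable magnitudes.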

### Batch-1 inference and PE density

A distinguishing feature of Untether's design relative to GPU-style accelerators was strong performance at batch size 1, the regime characteristic of latency-sensitive inference such as real-time video analytics, autonomous driving perception stacks, and chatbot-style transformer serving.[^16][^25] Because each PE held its own slice of weights and operated independently, a batch-1 workload could keep the array well-utilized without the wide batching needed to amortize weight-fetch costs on an [Nvidia](/wiki/nvidia) [A100](/wiki/nvidia_a100) or [H100](/wiki/nvidia_h100). Untether published BERT-base inference performance figures of approximately 750 queries per second per watt for the speedAI240, which it characterized as roughly fifteen times the energy efficiency of a then-current state-of-the-art GPU on the same workload.[^2][^4]
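A simplified model shows why batching matters when weights must be fetched from memory. For a dense layer, each weight contributes one multiply and one add per batch item, so the work done per weight byte fetched grows linearly with batch size. The function below is an illustrative sketch with a hypothetical name; it ignores activations, caching, and any real device's memory hierarchy.

```python
def ops_per_weight_byte(batch_size: int, bytes_per_weight: int = 1) -> float:
    """Operations performed per weight byte fetched for a dense layer.

    Each weight takes part in one multiply and one add per batch item,
    so a single fetch is reused batch_size times.
    """
    return 2 * batch_size / bytes_per_weight


for batch in (1, 8, 64):
    print(batch, ops_per_weight_byte(batch))

# GPU baseline implied by the figures in the text: 750 queries/s/W at a
# claimed 15x advantage. Derived here, not quoted from the sources.
implied_gpu_qps_per_watt = 750 / 15
print(implied_gpu_qps_per_watt)
```

At batch size 1 with INT8 weights, this model gives only two operations per weight byte fetched, which is the regime where keeping weights resident next to the arithmetic units avoids the fetch cost altogether.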

### Comparison with adjacent architectures

Untether's at-memory approach occupied a distinct point in the design space versus several other inference-silicon startups of the same era. [Cerebras Systems](/wiki/cerebras) pursued wafer-scale integration with the [WSE-3](/wiki/cerebras_wse_3), placing 900,000 cores and 44 GB of SRAM on a single wafer-sized die for training and inference of very large models. [Groq](/wiki/groq_hardware) built a deterministic, software-scheduled tensor streaming processor (the [Groq LPU](/wiki/groq_lpu)) targeted at low-latency LLM serving with on-chip SRAM rather than HBM. [Tenstorrent](/wiki/tenstorrent) adopted an array of RISC-V-controlled tensor cores connected via a flexible network on chip, with separate Wormhole and Blackhole generations targeting both training and inference. Mythic pursued analog in-memory compute using flash cells; EnCharge AI pursued capacitor-based analog matrix multiplication in SRAM. Untether's at-memory architecture was entirely digital, which Snelgrove argued offered superior numerical predictability and process portability versus the analog approaches, at some cost in theoretical density.[^2][^4][^16]

## Hardware product line

### First generation: runAI200 chip and tsunAImi tsn200 card

The runAI200 was Untether's first commercial silicon, fabricated on a TSMC 16-nanometer process. Each chip delivered up to 502 INT8 tera-operations per second and approximately 8 TOPS per watt of typical operating efficiency. The architecture employed 511 banks of approximately 385 KB SRAM each, integrating into roughly 200 megabytes of on-die SRAM.[^3][^11][^23]

In March 2021 Untether disclosed the tsunAImi accelerator card concept publicly: a single PCIe card carrying four runAI200 chips, delivering approximately 2 PetaOPS of aggregate INT8 inference throughput.[^11][^23] The shipping version of the card, branded "tsunAImi tsn200," was announced on September 12, 2023 in a low-profile PCIe form factor: a single-chip card targeting 500 INT8 TOPS at approximately 40 watts for real-time video analytics and edge inference workloads. Untether emphasized its compute density per watt and its small physical footprint, suitable for cost-constrained edge deployments rather than only datacenter racks.[^12][^25]

### Second generation: speedAI240 / "Boqueria"

The speedAI240 chip (internal codename "Boqueria") was unveiled at Hot Chips 34 in August 2022 as Untether's second-generation at-memory compute device.[^4][^24] Fabricated on TSMC 7 nm, the chip targeted approximately 2 PetaFLOPS of FP8 peak performance and approximately 30 TFLOPS per watt, with 238 MB of SRAM, 1,458 RISC-V cores, and 32 GB of LPDDR5 external memory.[^2][^4]
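Dividing quoted peak throughput by quoted efficiency gives a rough sense of the power envelope each chip was designed around. This is a sketch only: the helper name is hypothetical, and the peak and per-watt figures may not have been measured at the same operating point, so the results are indicative rather than specifications.

```python
def implied_power_watts(peak_tera_ops: float, tera_ops_per_watt: float) -> float:
    """Power implied by dividing peak throughput by efficiency."""
    return peak_tera_ops / tera_ops_per_watt


# runAI200: up to 502 INT8 TOPS at approximately 8 TOPS/W.
runai200_watts = implied_power_watts(502, 8)

# speedAI240: approximately 2 PFLOPS (2,000 TFLOPS) FP8 at about 30 TFLOPS/W.
speedai240_watts = implied_power_watts(2000, 30)

print(round(runai200_watts, 1), round(speedai240_watts, 1))
```

Both generations land in the tens of watts under this estimate, consistent with the article's description of PCIe accelerator cards rather than multi-hundred-watt datacenter modules.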

Untether announced that sampling of speedAI240 devices to early-access customers would begin in the first half of 2023.[^4][^24] In June 2023 the IEEE Solid-State Circuits Society's Journal published a peer-reviewed paper, "speedAI240: A 2-Petaflop, 30-Teraflops/W At-Memory Inference Acceleration Device With 1456 RISC-V Cores," describing the second-generation chip.[^26] (The published core counts differ by two: Untether's press materials cite 1,458 RISC-V cores, while the title of this paper cites 1,456.)

The speedAI family was packaged as PCIe accelerator cards in two SKUs that Untether began branding for distinct power and performance envelopes: a "Preview" card running the speedAI240 chip at higher power for datacenter deployments, and the "speedAI240 Slim" card targeting a constrained power envelope (typically described as 75-watt-class PCIe form factor) for [edge AI](/wiki/edge_ai) and embedded server deployments.[^13][^27] Untether announced broad availability of the speedAI240 Slim in October 2024.[^13]

### Software: imAIgine SDK

The imAIgine SDK was Untether's developer-facing software stack. It accepted neural network models exported from [PyTorch](/wiki/pytorch) or [TensorFlow](/wiki/tensorflow) (and, through external paths, [ONNX](/wiki/onnx)), and performed a sequence of compiler passes including model [quantization](/wiki/quantization), graph lowering, layer fusion, kernel mapping, and physical allocation across the chip's banks. It included a cycle-accurate simulator, a kernel-level compiler, a code profiler, and a runtime API. Untether emphasized multi-chip partitioning so that networks too large for a single chip's 200-to-238 MB SRAM could be split across multiple speedAI or runAI devices on a card.[^14][^15]
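The multi-chip partitioning idea can be sketched with a simple greedy pass that assigns consecutive layers to a chip until its SRAM budget is exhausted. This is not the imAIgine SDK's algorithm or API, which the cited sources do not describe at this level; the function name, the layer sizes, and the greedy strategy are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def partition_layers(layer_mb: Sequence[float], chip_sram_mb: float) -> List[List[int]]:
    """Greedily assign consecutive layers to chips without exceeding SRAM.

    Returns one list of layer indices per chip, in execution order.
    """
    chips: List[List[int]] = []
    current: List[int] = []
    used = 0.0
    for index, size in enumerate(layer_mb):
        if size > chip_sram_mb:
            raise ValueError(f"layer {index} ({size} MB) exceeds one chip's SRAM")
        # Start a new chip when the next layer would overflow the current one.
        if current and used + size > chip_sram_mb:
            chips.append(current)
            current, used = [], 0.0
        current.append(index)
        used += size
    if current:
        chips.append(current)
    return chips


# Hypothetical per-layer weight footprints in MB (not from any real model).
layers = [60, 90, 80, 120, 40, 100]
plan = partition_layers(layers, chip_sram_mb=238)
print(plan)
```

A production compiler would also weigh inter-chip traffic and compute balance; this sketch only respects the capacity constraint and keeps layers in order.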

In March 2025 the company announced what it called a "generative compiler" feature inside the imAIgine SDK, claiming roughly four-fold expansion in the number of supported neural network models. The generative compiler automatically synthesized new low-level kernels for layer types that did not match existing hand-written kernels, reducing developer time-to-deployment from days to minutes for new networks.[^28][^15] (Following the AMD transaction in June 2025, Untether discontinued support for the imAIgine SDK; the IP was reported to be among the assets transferred to AMD in the strategic agreement.[^7][^8])

## MLPerf benchmark results

In August 2024 Untether AI submitted results to the MLPerf Inference v4.1 benchmark suite, the principal industry-standard third-party benchmark for inference systems coordinated by MLCommons. The MLCommons release on August 28, 2024 included 964 verified performance results from 22 submitting organizations including [Nvidia](/wiki/nvidia), [AMD](/wiki/amd_instinct_mi300x), Intel, Google Cloud, Cisco Systems, and Dell Technologies. Untether submitted speedAI240 Slim results on the ResNet-50 v1.5 image classification workload in both the Datacenter and Edge categories; in the Edge category Untether reported the highest performance per accelerator, lowest latency, and best energy efficiency among submitted single-accelerator results on that workload.[^29][^13][^30] Untether subsequently used MLPerf results as one of its primary marketing artifacts in its 2024-2025 commercial push toward [edge AI](/wiki/edge_ai) customers.[^13]

Untether did not submit results on the generative AI workloads in MLPerf Inference v4.1 (Llama-2 70B, Mixtral 8x7B, Stable Diffusion XL), reflecting both the speedAI240's relatively limited per-chip memory (238 MB SRAM plus 32 GB LPDDR5) and its lack of high-bandwidth memory or high-bandwidth chip-to-chip interconnect that would be needed to host large language model weights spanning multiple devices.[^29][^31]

## Partnerships and target markets

Untether's commercial strategy after the Series B emphasized [edge AI](/wiki/edge_ai) and embedded inference rather than direct competition with [Nvidia](/wiki/nvidia)'s datacenter [GPUs](/wiki/gpu) for [LLM](/wiki/llm) training and serving. The two most public partnerships were with General Motors and Arm Holdings.

### General Motors collaboration

On April 28, 2022, Untether and General Motors jointly announced a collaboration to develop next-generation perception systems for [autonomous vehicles](/wiki/autonomous_vehicle) based on Untether's at-memory computation. The project was supported by approximately C$1 million from the Ontario Vehicle Innovation Network (OVIN) R&D Partnership Fund, with an additional C$2.09 million of in-kind industry contribution. GM Ventures had been a strategic investor since 2018, and under the collaboration GM agreed to share neural network models from its existing AV perception stack so Untether could demonstrate equivalent or better inference performance at lower power.[^32][^33]

### Arm collaboration

In 2023, Untether announced a separate collaboration with [Arm Holdings](/wiki/arm_holdings) aimed at integrating Untether's accelerators with Arm-based automotive system-on-chip platforms for advanced driver assistance systems (ADAS) and smart vehicle workloads. The collaboration positioned the speedAI family as a candidate AI accelerator for automotive customers building Arm-based ECUs, although no production design wins from the partnership were publicly disclosed before the company's wind-down.[^34]

### Target verticals

Beyond automotive, Untether marketed the runAI200 / tsunAImi family to real-time video analytics customers (smart city camera deployments, retail analytics, industrial machine vision), and marketed the speedAI240 family to a mix of datacenter inference and embedded server applications including financial services, telecommunications edge, and defense. The "Tracker Capital" investment relationship was widely interpreted in the trade press as opening defense-adjacent customers, given Tracker Capital's stated focus on national security technology, but no specific defense customers were publicly disclosed.[^5][^20]

## Funding history

| Round | Date | Amount | Lead investor(s) | Other participants |
| --- | --- | --- | --- | --- |
| Seed | April 2019 | approx C$13 million | Intel Capital | GM Ventures, Radical Ventures, angels[^17] |
| Series A | Nov 5, 2019 | US$20 million (cumulative) | Intel Capital, Radical Ventures | Existing investors[^18][^19] |
| Series A add-on | 2020 | C$9.4 million | Radical Ventures, Intel Capital | (extension)[^35] |
| Series B | July 20, 2021 | US$125 million (oversubscribed) | Tracker Capital Management, Intel Capital | CPP Investments (new), Radical Ventures[^5][^20] |
| Cumulative | Through 2024 | approximately US$152 million | | Five rounds, four lead/strategic investors[^6] |

By the time the company entered the AMD transaction in June 2025, no further public funding round had been announced after the 2021 Series B, and trade-press reporting cited the company's inability to raise additional funding earlier in 2025 as a proximate cause of the wind-down.[^7][^8]

## June 2025 wind-down and AMD transaction

On June 5, 2025, AMD and Untether AI separately announced a "strategic agreement" under which AMD acquired Untether's engineering team and selected intellectual property, but did not acquire Untether's corporate entity, its product lines, or its commercial obligations. Untether simultaneously announced that it would discontinue sales and support of the speedAI processor family and the imAIgine SDK.[^7][^8][^9]

Reporting by Reuters, EE Times, SiliconANGLE, HPCwire, BetaKit, Tom's Hardware, and Data Center Dynamics consistently characterized the transaction as an acquihire of "an unknown number" of Untether's roughly 145 employees (per LinkedIn at the time of the announcement), focused on AI compiler and kernel engineers, RISC-V-class SoC designers, design-verification engineers, and product-integration staff. AMD's press statement said the transaction "brings a world-class team of engineers to AMD, focused on advancing the company's AI compiler and kernel development capabilities as well as enhancing our digital and SoC design, design verification, and product integration capabilities."[^7][^8][^9][^10][^36]

Several specific details about the deal that emerged from contemporaneous reporting:

- **Outgoing CEO not transferred.** Chris Walker, who had been CEO since January 2024, departed Untether in May 2025 and did not join AMD as part of the transaction.[^8][^9]
- **No product continuity.** AMD explicitly did not commit to continue selling or supporting Untether's speedAI processors or the imAIgine SDK, and Untether announced it would wind down customer support.[^7][^8]
- **Financial terms undisclosed.** Neither AMD nor Untether disclosed the financial value of the deal; a source cited by SiliconANGLE estimated it at under US$100 million, with the final figure dependent on how many Untether employees ultimately joined AMD.[^8]
- **AMD's third 2025 acquisition in eight days.** The Untether transaction came eight days after AMD announced acquisitions of Brium (an AI compiler optimization company) and Enosemi (a co-packaged optics company), part of AMD's push to deepen its AI software and silicon capabilities.[^8][^10]

The Untether shutdown drew commentary from industry analysts characterizing it as one of the first high-profile failures in the 2018-2022 cohort of inference-silicon startups: companies that had launched with substantial venture capital and credible technical teams but had been overtaken by Nvidia's pace on the [transformer](/wiki/transformer) generation and by the abrupt shift in customer demand toward [large language model](/wiki/llm) training and serving rather than the convolutional vision workloads many startups had originally targeted.[^31][^7]

## Limitations and reasons cited for the company's failure

Several limitations of Untether's architecture and commercial strategy were noted both in contemporaneous reviews and in retrospective analyses after the June 2025 shutdown:

### Architectural limitations for LLM workloads

Industry analysts and at least one technical blog post noted that the speedAI240's memory architecture, while ideal for vision inference and batch-1 [BERT](/wiki/bert)-class transformer inference, was poorly suited for [large language model](/wiki/llm) training and serving. Specifically, the chip had approximately 238 MB of on-chip SRAM and 32 GB of LPDDR5 external memory but lacked HBM, had constrained chip-to-chip bandwidth versus contemporary [GPUs](/wiki/gpu) with NVLink-class interconnects, and offered only PCIe Gen5 connectivity for multi-chip scale-out.[^31] Hosting a 70-billion-parameter LLM at typical 8-bit precision required roughly 70 GB of weight memory plus KV-cache, well beyond what a single speedAI240 card could support, and the lack of high-bandwidth chip-to-chip links made the multi-card scaling required for serving large transformers slow relative to NVLink or Infinity Fabric-class fabrics on [Nvidia H100](/wiki/nvidia_h100) and [AMD MI300X](/wiki/amd_instinct_mi300x) systems.[^31]
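The capacity gap described above follows from straightforward arithmetic. The sketch below counts weight storage only; KV-cache, activations, and runtime overhead are deliberately excluded, so real requirements would be higher. The function names are hypothetical and decimal gigabytes are assumed.

```python
import math


def weight_gigabytes(parameters: float, bits_per_weight: int) -> float:
    """Weight storage in decimal gigabytes, excluding KV-cache and activations."""
    return parameters * bits_per_weight / 8 / 1e9


def devices_needed(weight_gb: float, capacity_gb: float) -> int:
    """Minimum device count by capacity alone."""
    return math.ceil(weight_gb / capacity_gb)


llm_gb = weight_gigabytes(70e9, bits_per_weight=8)

# One speedAI240 offers up to 32 GB of LPDDR5 and 238 MB (0.238 GB) of SRAM.
by_lpddr5 = devices_needed(llm_gb, 32)
by_sram = devices_needed(llm_gb, 0.238)

print(llm_gb, by_lpddr5, by_sram)
```

Even counting the slower external LPDDR5, a 70-billion-parameter model at 8-bit precision needs at least three devices by capacity alone, and keeping every weight in on-chip SRAM would take hundreds of chips, which is why interconnect bandwidth dominates the scaling question.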

### Market timing

Untether had been founded in 2018, several years before the November 2022 release of ChatGPT and the subsequent industry-wide shift in inference-silicon demand toward LLM serving. The company's original product hypothesis (vision inference and edge AI for autonomous vehicles, smart cities, and industrial automation) was articulated when the prevailing inference workloads were [ResNet](/wiki/resnet)-class CNNs and [BERT](/wiki/bert)-class encoders rather than autoregressive decoder transformers. As a retrospective analysis published after the shutdown observed, the company's roadmap and its memory architecture were "frozen" in a pre-LLM era at exactly the moment customer demand pivoted.[^31]

### Cost of capital and capital intensity

Building two generations of leading-edge silicon (TSMC 16 nm and TSMC 7 nm) plus a custom compiler stack consumed Untether's roughly US$152 million in funding without a path to substantial commercial revenue by 2024-2025. Trade-press accounts reported that Untether sought additional funding in early 2025 and was unable to close a round under the rapidly tightening AI inference startup environment, in which Nvidia's market share and AMD's MI300X ramp had compressed valuations and customer interest in unproven silicon alternatives.[^7][^8]

### Customer concentration risk

Untether's most public customer-facing relationships (General Motors for automotive perception, Arm for ADAS reference platforms) were development collaborations rather than committed production design wins, with no public disclosure of bill-of-materials inclusion or production volumes. Without an anchor production customer to absorb silicon at volume, the company faced a difficulty common to inference-silicon startups in the 2018-2022 cohort.[^31][^9]

## Competitive context

Untether operated in an unusually crowded competitive landscape during 2018-2025. Its closest peers, by architecture or customer overlap, included:

| Competitor | Approach | Status (mid-2025) |
| --- | --- | --- |
| [Cerebras Systems](/wiki/cerebras) | Wafer-scale digital ([WSE-3](/wiki/cerebras_wse_3)) | Independent; pivoted to inference cloud and IPO track |
| [Groq](/wiki/groq_hardware) | Software-scheduled deterministic tensor streaming ([LPU](/wiki/groq_lpu)) | Independent; LLM inference cloud focus |
| [Tenstorrent](/wiki/tenstorrent) | RISC-V-controlled tensor cores (Wormhole/[Blackhole](/wiki/blackhole_tenstorrent)) | Independent; raised significant funding 2024-2025 |
| [SambaNova Systems](/wiki/sambanova) | Reconfigurable dataflow units (RDU; [SN40L](/wiki/sambanova_sn40l)) | Independent; enterprise model serving |
| Mythic | Analog in-memory compute (flash) | Multiple restructurings; reduced footprint by 2025 |
| EnCharge AI | Analog in-memory compute (SRAM, capacitor-based) | Independent; defense and edge focus |
| Untether AI | Digital at-memory compute (SRAM-adjacent PEs) | Hardware business wound down June 2025; team to [AMD](/wiki/amd_instinct_mi300x) |

In retrospective coverage, several analysts characterized 2024-2025 as the beginning of consolidation in the inference-silicon market, with Untether's June 2025 wind-down as one of the first prominent exits among VC-backed startup peers. AMD's strategy of acquiring engineering talent and selective IP from declining peers (Brium, Enosemi, Untether) within a single eight-day window in late May and early June 2025 was widely interpreted as a deliberate acquihire campaign to close engineering gaps in AMD's AI inference stack versus [Nvidia](/wiki/nvidia).[^8][^10]

## Significance

Despite its wind-down, Untether AI's seven-year run left several artifacts of technical significance for the broader history of AI accelerator design:

1. **Validation of digital at-memory compute at production scale.** The runAI200 and speedAI240 chips demonstrated that digital SRAM-adjacent compute can be implemented in mainstream silicon processes ([TSMC](/wiki/tsmc) 16 nm and 7 nm) at scale, achieving roughly an order-of-magnitude advantage in energy per inference operation over GPU baselines on suitable workloads.[^2][^4]
2. **Public characterization of FP8 numerics.** Untether's FP8p/FP8r split, presented at Hot Chips 2022, was among the earlier public characterizations of dual FP8 formats tuned for accuracy versus dynamic range, predating the IEEE FP8 standardization activity and complementing parallel work by [Nvidia](/wiki/nvidia) (E4M3/E5M2) and other vendors.[^4][^24]
3. **MLPerf Edge participation.** Untether's speedAI240 Slim submissions to MLPerf Inference v4.1 contributed to the third-party verifiable benchmark record for [edge AI](/wiki/edge_ai) accelerators outside the dominant GPU paradigm.[^29][^13]
4. **AMD AI engineering depth.** The transfer of Untether's compiler, kernel, and SoC design engineers to AMD was characterized by both AMD and industry analysts as materially strengthening AMD's AI software stack and SoC verification capability at exactly the moment AMD was scaling out its Instinct GPU family.[^7][^8][^10]

## Related work

- [AI accelerator](/wiki/ai_chip) (general category)
- [Edge AI](/wiki/edge_ai)
- [Graphics processing unit](/wiki/gpu) / [GPU computing](/wiki/gpu_computing)
- [MLPerf](/wiki/mlperf) benchmark
- [TSMC](/wiki/tsmc) process technology
- [Quantization](/wiki/quantization) (numerical formats for inference)
- [Transformer](/wiki/transformer) inference workloads
- [BERT](/wiki/bert) (a representative encoder benchmark)
- [Autonomous vehicle](/wiki/autonomous_vehicle) perception systems

## See also

- [Groq](/wiki/groq_hardware)
- [Groq LPU](/wiki/groq_lpu)
- [Cerebras Systems](/wiki/cerebras)
- [Cerebras WSE-3](/wiki/cerebras_wse_3)
- [Tenstorrent](/wiki/tenstorrent)
- [Wormhole (Tenstorrent)](/wiki/wormhole_tenstorrent)
- [Blackhole (Tenstorrent)](/wiki/blackhole_tenstorrent)
- [SambaNova Systems](/wiki/sambanova)
- [SambaNova SN40L](/wiki/sambanova_sn40l)
- [AMD Instinct MI300X](/wiki/amd_instinct_mi300x)
- [Nvidia](/wiki/nvidia)
- [Arm Holdings](/wiki/arm_holdings)
- [MLPerf](/wiki/mlperf)
- [Edge AI](/wiki/edge_ai)
- [AI accelerator](/wiki/ai_chip)
- [ONNX](/wiki/onnx)
- [PyTorch](/wiki/pytorch)
- [TensorFlow](/wiki/tensorflow)

## References

[^1]: Untether AI, "At-Memory Architecture", Untether AI Products / Technology, 2024. https://www.untether.ai/products/technology/. Accessed 2026-05-20.

[^2]: Patrick Kennedy, "Untether.AI Boqueria 1458 RISC-V Core AI Accelerator", ServeTheHome, 2022-08-23. https://www.servethehome.com/untether-ai-boqueria-1458-risc-v-core-ai-accelerator-hc34/. Accessed 2026-05-20.

[^3]: Nitin Dahad, "Canadian AI Startup Presents PetaOPS Card", EE Times, 2020-10-29. https://www.eetimes.com/canadian-ai-startup-presents-petaops-card/. Accessed 2026-05-20.

[^4]: Timothy Prickett Morgan, "Untether AI Pulls the Curtain Rope For Its Next-Gen Inferencing System", The Next Platform, 2022-08-23. https://www.nextplatform.com/2022/08/23/untether-ai-pulls-the-curtain-rope-for-its-next-gen-inferencing-system/. Accessed 2026-05-20.

[^5]: Untether AI, "Untether AI Announces Oversubscribed $125 Million Funding to Deploy High-Performance AI Inference Acceleration Chips", Business Wire / Untether AI press release, 2021-07-20. https://www.businesswire.com/news/home/20210720005730/en/Untether-AI-Announces-Oversubscribed-$125-Million-Funding-to-Deploy-High-Performance-AI-Inference-Acceleration-Chips. Accessed 2026-05-20.

[^6]: Crunchbase, "Untether AI Company Profile & Funding", Crunchbase, 2025. https://www.crunchbase.com/organization/untether-ai. Accessed 2026-05-20.

[^7]: Sally Ward-Foxton, "Untether AI Shuts Down, Engineering Team Joins AMD", EE Times, 2025-06-05. https://www.eetimes.com/untether-ai-shuts-down-engineering-team-joins-amd/. Accessed 2026-05-20.

[^8]: Maria Deutscher, "AMD bags team behind AI chipmaker Untether AI, its third acquisition in under two weeks", SiliconANGLE, 2025-06-05. https://siliconangle.com/2025/06/05/amd-bags-team-behind-ai-chipmaker-untether-ai-third-acquisition-two-weeks/. Accessed 2026-05-20.

[^9]: Untether AI, "Untether AI Corp. Has Entered Into a Transaction with AMD", Untether AI press release, 2025-06-05. https://www.untether.ai/untether-ai-corp-has-entered-into-a-transaction-with-amd/. Accessed 2026-05-20.

[^10]: Madison Whyte, "American chip giant AMD to acquire Untether AI team", BetaKit, 2025-06-05. https://betakit.com/american-chip-giant-amd-to-acquire-untether-ai-team/. Accessed 2026-05-20.

[^11]: Chris Mellor, "Untether AI tethers compute cores inside memory array", Blocks and Files, 2021-03-25. https://blocksandfiles.com/2021/03/25/untether-ai-tsunaimi-pcie-card/. Accessed 2026-05-20.

[^12]: Untether AI, "Untether AI Ships the tsunAImi tsn200 Accelerator Card, Delivering High Performance Inference Beyond the Datacenter", Business Wire / Untether AI press release, 2023-09-12. https://www.businesswire.com/news/home/20230912152187/en/Untether-AI-Ships-the-tsunAImi-tsn200-Accelerator-Card-Delivering-High-Performance-Inference-Beyond-the-Datacenter. Accessed 2026-05-20.

[^13]: Untether AI, "Untether AI Ships speedAI240 Slim: World's Fastest, Most Energy Efficient AI Inference Accelerator for Cloud to Edge Applications", Untether AI press release, 2024-10-28. https://www.untether.ai/untether-ai-ships-speedai240-slim-worlds-fastest-most-energy-efficient-ai-inference-accelerator-for-cloud-to-edge-applications/. Accessed 2026-05-20.

[^14]: Untether AI, "Untether AI Releases Early Access to imAIgine Software Development Kit Supporting speedAI Inference Acceleration Solutions", Business Wire, 2024-07-17. https://www.businesswire.com/news/home/20240717282026/en/Untether-AI-Releases-Early-Access-to-imAIgine-Software-Development-Kit-Supporting-speedAI-Inference-Acceleration-Solutions. Accessed 2026-05-20.

[^15]: Untether AI, "Untether AI Increases Developer Velocity and Adds High-Performance Compute Flow to the imAIgine Software Development Kit", Business Wire, 2023-01-17. https://www.businesswire.com/news/home/20230117005711/en/Untether-AI-Increases-Developer-Velocity-and-Adds-High-Performance-Compute-Flow-to-the-imAIgine-Software-Development-Kit. Accessed 2026-05-20.

[^16]: Electronic Products & Technology, "Untether AI chip aims to accelerate innovation", EPT, 2019-04. https://www.ept.ca/2019/04/untether-ai-chip-aims-to-accelerate-innovation/untether-ai-ceo-martin-snelgrove/. Accessed 2026-05-20.

[^17]: Untether AI, "Untether AI Raises $13 Million from Intel Capital and Other Investors to Accelerate AI Innovation", Untether AI press release, 2019-04. https://www.untether.ai/untether-ai-raises-13-million-from-intel-capital-and-other-investors-to-accelerate-ai-innovation/. Accessed 2026-05-20.

[^18]: Untether AI, "Untether AI Announces $20 Million Series A and Appoints Technology Veteran Arun Iyengar as CEO", Untether AI press release, 2019-11-05. https://www.untether.ai/untether-ai-announces-20-million-series-a-and-appoints-technology-veteran-arun-iyengar-as-ceo/. Accessed 2026-05-20.

[^19]: Intel Capital, "AI startup Untether AI announces $20 million Series A", Intel Capital, 2019-11-05. https://www.intelcapital.com/ai-startup-untether-ai-announces-20-million-series-a/. Accessed 2026-05-20.

[^20]: Private Capital Journal, "Untether AI secures US $125M Series B led by Tracker Capital and Intel Capital", Private Capital Journal, 2021-07-20. https://privatecapitaljournal.com/untether-ai-secures-us-125m-series-b-led-by-tracker-capital-and-intel-capital/. Accessed 2026-05-20.

[^21]: HPCwire, "Untether AI Names Chris Walker as CEO, Dr. Amir Salek Joins Board During Period of Steady Growth", HPCwire, 2024-01. https://www.hpcwire.com/off-the-wire/untether-ai-names-chris-walker-as-ceo-dr-amir-salek-joins-board-during-period-of-steady-growth/. Accessed 2026-05-20.

[^22]: Sally Ward-Foxton, "Untether AI Appoints New CEO, at 'Transition Point'", EE Times, 2024-01. https://www.eetimes.com/untether-ai-appoints-new-ceo-at-transition-point/. Accessed 2026-05-20.

[^23]: Untether AI, "Untether AI Ushers in the PetaOps Era with At-Memory Computation for AI Inference Workloads", Untether AI press release, 2020-10-28. https://www.untether.ai/untether-ai-ushers-in-the-petaops-era-with-at-memory-computation-for-ai-inference-workloads/. Accessed 2026-05-20.

[^24]: Untether AI, "Untether AI Unveils Its Second-Generation At-Memory Compute Architecture at HOT CHIPS 2022", Business Wire, 2022-08-23. https://www.businesswire.com/news/home/20220823005631/en/Untether-AI-Unveils-Its-Second-Generation-At-Memory-Compute-Architecture-at-HOT-CHIPS-2022. Accessed 2026-05-20.

[^25]: EE Journal, "Untether AI Ships the tsunAImi tsn200 Accelerator Card, Delivering High Performance Inference Beyond the Datacenter", EE Journal, 2023-09-12. https://www.eejournal.com/industry_news/untether-ai-ships-the-tsunaimi-tsn200-accelerator-card-delivering-high-performance-inference-beyond-the-datacenter/. Accessed 2026-05-20.

[^26]: Robert Beachler and Martin Snelgrove, "speedAI240: A 2-Petaflop, 30-Teraflops/W At-Memory Inference Acceleration Device With 1456 RISC-V Cores", IEEE Micro, 2023. https://ieeexplore.ieee.org/document/10066167/. Accessed 2026-05-20.

[^27]: TechInsights, "Untether Boqueria Targets AI Lead", TechInsights, 2022. https://www.techinsights.com/blog/untether-boqueria-targets-ai-lead. Accessed 2026-05-20.

[^28]: Untether AI, "Untether AI Dramatically Expands AI Model Support and Speeds Developer Velocity with New Generative Compiler Technology", Business Wire, 2025-03-10. https://www.businesswire.com/news/home/20250310953294/en/Untether-AI-Dramatically-Expands-AI-Model-Support-and-Speeds-Developer-Velocity-with-New-Generative-Compiler-Technology. Accessed 2026-05-20.

[^29]: MLCommons, "New MLPerf Inference v4.1 Benchmark Results Highlight Rapid Hardware and Software Innovations in Generative AI Systems", MLCommons, 2024-08-28. https://mlcommons.org/2024/08/mlperf-inference-v4-1-results/. Accessed 2026-05-20.

[^30]: Untether AI, "Untether AI Sets New MLPerf Records with speedAI240 Accelerator Cards", Untether AI press release, 2024-08-28. https://www.untether.ai/untether-ai-sets-new-mlperf-records-with-speedai240-accelerator-cards/. Accessed 2026-05-20.

[^31]: Zach Cmiel, "Why did Untether AI fail?", zach.be blog, 2025-06. https://www.zach.be/p/why-did-untether-ai-fail. Accessed 2026-05-20.

[^32]: Untether AI, "Untether AI and General Motors to Develop Next-Generation Autonomous Vehicle Perception Systems", Business Wire, 2022-04-28. https://www.businesswire.com/news/home/20220428005201/en/Untether-AI-and-General-Motors-to-Develop-Next-Generation-Autonomous-Vehicle-Perception-Systems. Accessed 2026-05-20.

[^33]: Green Car Congress, "Untether AI and GM to develop next-generation autonomous vehicle perception systems; 'at-memory computation'", Green Car Congress, 2022-04-29. https://www.greencarcongress.com/2022/04/20220429-untether.html. Accessed 2026-05-20.

[^34]: BetaKit, "Untether AI to collaborate with semiconductor giant Arm on smart vehicle solutions", BetaKit, 2023. https://betakit.com/untether-ai-to-collaborate-with-semiconductor-giant-arm-on-smart-vehicle-solutions/. Accessed 2026-05-20.

[^35]: BetaKit, "Untether AI closes additional $9.4 million CAD from Radical Ventures, Intel Capital", BetaKit, 2020. https://betakit.com/untether-ai-closes-additional-9-4-million-cad-from-radical-ventures-intel-capital/. Accessed 2026-05-20.

[^36]: HPCwire, "Untether AI Team Acquired by AMD, Product Support Discontinued", HPCwire, 2025-06-05. https://www.hpcwire.com/off-the-wire/untether-ai-team-acquired-by-amd-product-support-discontinued/. Accessed 2026-05-20.

