Trusted Execution Environments for machine learning

Updated Jul 23, 2026

Trusted Execution Environments for machine learning (TEEs for ML, sometimes marketed as "Confidential AI" or "confidential inference") are deployments of hardware-isolated execution environments to run machine-learning training or inference workloads such that the data, model weights, and code are confidential from the cloud or hardware operator and integrity-attestable to the end user. A TEE provides cryptographically attested isolation enforced by silicon: memory encryption keeps secrets out of the hypervisor and other tenants, while remote attestation lets an external party verify that a genuine, unmodified TEE is executing a specific binary before releasing keys, data, or model weights to it.^[1]^[2] The category combines CPU TEEs (Intel SGX, Intel TDX, AMD SEV-SNP, ARM TrustZone, ARM CCA, AWS Nitro Enclaves) with newer GPU TEEs introduced on the NVIDIA Hopper H100 and Blackwell B200 platforms, plus orchestration stacks (Confidential Containers, Edgeless Marblerun, Fortanix Confidential AI, Anjuna Seaglass) and bespoke implementations such as Apple Private Cloud Compute and Anthropic's Confidential Inference service.^[3]^[4]^[5]^[6]^[7]

The motivation is sharp: frontier LLM inference and fine-tuning increasingly involve sensitive enterprise data running against proprietary weights on third-party clouds, creating threat models in which the cloud provider, host kernel, co-tenant VMs, and even the model owner's own operators are all potential adversaries. TEEs collapse that trust boundary to the silicon and a small attestable software base, providing a middle ground between leaving data plaintext on a hyperscaler and the still-impractical performance overheads of fully homomorphic encryption.^[8] As of May 2026, confidential GPU offerings backed by NVIDIA H100, H200 and Blackwell B200 are generally available across Microsoft Azure, Google Cloud, and several specialty providers, and Anthropic, Apple, and Fortanix have all announced production TEE-based inference systems.^[4]^[5]^[6]^[9]

Background

Origins of confidential computing

The general idea of a hardware-protected enclave predates ML by decades. ARM TrustZone, shipped in Cortex-A processors from the mid-2000s, partitions a system into "Secure World" and "Normal World," with every bus transaction tagged by an NS bit and transitions mediated by the Secure Monitor Call (SMC) instruction.^[10] TrustZone enabled mobile DRM, secure boot, and trusted application stacks such as OP-TEE, but its single-secure-world model is not naturally suited to multi-tenant cloud workloads.

Intel introduced Software Guard Extensions (SGX) in 2015 with Skylake processors. SGX defines user-mode "enclaves," ring-3 regions of encrypted memory whose contents are decrypted only inside the CPU package and are opaque to the operating system, hypervisor, and System Management Mode firmware.^[11] SGX shipped with a measurement-and-attestation protocol (MRENCLAVE, MRSIGNER, and quoting enclaves issuing reports signed under Intel's EPID or DCAP infrastructure), which made it the first commodity hardware to support practical remote attestation of arbitrary code.^[11] AMD followed in 2016 with Secure Encrypted Virtualization (SEV) on EPYC, encrypting guest VM memory under a per-VM key managed by the AMD Secure Processor. SEV-ES (Encrypted State, 2017) added register-state encryption, and SEV-SNP (Secure Nested Paging, 2020) added integrity protection against replay, remapping, and aliasing by a malicious hypervisor.^[12]

The Confidential Computing Consortium was established under the Linux Foundation on 17 October 2019 with founding premier members Alibaba, ARM, Google Cloud, Huawei, Intel, Microsoft, and Red Hat, and general members including Baidu, ByteDance, Fortanix, Oasis Labs, Swisscom, Tencent, and VMware.^[13] It defines confidential computing as "the protection of data in use by performing computation in a hardware-based, attested trusted execution environment."^[1]

From per-process enclaves to confidential VMs

SGX's small enclave-page-cache (originally 128 MB) and the cost of context-switching across the enclave boundary made it awkward to run large applications, and an unending stream of microarchitectural attacks (Foreshadow / L1TF, ZombieLoad, LVI, SGAxe, ÆPIC Leak, SGX.fail) eroded user confidence.^[14]^[15] Intel responded with Trust Domain Extensions (TDX), a hypervisor-grade TEE that wraps an entire guest VM as a "trust domain" with encrypted memory and protected register state, and which became generally available with 5th-generation Xeon Scalable processors (Emerald Rapids) on 14 December 2023.^[16]^[17] In parallel, Intel announced that SGX was deprecated on 11th- and 12th-generation Core client CPUs starting in 2021, while keeping it active on the Xeon server line.^[18]

ARM introduced the Realm Management Extension (RME) in the Armv9-A architecture, the principal hardware feature underlying ARM Confidential Compute Architecture (CCA). CCA adds a fourth "Realm World" alongside Normal, Secure, and Root worlds, allowing lower-privileged software to protect itself even from a malicious hypervisor.^[19]^[20]

AWS shipped Nitro Enclaves in October 2020 as an EC2 feature that carves an isolated, network-free, storage-free child VM out of a parent instance, communicating only over a vsock channel and attestable through a Nitro hypervisor signing service. Nitro Enclaves does not rely on Intel SGX or AMD SEV but instead on the Nitro System's own hypervisor isolation.^[21]

From CPU enclaves to GPU TEEs

Until 2022, confidential computing was effectively CPU-only, and any workload that touched a GPU left the trust boundary the moment data crossed PCIe. NVIDIA changed that with H100 (Hopper, GA March 2023), the first GPU with a hardware Trusted Execution Environment. H100 boots into a "Confidential Computing" mode under control of an on-die security processor, runs a measured firmware, encrypts and signs all command buffers and CUDA kernels that cross the PCIe bus, and produces a GPU attestation report signed by a per-chip key rooted in NVIDIA's PKI.^[3]^[22] Data transit between CPU and GPU uses an encrypted "bounce buffer" in shared memory, since standard PCIe is not natively encrypted.^[22]

NVIDIA Blackwell (announced GTC March 2024) extended this with the first TEE-I/O capable GPU, providing inline link-level encryption over NVLink and NVSwitch, eliminating the bounce-buffer overhead and bringing multi-GPU confidential workloads under the same attested envelope.^[4] AMD's MI300X takes a different approach: rather than a GPU-internal TEE, the GPU is admitted into the host's SEV-SNP envelope through Trusted I/O (TDISP/SEV-TIO), with attestation produced at the VM level.^[23]

Technical details

TEE primitives

Every TEE provides three core primitives:

Sealed memory. Pages belonging to the TEE are encrypted by hardware (commonly AES-XTS or AES-GCM with a per-VM or per-enclave key) under a key that never leaves the silicon. The CPU enforces that loads and stores from outside the TEE either fault or read ciphertext.
Remote attestation. The hardware measures the initial state of the TEE (firmware version, loaded image hash, configuration) and signs a report with a hardware-attestable key. A verifier checks the signature against a vendor PKI, validates that the measurement matches the expected workload, and only then provisions secrets or data.^[24]
Secure I/O. Because most useful workloads touch storage, network, or accelerators, modern TEEs add mechanisms for encrypted DMA, vsock-style local channels, or PCIe link encryption so that data exiting the encrypted memory region remains protected end to end.

CPU TEEs

TEE	Vendor	Granularity	Year first GA	Threat model	Notable attacks
TrustZone	ARM	Whole device (Secure World)	~2005	Untrusted Normal World OS	Various TA-specific
SGX	Intel	Per-process enclave	2015	Untrusted OS / hypervisor	Foreshadow, LVI, ÆPIC, SGX.fail
SEV / SEV-ES / SEV-SNP	AMD	Whole guest VM	2016 / 2017 / 2020	Untrusted hypervisor	SEV-step, WeSee, BadRAM
TDX	Intel	Whole guest VM (trust domain)	2023 (5th-gen Xeon)	Untrusted hypervisor, SMM	TDX-Down (mitigated)
CCA / RME	ARM	Realm VM	2024 (Neoverse V3 / N3)	Untrusted hypervisor	Research evolving
Nitro Enclaves	AWS	Child VM	2020	Untrusted parent instance	Parent-side oracle attacks

SEV-SNP attestation is grounded in the Versioned Chip Endorsement Key (VCEK), a per-chip key derived from a unique secret and the reported TCB versions (SP bootloader SVN, SP OS SVN, SNP firmware SVN, microcode patch level). VCEK certificates chain to the AMD SEV CA and an AMD Root Key, allowing a remote verifier to confirm that the report was produced by a specific genuine EPYC at a specific patch level.^[25] Intel TDX uses an analogous DCAP-style quoting infrastructure with provisioning certification keys signed by Intel.
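The way a versioned key depends on both a chip secret and the reported TCB versions can be illustrated with a toy derivation. This is a minimal sketch under stated assumptions: AMD's actual derivation runs inside the Secure Processor and is not the HMAC-SHA384 construction used here, and `TcbVersion`, `derive_versioned_key`, the byte encoding, and all version numbers are hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class TcbVersion:
    """The four TCB components named in the text (values are illustrative)."""
    bootloader_svn: int
    sp_os_svn: int
    snp_fw_svn: int
    microcode_svn: int

    def encode(self) -> bytes:
        # Hypothetical fixed-width encoding: one byte per component.
        return bytes([self.bootloader_svn, self.sp_os_svn,
                      self.snp_fw_svn, self.microcode_svn])


def derive_versioned_key(chip_secret: bytes, tcb: TcbVersion) -> bytes:
    """Toy stand-in for a versioned key derivation (NOT AMD's real KDF)."""
    return hmac.new(chip_secret, b"versioned-key" + tcb.encode(),
                    hashlib.sha384).digest()


chip_secret = bytes(range(32))        # stand-in for the per-chip secret
old_tcb = TcbVersion(3, 0, 8, 115)
new_tcb = TcbVersion(3, 0, 9, 115)    # firmware SVN raised by a patch

old_key = derive_versioned_key(chip_secret, old_tcb)
new_key = derive_versioned_key(chip_secret, new_tcb)
```

Because every TCB component feeds the derivation, the two keys differ, which is the property that lets a verifier tie a signed report to a specific patch level rather than only to a specific chip.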

GPU TEEs

NVIDIA's H100 supports two relevant modes. In Single GPU Passthrough Confidential Computing (SPT-CC) a single H100 is passed through to one Confidential VM, the GPU boots a measured firmware, and CPU-GPU data crosses an encrypted bounce buffer in shared system memory. All command buffers and CUDA kernels are encrypted and signed before crossing PCIe.^[22]^[26] In Multi-Instance GPU CC (MIG-CC) the GPU's MIG partitions each get their own confidential context. The GPU attestation report is signed by the on-die security processor and includes a measurement of the booted GPU firmware and the VBIOS.^[3]

Performance studies show that H100 CC mode imposes modest overhead. Zhu et al. (2024, arXiv:2409.03992) benchmark Llama-2 / Llama-3 inference on H100 and find total overhead below ~7% for typical LLM queries, with overhead approaching zero on Llama-3.1-70B and other large models because compute dominates I/O.^[26] Time to first token (TTFT) carries larger relative overhead than inter-token latency, confirming that the bottleneck is encrypted PCIe traffic, not in-GPU computation.^[26] Subsequent work by Corvex on HGX B200 reports near-identical throughput to non-CC mode because Blackwell encrypts NVLink and NVSwitch inline rather than going through a bounce buffer.^[4]
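The amortization argument can be made concrete with a back-of-the-envelope model in which confidential mode slows only CPU-GPU transfers and leaves in-GPU compute unchanged. The function name `cc_relative_overhead` and every number below are illustrative assumptions, not measurements from the cited studies.

```python
def cc_relative_overhead(compute_s: float, gigabytes_moved: float,
                         plain_gb_per_s: float, encrypted_gb_per_s: float) -> float:
    """Fractional slowdown when only CPU-GPU transfers get slower in CC mode."""
    baseline = compute_s + gigabytes_moved / plain_gb_per_s
    confidential = compute_s + gigabytes_moved / encrypted_gb_per_s
    return (confidential - baseline) / baseline


# Hypothetical request: 10 MB crosses PCIe either way.
# Assumed bandwidths: 50 GB/s in the clear, 5 GB/s through the bounce buffer.
small_model = cc_relative_overhead(compute_s=0.05, gigabytes_moved=0.01,
                                   plain_gb_per_s=50.0, encrypted_gb_per_s=5.0)
large_model = cc_relative_overhead(compute_s=2.0, gigabytes_moved=0.01,
                                   plain_gb_per_s=50.0, encrypted_gb_per_s=5.0)

print(f"small model: {small_model:.2%}, large model: {large_model:.2%}")
```

Under these assumed inputs the extra transfer time is the same in both cases, so the relative overhead shrinks as compute time grows. The same reasoning suggests why time to first token, which includes moving the prompt, is hit harder than steady-state token generation.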

Attestation flow

A canonical confidential-inference attestation flow works as follows:

1. The cloud control plane launches a confidential VM (SEV-SNP or TDX) with a measured guest image and a pass-through GPU in CC mode.
2. At guest boot, the VM's vTPM (a virtual TPM rooted in the platform) extends Platform Configuration Registers (PCRs) with each loaded component.
3. The guest requests an attestation report from the SEV-SNP / TDX hardware (containing a measurement of the launch state and a nonce supplied by the verifier) and a GPU attestation report from the GPU's security processor.
4. The verifier (Azure Attestation, AWS KMS attestation for Nitro Enclaves, the Google Confidential Space verifier, or a customer-controlled service) checks both signatures against vendor PKIs, compares measurements to a known-good policy, and on success releases a workload key from a key-management system (KMS) into the VM.
5. The VM uses the key to decrypt model weights, fetch encrypted user data, run inference, and return ciphertext.
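The steps above can be sketched end to end with the standard library. This is a simplified illustration, not any vendor's protocol: HMAC with a shared key stands in for the hardware's asymmetric signature and certificate chain, and `measure`, `sign_report`, `verify_report`, `release_key`, the key names, and the component names are all hypothetical.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from typing import List, Optional


def pcr_extend(pcr: bytes, component: bytes) -> bytes:
    """TPM-style extend: new PCR = H(old PCR || H(component))."""
    return hashlib.sha256(pcr + hashlib.sha256(component).digest()).digest()


def measure(components: List[bytes]) -> bytes:
    pcr = bytes(32)  # registers start at all zeros
    for component in components:
        pcr = pcr_extend(pcr, component)
    return pcr


@dataclass(frozen=True)
class Report:
    measurement: bytes
    nonce: bytes
    signature: bytes


def sign_report(hw_key: bytes, measurement: bytes, nonce: bytes) -> Report:
    # Assumption: HMAC replaces the hardware's asymmetric signature.
    sig = hmac.new(hw_key, measurement + nonce, hashlib.sha256).digest()
    return Report(measurement, nonce, sig)


def verify_report(hw_key: bytes, report: Report,
                  expected: bytes, nonce: bytes) -> bool:
    recomputed = hmac.new(hw_key, report.measurement + report.nonce,
                          hashlib.sha256).digest()
    good_sig = hmac.compare_digest(report.signature, recomputed)
    return good_sig and report.nonce == nonce and report.measurement == expected


def release_key(cpu_ok: bool, gpu_ok: bool, workload_key: bytes) -> Optional[bytes]:
    """The KMS hands out the key only if both reports pass."""
    return workload_key if (cpu_ok and gpu_ok) else None


CPU_KEY, GPU_KEY = b"cpu-vendor-key", b"gpu-vendor-key"
guest_image = [b"firmware", b"kernel", b"model-loader"]
gpu_firmware = [b"gpu-firmware", b"vbios"]
policy_cpu, policy_gpu = measure(guest_image), measure(gpu_firmware)

nonce = secrets.token_bytes(16)  # fresh per request, prevents replay
cpu_report = sign_report(CPU_KEY, measure(guest_image), nonce)
gpu_report = sign_report(GPU_KEY, measure(gpu_firmware), nonce)

key = release_key(verify_report(CPU_KEY, cpu_report, policy_cpu, nonce),
                  verify_report(GPU_KEY, gpu_report, policy_gpu, nonce),
                  workload_key=b"weights-decryption-key")
```

Changing any measured component, or replaying a report under a different nonce, makes `verify_report` return `False`, so `release_key` returns `None` and the weights stay encrypted.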

This pattern is implemented in production by Azure Confidential VM Guest Attestation,^[27] Google Confidential Space,^[28] AWS KMS Attestation for Nitro Enclaves,^[21] Apple Private Cloud Compute,^[5] and the Anthropic Confidential Inference architecture.^[6]

Implementations and adoption

Cloud provider offerings

Microsoft Azure offers AMD SEV-SNP confidential VMs on the DCasv5 / ECasv5 family (3rd-gen AMD EPYC) and Intel TDX confidential VMs on the DCesv5 / ECesv5 family (4th-gen Xeon), both with guest-attestation hooks against Microsoft Azure Attestation and Azure Key Vault.^[29] Azure announced general availability of Intel TDX confidential VMs in early 2026 and provides NVIDIA H100 Confidential Computing through the NCCads H100 v5 series under AMD SEV-SNP, plus an Intel TDX + H100 / H200 stack as a regional option.^[29]

Google Cloud runs Confidential VMs on AMD SEV, SEV-SNP, and Intel TDX, layered with Confidential Space, a hardened Container-Optimized OS image that runs a workload container inside a confidential VM and integrates with Workload Identity Federation so secrets are only released to attested workloads. Confidential Space reached general availability in 2023 and added Intel TDX with NVIDIA Confidential Computing in preview in 2024.^[30]^[28]

Amazon Web Services offers SEV-SNP on M6a, C6a, and R6a instance types (3rd-gen EPYC),^[31] alongside Nitro Enclaves on most EC2 instances. Nitro Enclaves are widely used for cryptographic key handling and have been adopted by Anjuna for its Seaglass platform.^[32] AWS does not currently expose Intel TDX in EC2.

ML-specific stacks

Apple Private Cloud Compute (PCC), announced 10 June 2024 alongside Apple Intelligence, runs LLM inference for Apple Intelligence on custom Apple-silicon servers. The published architecture has five requirements: stateless computation, enforceable guarantees, no privileged runtime access, non-targetability, and verifiable transparency, the last of which is implemented by publishing every production PCC build image and inviting researchers to verify that the running software matches the published source.^[5] PCC uses Apple's Secure Enclave technology and a hardened iOS/macOS subset; it is one of the largest-scale TEE-based ML deployments outside of cloud-provider primitives.

Anthropic Confidential Inference via Trusted Virtual Machines, described in a paper by Anthropic and Pattern Labs (now Irregular) published 18 June 2025, runs inference for selected Claude customers inside SEV-SNP / TDX confidential VMs with NVIDIA H100 / H200 GPUs in CC mode. The design isolates a minimal "model loader" inside the TEE that decrypts weights only after attestation, signs releases via a CI server, and pairs the inference VM with an API server that handles user-data decryption. The paper explicitly maps its design against RAND's Security Level 4 (SL4) and Security Level 5 (SL5) weight-protection tiers for frontier-model security.^[6]

Fortanix Confidential AI wraps NVIDIA Confidential Computing GPUs (H100, H200, Blackwell) with a key-management and attestation control plane targeted at enterprise model serving; in October 2025 Fortanix announced a joint solution with NVIDIA for "agentic AI" on confidential GPUs, and in March 2026 announced a Confidential AI release covering both inference and model-IP protection in enterprise AI factories.^[7]

Edgeless Marblerun is an open-source service mesh for confidential workloads, originally built for Intel SGX enclaves and now extended to Intel TDX and AMD SEV-SNP via the related Contrast project. Marblerun supplies one cluster-wide attestation per manifest, plus mTLS provisioning and secret distribution to "Marbles" running inside enclaves built with EGo, Gramine, or Occlum.^[33]

Anjuna Seaglass is a commercial platform that lifts unmodified Linux applications into Intel SGX/TDX, AMD SEV-SNP, or AWS Nitro Enclaves without code changes, and is one of the more prominent vendors integrating AWS Nitro Enclaves with Kubernetes-style deployments.^[32]

Confidential Containers (CoCo) is a CNCF sandbox project that runs each Kubernetes pod inside its own TEE using a Kata Containers microVM as the carrier, supporting Intel TDX, AMD SEV-ES/SNP, and IBM Secure Execution. CoCo's in-guest components (Kata Agent, Attestation Agent, Confidential Data Hub) handle attestation, image decryption, and key release, dramatically shrinking the Trusted Computing Base relative to a normal Kubernetes node.^[34]

Comparative summary

System	TEE technology	Workload	Status (2026-05)
Apple Private Cloud Compute	Apple-silicon Secure Enclave	Apple Intelligence inference	Production since iOS 18 (2024)
Anthropic Confidential Inference	SEV-SNP / TDX + H100/H200 CC	Selected Claude inference	Architecture published Jun 2025
Azure Confidential GPU	SEV-SNP + H100 CC	Customer AI workloads	General availability
Google Confidential Space + CC GPU	TDX + H100 CC	Customer AI workloads	TDX+H100 in preview
AWS Nitro Enclaves	Nitro hypervisor	Generic confidential workloads	GA since 2020
Fortanix Confidential AI	NVIDIA CC GPUs + KMS	Enterprise model IP	GA
Edgeless Marblerun / Contrast	SGX, TDX, SEV-SNP	Service mesh for confidential apps	Open source
Confidential Containers	Kata + TDX / SEV-ES / SE	Confidential Kubernetes	CNCF sandbox

Applications

Confidential inference

The most mature application is "confidential inference": a customer sends prompts to a model deployed inside a TEE, with hardware attestation guaranteeing that neither the model provider nor the cloud operator can see plaintext requests or outputs. This is the architecture used in Apple PCC, Anthropic Confidential Inference, and Azure / Google / Fortanix offerings.^[5]^[6]^[7] Recent benchmark work shows that H100 CC adds only a few percent of latency to typical LLM queries, making it economically tractable for production deployment.^[26]

Model-IP protection

A symmetric concern is protecting proprietary model weights from the customers running them. If a model owner pushes weights to an enterprise cluster, even legitimate enterprise admins can in principle exfiltrate the weights. TEE-based deployment lets the model owner encrypt weights with a key that is only released after the customer's hardware attests to running the agreed-on inference binary, blocking trivial extraction. Anthropic explicitly cites RAND's SL4 / SL5 weight-protection tiers as the bar its confidential inference design aims to meet.^[6]

Confidential training and federated learning

Several research and industry deployments use TEEs to support privacy-preserving collaborative training: data from multiple parties is uploaded encrypted to a TEE, the TEE attests to an agreed training recipe, decrypts the data only inside the encrypted boundary, and outputs only a trained model (or its gradients). This is positioned as a counterpart to federated learning approaches and is implemented in production by Google Confidential Space's "joint data analysis and ML training" pattern.^[28]
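The multi-party gating in this pattern can be sketched as follows: each data owner pins the measurement of the agreed training recipe and hands over its data key only when the attested measurement matches. This is a minimal sketch that assumes the attested measurement has already been authenticated against the hardware vendor's PKI; `recipe_measurement`, `DataOwner`, `collect_keys`, and the party names are hypothetical.

```python
import hashlib
import hmac
from typing import Dict, List, Optional


def recipe_measurement(training_code: bytes, hyperparams: bytes) -> bytes:
    """Hash of the agreed training recipe (hypothetical measurement format)."""
    return hashlib.sha256(
        hashlib.sha256(training_code).digest()
        + hashlib.sha256(hyperparams).digest()
    ).digest()


class DataOwner:
    def __init__(self, name: str, data_key: bytes, approved: bytes) -> None:
        self.name = name
        self._data_key = data_key
        self._approved = approved  # measurement this owner agreed to

    def release_key(self, attested: bytes) -> Optional[bytes]:
        # Release the data key only to the exact recipe that was approved.
        if hmac.compare_digest(attested, self._approved):
            return self._data_key
        return None


def collect_keys(owners: List[DataOwner],
                 attested: bytes) -> Optional[Dict[str, bytes]]:
    """Training may start only if every owner releases its key."""
    keys: Dict[str, bytes] = {}
    for owner in owners:
        key = owner.release_key(attested)
        if key is None:
            return None
        keys[owner.name] = key
    return keys


agreed = recipe_measurement(b"train.py v1", b"epochs=3,lr=1e-4")
owners = [DataOwner("hospital_a", b"key-a", agreed),
          DataOwner("hospital_b", b"key-b", agreed)]

keys_ok = collect_keys(owners, agreed)
keys_bad = collect_keys(
    owners, recipe_measurement(b"train.py v1 + exfiltrate", b"epochs=3,lr=1e-4"))
```

With the agreed recipe both keys are released; with a modified training script the measurement changes, every owner refuses, and `collect_keys` returns `None`.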

Mitigating data-side privacy attacks

TEEs do not directly defend against statistical privacy leaks from a model's outputs (e.g. membership inference or training-data extraction), but they complement those defenses by ensuring that the attacker cannot bypass the model's API by directly reading server memory. Combined with differential privacy noise on training, TEEs are the standard hardware leg of a defense-in-depth privacy stack.

Compute provenance and AI governance

A less-explored but rapidly emerging application is using attested TEEs as a substrate for AI compute governance: regulators or evaluators can require that frontier training or inference happen inside attestable hardware, so that they can later verify a workload's identity and policy compliance without trusting the operator's claims. Anthropic's Confidential Inference paper notes this as a motivation: confidential computing provides a cryptographic chain of trust that "attests to software security and enforces rules about which software is allowed to use encryption keys."^[6]

Limitations

Performance overhead

Performance overhead has fallen rapidly but has not disappeared. For LLM inference on H100 CC mode, published benchmarks show single-digit-percent overhead for typical queries (below roughly 7 percent), falling to near zero for large models or long sequences, because the cost of the encrypted bounce buffer is amortized across longer compute.^[26] Training workloads, which are more I/O-heavy and rely on collective operations across many GPUs, historically suffered worse penalties under H100 because each NVLink crossing exited the TEE. NVIDIA Blackwell's TEE-I/O extends encryption across NVLink and NVSwitch, eliminating the bounce-buffer cost and reportedly delivering near-identical throughput to non-CC mode.^[4]

For SGX-style per-process enclaves, classical applications saw 30 to 200 percent overheads depending on enclave page misses, and the original 128 MB enclave page cache forced costly EPC paging on larger working sets. These limitations are part of why the industry has shifted toward VM-grade TEEs (TDX, SEV-SNP) for ML workloads.

Side-channel and microarchitectural attacks

Hardware TEEs are subject to an active research front of side-channel and microarchitectural attacks. Notable examples include:

Foreshadow / L1TF (2018): speculative execution leak that read SGX enclave memory via L1 cache.
LVI (Load Value Injection, 2020): a "reverse Meltdown" that injects attacker-controlled values into transiently executed loads, bypassing earlier Spectre / Meltdown mitigations and requiring expensive compiler-side fences in SGX code.^[14]
ÆPIC Leak (CVE-2022-21233, 2022): the first architectural (non-side-channel) SGX leak, in which the APIC MMIO undefined range on 10th-12th-generation Intel CPUs returned stale cache-line data including enclave secrets.^[15]
SGX.fail (2023): a survey of published SGX attacks and of how slowly fixes for them reached deployed systems.
SpecHammer (IEEE S&P 2022): combined Rowhammer bit flips with Spectre to relax the requirements for a Spectre v1 gadget, demonstrating a class of attacks that crosses microarchitectural and DRAM-level boundaries.^[35]
BadRAM (CVE-2024-21944, 2024): a $10 hardware attack against AMD SEV-SNP that modifies SPD metadata in DDR4/DDR5 DIMMs to alias memory regions, breaking the cryptographic attestation guarantees of SEV-SNP. AMD issued firmware updates to validate memory topology at boot.^[36]
WeSee (2024): an attack in which a malicious hypervisor injects #VC interrupts into an SEV-SNP guest to break its confidentiality and integrity guarantees.

These attacks reinforce that a TEE's security is not a single binary property but a function of the specific microcode and firmware versions running on a specific physical chip, and that vendors and customers must aggressively patch and re-attest.
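In verifier terms, "patch and re-attest" usually means enforcing a minimum version for each TCB component before releasing secrets. The sketch below assumes the four SEV-SNP components named earlier in the article; `Tcb`, `meets_minimum`, and the version numbers are hypothetical.

```python
from typing import NamedTuple


class Tcb(NamedTuple):
    bootloader: int
    sp_os: int
    snp_fw: int
    microcode: int


def meets_minimum(reported: Tcb, minimum: Tcb) -> bool:
    """Every component must be at or above the policy floor."""
    return all(r >= m for r, m in zip(reported, minimum))


policy_floor = Tcb(bootloader=3, sp_os=0, snp_fw=9, microcode=115)

patched = Tcb(bootloader=3, sp_os=0, snp_fw=9, microcode=120)
stale_fw = Tcb(bootloader=4, sp_os=0, snp_fw=8, microcode=200)

accept_patched = meets_minimum(patched, policy_floor)
accept_stale = meets_minimum(stale_fw, policy_floor)
```

The comparison is deliberately component-wise. A plain tuple comparison would accept `stale_fw`, because its bootloader version is higher, even though its firmware is below the floor.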

Trust in vendor PKIs

A TEE's attestation guarantees are only as strong as the vendor PKI rooted at Intel, AMD, ARM, or NVIDIA. A compromise of a vendor signing key or certificate authority would invalidate trust in every chip whose attestation chain depends on it. Customers cannot independently audit vendor key-management practices, which means the threat model "vs. supply-chain attacker" is only partially addressed. Apple's PCC mitigates one corner of this by publishing every production build for independent verification.^[5]

Threat model gaps

TEEs do not protect against:

Workload bugs. A vulnerable model server inside the TEE can still leak data through its API, e.g. via prompt-injection or membership inference attacks against outputs.
Side channels in shared resources. Even with encrypted memory, contention on caches, branch predictors, DRAM banks, or PCIe links can leak information across the boundary.
Physical attackers with arbitrary capability. SEV-SNP and TDX assume that the attacker cannot do full chip decapsulation or sophisticated bus probing. BadRAM showed that a cheap physical attack on a DIMM's SPD metadata can sidestep that assumption.^[36]
Statistical model leakage. Data-poisoning attacks during training, model-extraction attacks from query traffic, and training-data memorization are orthogonal to TEE protections.

Comparison to FHE, MPC, and federated learning

TEEs occupy a specific point in the privacy-preserving ML design space. The standard comparison axes are confidentiality, performance, and threat model:

Approach	Confidentiality	Performance	Threat model
TEE / confidential computing	Data and weights in plaintext only inside hardware-isolated memory	Single-digit-percent overhead for LLM inference on H100; near parity reported on Blackwell	Trusts CPU / GPU silicon and vendor PKI; vulnerable to side channels
Fully homomorphic encryption (FHE)	Computation on ciphertext; server never sees plaintext	10^3x to 10^6x overhead; impractical for full LLM inference today	Trusts only the math; no silicon trust required
Secure multi-party computation (MPC)	Inputs secret-shared across non-colluding parties	10x-1000x overhead; collaborative round-trips	Trusts that a threshold of participants is honest
Federated learning	Raw data stays on device; only gradients/updates are shared	Comparable to centralized training	Does not by itself prevent leakage from shared gradients; often combined with DP

TEE-based solutions avoid the data exchange and cryptographic overhead of MPC and FHE, but require trust in the silicon vendor. Federated learning sidesteps centralization but leaks information through shared gradients unless paired with differential privacy. In practice, production deployments combine multiple techniques: e.g. Apple PCC pairs TEEs with on-device computation and differential privacy, and Anthropic's design pairs TEE attestation with strict code-review pipelines.^[5]^[6]
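The MPC row of the table rests on secret sharing, which a few lines can illustrate. This is a toy additive scheme for intuition only, not a production protocol; `share`, `reconstruct`, `add_shared`, and the modulus are illustrative choices.

```python
import secrets
from typing import List

MODULUS = 2**61 - 1  # illustrative modulus for this sketch


def share(value: int, n_parties: int) -> List[int]:
    """Split value into n additive shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares: List[int]) -> int:
    return sum(shares) % MODULUS


def add_shared(a_shares: List[int], b_shares: List[int]) -> List[int]:
    """Each party adds its own two shares locally; no messages are exchanged."""
    return [(a + b) % MODULUS for a, b in zip(a_shares, b_shares)]


a_shares = share(1200, 3)
b_shares = share(34, 3)
total = reconstruct(add_shared(a_shares, b_shares))
```

Addition is free in this scheme, but multiplying shared values generally requires rounds of communication between the parties, which is one source of the overhead listed in the table. A TEE instead computes on plaintext inside the protected boundary.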

Related work

The wider landscape of secrecy-preserving ML overlaps with several adjacent topics that have their own dedicated articles. Homomorphic encryption for machine learning explores ciphertext-native computation as a software-only alternative to hardware TEEs. Federated learning and differential privacy occupy adjacent positions in the privacy-preserving ML stack. The threat models that TEEs aim to mitigate include model extraction attacks, model stealing generally, and membership inference attacks on outputs. Data poisoning sits orthogonal to TEEs since it targets the training pipeline rather than the runtime confidentiality boundary.

See also

GPU-hardware context is provided by the dedicated articles on NVIDIA H100, NVIDIA H200, NVIDIA Blackwell, NVIDIA Blackwell B200, NVIDIA Hopper, and AMD Instinct MI300X. Cloud and product context is covered by Apple Intelligence (the workload behind Private Cloud Compute) and Anthropic (whose Confidential Inference design extends the Claude API).

References

[1] Confidential Computing Consortium, "A Technical Analysis of Confidential Computing v1.3", Linux Foundation, 2022-10-01. https://confidentialcomputing.io/wp-content/uploads/sites/10/2023/03/CCC-A-Technical-Analysis-of-Confidential-Computing-v1.3_unlocked.pdf. Accessed 2026-05-20.
[2] Confidential Computing Consortium, "What Is Remote Attestation? Enhancing Data Governance with Confidential Computing", 2024-10-02. https://confidentialcomputing.io/2024/10/02/what-is-remote-attestation-enhancing-data-governance-with-confidential-computing/. Accessed 2026-05-20.
[3] NVIDIA, "Confidential Computing on H100 GPUs for Secure and Trustworthy AI", NVIDIA Technical Blog, 2023-08-08. https://developer.nvidia.com/blog/confidential-computing-on-h100-gpus-for-secure-and-trustworthy-ai/. Accessed 2026-05-20.
[4] Corvex, "Confidential Computing Meets NVIDIA HGX B200: Secure AI Without the Performance Trade-Off", 2025. https://www.corvex.ai/blog/confidential-computing-meets-nvidia-hgxtm-b200-secure-ai-without-the-performance-trade-off. Accessed 2026-05-20.
[5] Apple Security Research, "Private Cloud Compute: A new frontier for AI privacy in the cloud", 2024-06-10. https://security.apple.com/blog/private-cloud-compute/. Accessed 2026-05-20.
[6] Anthropic, "Confidential Inference via Trusted Virtual Machines", 2025-06-18. https://www.anthropic.com/research/confidential-inference-trusted-vms. Accessed 2026-05-20.
[7] Fortanix, "Fortanix Confidential AI protects proprietary model IP and data for secure AI inference in enterprise AI factories", press release, 2026-03. https://www.fortanix.com/company/pr/2026/03/fortanix-confidential-ai-protects-proprietary-model-ip-and-data-for-secure-ai-inference-in-enterprise-ai-factories. Accessed 2026-05-20.
[8] Anthropic and Pattern Labs (Irregular), "Confidential Inference Systems: Design principles and security risks", whitepaper, 2025-06. https://assets.anthropic.com/m/c52125297b85a42/original/Confidential_Inference_Paper.pdf. Accessed 2026-05-20.
[9] NVIDIA, "NVIDIA Secure AI with Blackwell and Hopper GPUs", WP-12554-001 v1.3, 2025-08. https://docs.nvidia.com/nvidia-secure-ai-with-blackwell-and-hopper-gpus-whitepaper.pdf. Accessed 2026-05-20.
[10] ARM Ltd., "Arm TrustZone Technology", Arm Architecture security features. https://www.arm.com/technologies/trustzone-for-cortex-a. Accessed 2026-05-20.
[11] Intel, "Intel Software Guard Extensions (Intel SGX) Overview", Intel Developer Zone. https://www.intel.com/content/www/us/en/products/docs/accelerator-engines/software-guard-extensions.html. Accessed 2026-05-20.
[12] AMD, "AMD SEV-SNP: Strengthening VM Isolation with Integrity Protection and More", AMD white paper, 2020-01. https://www.amd.com/content/dam/amd/en/documents/epyc-business-docs/white-papers/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf. Accessed 2026-05-20.
[13] Linux Foundation, "Confidential Computing Consortium Establishes Formation with Founding Members and Open Governance Structure", press release, 2019-10-17. https://www.linuxfoundation.org/press/press-release/confidential-computing-consortium-establishes-formation-with-founding-members-and-open-governance-structure-2. Accessed 2026-05-20.
[14] Van Bulck et al., "LVI: Hijacking Transient Execution through Microarchitectural Load Value Injection", IEEE Symposium on Security and Privacy, 2020-05. https://lviattack.eu/. Accessed 2026-05-20.
[15] Borrello et al., "AEPIC Leak: Architecturally Leaking Uninitialized Data from the Microarchitecture", USENIX Security 2022, 2022-08. https://aepicleak.com/. Accessed 2026-05-20.
[16] Intel, "Intel Trust Domain Extensions (Intel TDX) Overview", Intel product documentation. https://www.intel.com/content/www/us/en/products/docs/accelerator-engines/trust-domain-extensions.html. Accessed 2026-05-20.
[17] Intel Newsroom, "New 5th Gen Intel Xeon Processors Are Built with AI Acceleration in Every Core", 2023-12-14. https://newsroom.intel.com/artificial-intelligence/5th-gen-xeon-data-center-news. Accessed 2026-05-20.
[18] Intel Support, "Will Intel Software Guard Extensions (Intel SGX) Be Deprecated on Selected Processors?", Intel knowledge base. https://www.intel.com/content/www/us/en/support/articles/000089326/software/intel-security-products.html. Accessed 2026-05-20.
[19] ARM Ltd., "Arm Confidential Compute Architecture", architecture overview. https://www.arm.com/architecture/security-features/arm-confidential-compute-architecture. Accessed 2026-05-20.
[20] ARM Ltd., "Realm Management Extension (RME) System Architecture", developer learning path. https://learn.arm.com/learning-paths/cross-platform/cca_rme/cca/. Accessed 2026-05-20.
[21] Amazon Web Services, "AWS Nitro Enclaves", product page. https://aws.amazon.com/ec2/nitro/nitro-enclaves/. Accessed 2026-05-20.
[22] NVIDIA, "Confidential Compute on NVIDIA Hopper H100", whitepaper WP-11459-001. https://images.nvidia.com/aem-dam/en-zz/Solutions/data-center/HCC-Whitepaper-v1.0.pdf. Accessed 2026-05-20.
[23] AMD, "Helping Secure GPUs That Advance AI", AMD blog, 2025. https://www.amd.com/en/blogs/2025/helping-secure-gpus-that-advance-ai.html. Accessed 2026-05-20.
[24] Menetrey et al., "Attestation Mechanisms for Trusted Execution Environments Demystified", arXiv:2206.03780, 2022-06. https://arxiv.org/abs/2206.03780. Accessed 2026-05-20.
[25] AMD, "AMD SEV-SNP Attestation: Establishing Trust in Guests", developer guide. https://www.amd.com/content/dam/amd/en/documents/developer/lss-snp-attestation.pdf. Accessed 2026-05-20.
[26] Zhu et al., "Confidential Computing on NVIDIA Hopper GPUs: A Performance Benchmark Study", arXiv:2409.03992, 2024-09-06. https://arxiv.org/abs/2409.03992. Accessed 2026-05-20.
[27] Microsoft, "Azure Confidential VM guest attestation design detail", Microsoft Learn. https://learn.microsoft.com/en-us/azure/confidential-computing/guest-attestation-confidential-virtual-machines-design. Accessed 2026-05-20.
[28] Google, "Confidential Space overview", Google Cloud documentation. https://docs.cloud.google.com/confidential-computing/confidential-space/docs/confidential-space-overview. Accessed 2026-05-20.
[29] Microsoft, "Azure Confidential VM options", Microsoft Learn. https://learn.microsoft.com/en-us/azure/confidential-computing/virtual-machine-options. Accessed 2026-05-20.
[30] Google Cloud Blog, "Confidential Space reaches GA, now ready for everyone to use", 2023. https://cloud.google.com/blog/products/identity-security/confidential-space-is-ga. Accessed 2026-05-20.
[31] Amazon Web Services, "AMD SEV-SNP for Amazon EC2 instances", EC2 User Guide. https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/sev-snp.html. Accessed 2026-05-20.
[32] Anjuna, "Anjuna Seaglass: Universal Confidential Computing", product page. https://www.anjuna.io/product. Accessed 2026-05-20.
[33] Edgeless Systems, "MarbleRun: the service mesh for confidential computing", product page. https://www.edgeless.systems/products/marblerun. Accessed 2026-05-20.
[34] Confidential Containers project, "Design Overview", documentation. https://confidentialcontainers.org/docs/architecture/design-overview/. Accessed 2026-05-20.
[35] Tobah et al., "SpecHammer: Combining Spectre and Rowhammer for New Speculative Attacks", IEEE Symposium on Security and Privacy, 2022. https://andrewkwong.org/docs/oakland22-tobah.pdf. Accessed 2026-05-20.
[36] De Meulemeester et al., "BadRAM: Practical Memory Aliasing Attacks on Trusted Execution Environments", IEEE Symposium on Security and Privacy, 2025 (CVE-2024-21944). https://badram.eu/badram.pdf. Accessed 2026-05-20.
