NVIDIA Digits (originally announced as Project DIGITS, later renamed DGX Spark) is a personal AI supercomputer developed by NVIDIA and co-designed with MediaTek. Unveiled at CES 2025 in January, the compact desktop device is built around the NVIDIA GB10 Grace Blackwell Superchip, delivering up to 1 petaflop of AI performance at FP4 precision with 128 GB of unified memory. It is designed to let AI researchers, developers, data scientists, and students prototype, fine-tune, and run large language models with up to 200 billion parameters on a single desktop machine. Priced at $3,000 when first announced (later adjusted to $3,999 at retail launch), the device began shipping in October 2025 and has since received performance updates that boosted speeds up to 2.5 times over its launch configuration [1][2][3].
NVIDIA CEO Jensen Huang introduced Project DIGITS during his keynote address at CES on January 6, 2025. Huang positioned the device as a way to "put Grace Blackwell on every desk and at every AI developer's fingertips," bringing data-center-class AI compute to a form factor small enough to sit on a desktop [1].
The product was initially called Project DIGITS during its CES reveal. In March 2025, at NVIDIA's GTC conference, the company renamed it DGX Spark to align with the broader DGX product family, which includes the DGX Station (a higher-end desktop AI workstation) and DGX Cloud (NVIDIA's cloud AI platform). Despite the rename, the product is still referred to as Digits, Project DIGITS, and DGX Spark in developer communities [4].
The core of the DGX Spark is the GB10 Grace Blackwell Superchip, a custom system-on-chip co-designed by NVIDIA and MediaTek.
The GB10 is a multi-die package that combines two major silicon blocks, an NVIDIA Blackwell GPU die and a MediaTek-designed Arm CPU die, connected via NVIDIA's NVLink-C2C (chip-to-chip) interconnect over an interposer in a 2.5D integration approach.
Both CPU and GPU dies are fabricated on TSMC's 3nm process, making the GB10 technically the most advanced Blackwell product in terms of process node. The C2C interconnect between CPU and GPU provides approximately 600 GB/s of aggregate bandwidth, ensuring that the GPU can access system memory at high speed without being bottlenecked by a PCIe bus [12].
The GB10 features 128 GB of coherent unified system memory using LPDDR5X DRAM. Unlike discrete GPU systems where memory is split between the CPU (system DRAM) and GPU (HBM or GDDR), the GB10's unified architecture allows the entire 128 GB pool to be allocated to GPU workloads. The Blackwell GPU connects to system memory through memory controllers located in the MediaTek CPU die, with the C2C interconnect providing the high-bandwidth bridge.
This unified memory design is critical for running large AI models, which require substantial memory to hold their parameters during inference. A 70-billion-parameter model in FP16 precision requires approximately 140 GB of memory, which exceeds the capacity of any consumer-grade discrete GPU but fits within the DGX Spark's unified pool [1][6].
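The sizing rule behind these figures is simple: weight storage is the parameter count multiplied by the bytes per parameter at the chosen precision. The sketch below is a back-of-envelope estimate only; it ignores KV-cache and activation overhead, which push real-world usage higher.

```python
# Back-of-envelope estimate of memory needed to hold model weights.
# KV cache and activations add overhead on top, so real usage is higher.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "fp4": 0.5}

def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate weight storage in GB (decimal) for a given precision."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

# A 70B model in FP16 needs ~140 GB, more than the 128 GB unified pool.
print(weight_memory_gb(70e9, "fp16"))   # 140.0
# The same model quantized to FP4 needs ~35 GB and fits comfortably.
print(weight_memory_gb(70e9, "fp4"))    # 35.0
# A 200B model in FP4 (~100 GB) fits within the single-unit 128 GB pool.
print(weight_memory_gb(200e9, "fp4"))   # 100.0
```

This arithmetic also explains the headline model-size limits: the 200-billion-parameter single-unit figure assumes low-precision quantized weights, not FP16.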
| Component | Specification |
|---|---|
| Superchip | NVIDIA GB10 Grace Blackwell |
| GPU | Blackwell architecture, 6,144 CUDA cores, 5th-gen Tensor Cores |
| CPU | 20-core Arm (10x Cortex-X925, 10x Cortex-A725) |
| Process node | TSMC 3nm (both dies) |
| AI performance (sparse FP4) | Up to 1 PFLOP |
| AI performance (dense FP4) | ~500 TFLOPS |
| Unified memory | 128 GB LPDDR5X |
| Memory bandwidth | 273 GB/s |
| C2C interconnect bandwidth | ~600 GB/s |
| Storage | Up to 4 TB NVMe SSD |
| Networking | ConnectX-7 (for linking two units) |
| Operating system | DGX OS (based on Ubuntu 24.04 LTS) |
| Max model size (single unit) | 200 billion parameters |
| Max model size (two linked units) | 405 billion parameters |
| Form factor | Compact desktop cube |
| SoC TDP | 140W |
| System power consumption | ~200–250 W (estimated) |
| Price (at launch) | $3,999 USD |
The headline 1 PFLOP figure applies specifically to sparse NVFP4 workloads that exploit structured sparsity. Without sparsity, peak NVFP4 compute is approximately 500 TFLOPS; for FP16 workloads, performance is considerably lower, comparable to a mid-range consumer GPU [12].
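The dense FLOPS figure matters most for compute-bound phases such as prompt processing (prefill). As a rough sketch, a transformer forward pass costs about 2 FLOPs per parameter per token; the 30% sustained-efficiency factor below is an assumption for illustration, not a measured property of the DGX Spark.

```python
# Back-of-envelope prefill time: a forward pass costs roughly 2 FLOPs per
# parameter per token. The 30% sustained-efficiency factor is an assumed,
# illustrative value, not a measurement.
def prefill_seconds(num_params: float, prompt_tokens: int,
                    peak_flops: float, efficiency: float = 0.3) -> float:
    work = 2.0 * num_params * prompt_tokens   # total FLOPs for the prompt
    return work / (peak_flops * efficiency)

# 70B-parameter model, 4,096-token prompt, against the 500 TFLOPS dense
# NVFP4 peak: a few seconds of compute-bound work under these assumptions.
print(round(prefill_seconds(70e9, 4096, 500e12), 2))
```

Under the same assumptions, halving the usable FLOPS (e.g., running without low-precision Tensor Core paths) doubles the prefill time, which is why the precision mode matters so much in practice.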
Using the built-in ConnectX-7 networking, two DGX Spark units can be linked to function as a single system with 256 GB of unified memory. This configuration enables inference on models with up to 405 billion parameters, such as Meta's Llama 3.1 405B [1].
The DGX Spark ships with DGX OS, a lightly customized version of Ubuntu 24.04 LTS developed by NVIDIA. The operating system comes preinstalled with NVIDIA's full AI software stack.
The following components are ready to use from first boot:
| Component | Purpose |
|---|---|
| CUDA Toolkit | GPU programming and compilation |
| cuDNN | Deep learning primitives (convolutions, attention, normalization) |
| TensorRT | Inference optimization and engine building |
| TensorRT-LLM | LLM-specific inference optimization |
| PyTorch | Deep learning framework |
| TensorFlow | Deep learning framework |
| JupyterLab | Interactive development environment |
| NVIDIA AI Workbench | Model development and deployment tool |
| NVIDIA Container Runtime | Docker container GPU access |
| vLLM | Optimized LLM inference serving |
| NVIDIA NIM microservices | Pre-packaged AI model deployment |
The system runs Linux natively. NVIDIA Sync integration allows users to launch local IDEs like VS Code, Cursor, and AI Workbench directly from a web UI, even when connecting remotely [6][7].
The NVIDIA Container Runtime comes preinstalled and configured, enabling Docker containers to access GPU resources transparently. Developers can immediately pull and run GPU-accelerated containers from NVIDIA GPU Cloud (NGC) without additional setup. NGC provides a comprehensive registry of GPU-optimized containers, pre-trained models, and AI/ML software specifically designed for the Grace Blackwell architecture.
Docker Model Runner integration allows developers to pull and run AI models as easily as pulling container images, simplifying the workflow for experimenting with different models [13].
The DGX Spark includes an integrated web-based dashboard for monitoring system utilization, managing JupyterLab sessions, and configuring system settings. The dashboard provides real-time visibility into GPU utilization, memory consumption, and thermal status without requiring SSH access, making it accessible to developers who are not Linux power users.
A key part of NVIDIA's vision for the DGX Spark is its integration with DGX Cloud, NVIDIA's cloud-based AI development platform. The intended workflow is that developers prototype and experiment locally on their DGX Spark, then seamlessly deploy trained models to DGX Cloud or data center infrastructure for production-scale inference or further training. This local-to-cloud pipeline is designed to reduce development friction and cost, since prototyping on local hardware avoids the per-hour costs of cloud GPU instances [1].
The DGX Spark has been tested across a variety of AI workloads by independent reviewers and NVIDIA.
The DGX Spark can load and run very large models including gpt-oss-120B and Llama 3.1 70B. According to the LMSYS Org in-depth review, the system truly shines when serving smaller models (7B-13B parameter range), especially when batching is utilized. For larger models (70B+), the 273 GB/s memory bandwidth becomes the limiting factor for token generation speed, making the device best suited for prototyping and experimentation rather than production serving at scale [12].
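The bandwidth ceiling LMSYS describes follows from a common rule of thumb: at batch size 1, every generated token must stream the full weight set from memory, so token rate is bounded by bandwidth divided by weight size. A hedged sketch using the 273 GB/s figure (real systems land below this ceiling):

```python
# Rule-of-thumb decode ceiling at batch size 1: each generated token streams
# the full weight set from memory, so generation speed is capped at
# (memory bandwidth) / (bytes of weights). Numbers are illustrative upper
# bounds; real systems land below them.
def decode_tokens_per_sec(num_params: float, bytes_per_param: float,
                          bandwidth_gbs: float) -> float:
    weight_gb = num_params * bytes_per_param / 1e9
    return bandwidth_gbs / weight_gb

# 70B model quantized to ~0.5 bytes/param (4-bit) on 273 GB/s:
print(round(decode_tokens_per_sec(70e9, 0.5, 273), 1))   # 7.8
# A 13B model at the same quantization has a much higher ceiling:
print(round(decode_tokens_per_sec(13e9, 0.5, 273), 1))   # 42.0
```

This is why the review finds 7B–13B models the sweet spot for serving: their ceilings are high enough that the Tensor Cores, rather than the memory bus, become the useful resource.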
NVIDIA also demonstrated several performance comparisons at launch, such as image generation workflows running 8x faster than on an M4 Max MacBook Pro [10].
NVIDIA has positioned the DGX Spark for several user groups, including AI researchers, developers, data scientists, and students.
The device fills a gap between consumer-grade hardware (gaming GPUs with 16-24 GB of VRAM) and data center systems (costing tens of thousands of dollars). Before the DGX Spark, developers who needed to run models larger than roughly 30 billion parameters locally had few affordable options [8].
When first announced at CES 2025 in January, NVIDIA set the price at $3,000 with availability targeted for May 2025. By the time the product launched under the DGX Spark name, the retail price had been adjusted to $3,999. The device became available to the public on October 15, 2025, later than the original May target [3][5].
| Event | Date | Price |
|---|---|---|
| CES 2025 announcement (as Project DIGITS) | January 6, 2025 | $3,000 (announced) |
| GTC 2025 rename to DGX Spark | March 2025 | - |
| Public availability | October 15, 2025 | $3,999 |
| CES 2026 performance update (2.5x boost) | January 2026 | $3,999 |
The $999 price increase between announcement and launch generated some discussion in the developer community, though NVIDIA has not publicly explained the adjustment. The DGX Spark is available directly from NVIDIA and through authorized partners [3][9].
At CES 2026 in January, NVIDIA announced software and firmware updates that delivered up to 2.5x performance improvement over the DGX Spark's launch configuration. These gains came from a combination of driver optimizations, SDK updates released in November 2025, and additional improvements announced at the event. The updates were available to all existing DGX Spark owners at no additional cost [9].
The performance improvements came from software and firmware alone, chiefly driver optimizations and SDK updates, with no hardware changes required.
The most common comparison point for the DGX Spark is Apple's Mac Studio and MacBook Pro with M4 Max or M5 Max chips, since both offer unified memory architectures suited to running large AI models locally.
| Feature | NVIDIA DGX Spark | Apple Mac Studio (M4 Max) | Apple MacBook Pro (M5 Max) |
|---|---|---|---|
| Architecture | Grace Blackwell (Arm CPU + Blackwell GPU) | Apple Silicon (unified CPU/GPU/NPU) | Apple Silicon (unified CPU/GPU/NPU) |
| CPU cores | 20 Arm cores | 16 cores (12P + 4E) | Up to 16 cores |
| GPU | Blackwell, 6,144 CUDA cores, 5th-gen Tensor Cores | 40-core Apple GPU | Up to 40-core Apple GPU |
| Unified memory | 128 GB LPDDR5X | Up to 128 GB | Up to 128 GB |
| Memory bandwidth | 273 GB/s | 546 GB/s | ~614 GB/s |
| AI performance (FP4) | 1 PFLOP (sparse) | Not directly comparable | Not directly comparable |
| Tensor Cores | Yes (dedicated) | No (Neural Engine + GPU) | No (Neural Engine + GPU) |
| CUDA support | Yes (native) | No | No |
| Max model size | 200B params (single), 405B (dual) | Limited by memory | Limited by memory |
| OS | Linux (DGX OS / Ubuntu) | macOS | macOS |
| Price | $3,999 | From $1,999 (128 GB config ~$3,999+) | From $3,499+ |
| Form factor | Desktop cube | Desktop | Laptop |
| General-purpose use | AI-focused only (Linux) | Full desktop computer | Full laptop computer |
The DGX Spark excels at dedicated AI workloads thanks to its Blackwell GPU with native CUDA support and fifth-generation Tensor Cores. NVIDIA demonstrated image generation workflows running 8x faster on the DGX Spark than on an M4 Max MacBook Pro. For standard AI development using PyTorch, TensorFlow, or JAX, the DGX Spark offers broader ecosystem compatibility since the vast majority of AI frameworks are optimized for CUDA first [10].
Apple Silicon machines, on the other hand, offer higher memory bandwidth (546 GB/s for M4 Max, approximately 614 GB/s for M5 Max, versus 273 GB/s for the DGX Spark). This bandwidth advantage benefits inference on quantized models, where the bottleneck is often memory throughput rather than raw compute. Apple's machines also function as general-purpose computers suitable for everyday productivity, while the DGX Spark is purpose-built for AI development [10][11].
For developers working within Apple's MLX framework or Core ML ecosystem, the Mac is the natural choice. For those using standard open-source AI stacks built around CUDA, the DGX Spark provides a more complete and performant solution [10].
A critical nuance in the DGX Spark vs. Apple Silicon comparison is understanding which workloads are bandwidth-bound versus compute-bound:
| Workload type | Bottleneck | Winner |
|---|---|---|
| Large model inference (autoregressive) | Memory bandwidth | Apple Silicon (higher bandwidth) |
| Small model inference (batched) | Compute throughput | DGX Spark (Tensor Cores) |
| Image generation (diffusion models) | Compute throughput | DGX Spark (8x faster demonstrated) |
| Model fine-tuning | Mixed (compute + bandwidth) | DGX Spark (Tensor Cores + CUDA) |
| Quantized model inference (GGUF) | Memory bandwidth | Apple Silicon (higher bandwidth) |
For single-user LLM inference with large quantized models, Apple Silicon's higher memory bandwidth can actually produce faster token generation. But for compute-intensive tasks like training, fine-tuning, and batch inference, the DGX Spark's Tensor Cores provide a decisive advantage.
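The bandwidth-bound versus compute-bound split can be framed as a simple roofline-style check: a workload is bandwidth-bound on a machine when its arithmetic intensity (FLOPs per byte moved) falls below that machine's balance point (peak FLOPS divided by memory bandwidth). The batch size of 512 below is a hypothetical value chosen to illustrate the crossover, not a benchmark result.

```python
# Roofline-style check (illustrative): a workload is bandwidth-bound when
# its arithmetic intensity (FLOPs per byte moved) is below the machine's
# balance point (peak FLOPS / memory bandwidth).
def is_bandwidth_bound(workload_flops_per_byte: float,
                       peak_flops: float,
                       bandwidth_bytes_per_sec: float) -> bool:
    machine_balance = peak_flops / bandwidth_bytes_per_sec
    return workload_flops_per_byte < machine_balance

# Batch-1 decoding reads every weight once and performs ~2 FLOPs per weight;
# at 4-bit quantization (0.5 bytes/param) that is ~4 FLOPs per byte.
decode_intensity = 2.0 / 0.5

# DGX Spark figures from the tables above: 500 TFLOPS dense FP4, 273 GB/s.
print(is_bandwidth_bound(decode_intensity, 500e12, 273e9))        # True
# Batching N requests amortizes the weight reads, multiplying intensity
# by N; at a hypothetical batch of 512 the workload turns compute-bound.
print(is_bandwidth_bound(decode_intensity * 512, 500e12, 273e9))  # False
```

The same check explains the table: single-stream decoding sits far below either machine's balance point (so raw bandwidth decides), while batched inference, diffusion models, and fine-tuning push intensity high enough that Tensor Core throughput dominates.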
AMD offers a competing platform for local AI development through its Ryzen AI Max+ 395 processor, which integrates a large unified memory pool with GPU compute on a single chip.
| Feature | NVIDIA DGX Spark | AMD Ryzen AI Max+ 395 System |
|---|---|---|
| GPU architecture | Blackwell (Tensor Cores) | RDNA 3.5 (no Tensor Cores) |
| Unified memory | 128 GB LPDDR5X | Up to 128 GB LPDDR5X |
| Memory bandwidth | 273 GB/s | ~256 GB/s |
| AI compute framework | CUDA (native) | ROCm |
| Ecosystem compatibility | Full CUDA stack | Limited ROCm support |
| OS support | Linux only (DGX OS) | Windows and Linux |
| Price | $3,999 | Varies by system (~$2,000-$3,000) |
Tom's Hardware benchmarks found the DGX Spark outperformed the Ryzen AI Max+ 395 across AI workloads, primarily due to the Tensor Cores and the maturity of the CUDA software stack. The AMD platform's advantage lies in its lower price point, broader OS support, and the ability to function as a general-purpose computer alongside AI development [6].
Alongside the DGX Spark rename at GTC 2025, NVIDIA also announced the DGX Station, a more powerful desktop AI workstation. While the DGX Spark targets individual developers and researchers, the DGX Station is aimed at teams and enterprises needing more compute. The DGX Station uses a full-sized Blackwell GPU (rather than the GB10's scaled-down version) and offers significantly more memory and compute throughput, at a correspondingly higher price point [4].
The GB10 Superchip represents a notable collaboration between NVIDIA and MediaTek, a Taiwanese semiconductor company best known for smartphone and IoT chipsets. MediaTek contributed its expertise in power-efficient CPU design, memory subsystem engineering, and high-speed interface design. The partnership allowed NVIDIA to produce a chip that delivers data-center-class AI capabilities within the thermal and power constraints of a desktop form factor [5].
This collaboration is significant because NVIDIA has historically designed its own CPUs (the Grace line) and GPUs independently. The MediaTek partnership for the GB10 suggests that NVIDIA sees value in leveraging external expertise for products targeting the edge and desktop markets, where power efficiency is more critical than in data centers [5].
The DGX Spark enables development workflows that were previously impractical on desktop hardware, such as prototyping, fine-tuning, and running models well beyond the roughly 30-billion-parameter ceiling of consumer GPUs.
The DGX Spark has been well received in the AI developer community, with reviewers praising its ability to run models that previously required cloud access or multi-GPU desktop setups. Tom's Hardware called the GB10 Superchip "fast and fun" and noted that it outperformed AMD's Ryzen AI Max+ 395 in AI workloads. ServeTheHome described the machine as "so freaking cool" in its review. The LMSYS Org published an in-depth review with detailed benchmarks, characterizing it as setting "a new standard for local AI inference" [6][12][14].
Criticisms have centered on the delayed launch (five months past the original May 2025 target), the $999 price increase from announcement to retail, and the fact that the device runs only Linux, which limits its appeal to users who also need macOS or Windows for other work. The 273 GB/s memory bandwidth, while adequate, is notably lower than what Apple Silicon offers, which can be a limiting factor for certain inference workloads [10].
The broader significance of the DGX Spark is its role in democratizing access to large-model AI development. Before its launch, running a 200-billion-parameter model locally required hardware costing tens of thousands of dollars. At $3,999, the DGX Spark makes this capability accessible to a much wider audience of researchers and developers, potentially accelerating the pace of AI innovation outside major corporate labs [8].
Reviewers and users have identified several limitations, chief among them the 273 GB/s memory bandwidth, the Linux-only operating system, and FP16 throughput comparable to a mid-range consumer GPU [10][12].
As of early 2026, the DGX Spark is shipping and has received its first major performance update (2.5x improvement at CES 2026). The device runs DGX OS based on Ubuntu 24.04 LTS and has moved to Linux kernel 6.17. NVIDIA continues to release SDK and driver updates that improve performance and expand model compatibility. The product occupies a unique position in the market as the only sub-$5,000 device capable of running 200-billion-parameter models locally with native CUDA support, making it a compelling option for serious AI developers who want to reduce their dependence on cloud computing [7][9].