NVIDIA Digits (originally announced as Project DIGITS, later renamed DGX Spark) is a personal AI supercomputer developed by NVIDIA and co-designed with MediaTek. Unveiled at CES 2025 in January, the compact desktop device is built around the NVIDIA GB10 Grace Blackwell Superchip, delivering up to 1 petaflop of AI performance at FP4 precision with 128 GB of unified memory. It is designed to let AI researchers, developers, data scientists, and students prototype, fine-tune, and run large language models with up to 200 billion parameters on a single desktop machine. Priced at $3,000 when first announced (later adjusted to $3,999 at retail launch), the device began shipping in October 2025 and has since received performance updates that boosted speeds up to 2.5 times over its launch configuration [1][2][3].
NVIDIA CEO Jensen Huang introduced Project DIGITS during his keynote address at CES on January 6, 2025. Huang positioned the device as a way to "put Grace Blackwell on every desk and at every AI developer's fingertips," bringing data-center-class AI compute to a form factor small enough to sit on a desktop [1].
The product was initially called Project DIGITS during its CES reveal. In March 2025, at NVIDIA's GTC conference, the company renamed it DGX Spark to align with the broader DGX product family, which includes the DGX Station (a higher-end desktop AI workstation) and DGX Cloud (NVIDIA's cloud AI platform). Despite the rename, the product is still referred to as Digits, Project DIGITS, and DGX Spark in developer communities [4].
The core of the DGX Spark is the GB10 Grace Blackwell Superchip, a custom system-on-chip co-designed by NVIDIA and MediaTek.
The GB10 is a multi-die package that combines two major silicon blocks, an NVIDIA Blackwell GPU die and a MediaTek-designed Arm CPU die, connected via NVIDIA's NVLink-C2C (chip-to-chip) interconnect over an interposer in a 2.5D integration approach.
Both CPU and GPU dies are fabricated on TSMC's 3nm process, making the GB10 technically the most advanced Blackwell product in terms of process node. The C2C interconnect between CPU and GPU provides approximately 600 GB/s of aggregate bandwidth, ensuring that the GPU can access system memory at high speed without being bottlenecked by a PCIe bus [12].
The GB10 features 128 GB of coherent unified system memory using LPDDR5X DRAM. Unlike discrete GPU systems where memory is split between the CPU (system DRAM) and GPU (HBM or GDDR), the GB10's unified architecture allows the entire 128 GB pool to be allocated to GPU workloads. The Blackwell GPU connects to system memory through memory controllers located in the MediaTek CPU die, with the C2C interconnect providing the high-bandwidth bridge.
This unified memory design is critical for running large AI models, which require substantial memory to hold their parameters during inference. A 70-billion-parameter model in FP16 precision requires approximately 140 GB of memory, which exceeds the capacity of any consumer-grade discrete GPU but fits within the DGX Spark's unified pool [1][6].
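The sizing rule behind these figures is simple: weight storage is the parameter count multiplied by the bytes per parameter at the chosen precision. The sketch below is a back-of-envelope estimate only; it ignores KV-cache and activation overhead, which push real-world usage higher.

```python
# Back-of-envelope estimate of memory needed to hold model weights.
# KV cache and activations add overhead on top, so real usage is higher.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "fp4": 0.5}

def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate weight storage in GB (decimal) for a given precision."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

# A 70B model in FP16 needs ~140 GB, more than the 128 GB unified pool.
print(weight_memory_gb(70e9, "fp16"))   # 140.0
# The same model quantized to FP4 needs ~35 GB and fits comfortably.
print(weight_memory_gb(70e9, "fp4"))    # 35.0
# A 200B model in FP4 (~100 GB) fits within the single-unit 128 GB pool.
print(weight_memory_gb(200e9, "fp4"))   # 100.0
```

This arithmetic also explains the headline model-size limits: the 200-billion-parameter single-unit figure assumes low-precision quantized weights, not FP16.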
| Component | Specification |
|---|---|
| Superchip | NVIDIA GB10 Grace Blackwell |
| GPU | Blackwell architecture, 6,144 CUDA cores, 5th-gen Tensor Cores |
| CPU | 20-core Arm (10x Cortex-X925, 10x Cortex-A725) |
| Process node | TSMC 3nm (both dies) |
| AI performance (sparse FP4) | Up to 1 PFLOP |
| AI performance (dense FP4) | ~500 TFLOPS |
| Unified memory | 128 GB LPDDR5X |
| Memory bandwidth | 273 GB/s |
| C2C interconnect bandwidth | ~600 GB/s |
| Storage | Up to 4 TB NVMe SSD |
| Networking | ConnectX-7 (for linking two units) |
| Operating system | DGX OS (based on Ubuntu 24.04 LTS) |
| Max model size (single unit) | 200 billion parameters |
| Max model size (two linked units) | 405 billion parameters |
| Form factor | Compact desktop cube |
| SoC TDP | 140W |
| System power consumption | ~200–250 W (estimated) |
| Price (at launch) | $3,999 USD |
The headline 1 PFLOP figure applies specifically to sparse NVFP4 workloads that exploit structured sparsity. Without sparsity, peak NVFP4 compute is approximately 500 TFLOPS; for FP16 workloads, performance is considerably lower, comparable to a mid-range consumer GPU [12].
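The dense FLOPS figure matters most for compute-bound phases such as prompt processing (prefill). As a rough sketch, a transformer forward pass costs about 2 FLOPs per parameter per token; the 30% sustained-efficiency factor below is an assumption for illustration, not a measured property of the DGX Spark.

```python
# Back-of-envelope prefill time: a forward pass costs roughly 2 FLOPs per
# parameter per token. The 30% sustained-efficiency factor is an assumed,
# illustrative value, not a measurement.
def prefill_seconds(num_params: float, prompt_tokens: int,
                    peak_flops: float, efficiency: float = 0.3) -> float:
    work = 2.0 * num_params * prompt_tokens   # total FLOPs for the prompt
    return work / (peak_flops * efficiency)

# 70B-parameter model, 4,096-token prompt, against the 500 TFLOPS dense
# NVFP4 peak: a few seconds of compute-bound work under these assumptions.
print(round(prefill_seconds(70e9, 4096, 500e12), 2))
```

Under the same assumptions, halving the usable FLOPS (e.g., running without low-precision Tensor Core paths) doubles the prefill time, which is why the precision mode matters so much in practice.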
Using the built-in ConnectX-7 networking, two DGX Spark units can be linked to function as a single system with 256 GB of unified memory. This configuration enables inference on models with up to 405 billion parameters, such as Meta's Llama 3.1 405B [1].
The DGX Spark ships with DGX OS, a lightly customized version of Ubuntu 24.04 LTS developed by NVIDIA. The operating system comes preinstalled with NVIDIA's full AI software stack.
The following components are ready to use from first boot:
| Component | Purpose |
|---|---|
| CUDA Toolkit | GPU programming and compilation |
| cuDNN | Deep learning primitives (convolutions, attention, normalization) |
| TensorRT | Inference optimization and engine building |
| TensorRT-LLM | LLM-specific inference optimization |
| PyTorch | Deep learning framework |
| TensorFlow | Deep learning framework |
| JupyterLab | Interactive development environment |
| NVIDIA AI Workbench | Model development and deployment tool |
| NVIDIA Container Runtime | Docker container GPU access |
| vLLM | Optimized LLM inference serving |
| NVIDIA NIM microservices | Pre-packaged AI model deployment |
The system runs Linux natively. NVIDIA Sync integration allows users to launch local IDEs like VS Code, Cursor, and AI Workbench directly from a web UI, even when connecting remotely [6][7].
The NVIDIA Container Runtime comes preinstalled and configured, enabling Docker containers to access GPU resources transparently. Developers can immediately pull and run GPU-accelerated containers from NVIDIA GPU Cloud (NGC) without additional setup. NGC provides a comprehensive registry of GPU-optimized containers, pre-trained models, and AI/ML software specifically designed for the Grace Blackwell architecture.
Docker Model Runner integration allows developers to pull and run AI models as easily as pulling container images, simplifying the workflow for experimenting with different models [13].
The DGX Spark includes an integrated web-based dashboard for monitoring system utilization, managing JupyterLab sessions, and configuring system settings. The dashboard provides real-time visibility into GPU utilization, memory consumption, and thermal status without requiring SSH access, making it accessible to developers who are not Linux power users.
A key part of NVIDIA's vision for the DGX Spark is its integration with DGX Cloud, NVIDIA's cloud-based AI development platform. The intended workflow is that developers prototype and experiment locally on their DGX Spark, then seamlessly deploy trained models to DGX Cloud or data center infrastructure for production-scale inference or further training. This local-to-cloud pipeline is designed to reduce development friction and cost, since prototyping on local hardware avoids the per-hour costs of cloud GPU instances [1].
The DGX Spark has been tested across a variety of AI workloads by independent reviewers and NVIDIA.
The DGX Spark can load and run very large models including gpt-oss-120B and Llama 3.1 70B. According to the LMSYS Org in-depth review, the system truly shines when serving smaller models (7B-13B parameter range), especially when batching is utilized. For larger models (70B+), the 273 GB/s memory bandwidth becomes the limiting factor for token generation speed, making the device best suited for prototyping and experimentation rather than production serving at scale [12].
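The bandwidth ceiling LMSYS describes follows from a common rule of thumb: at batch size 1, every generated token must stream the full weight set from memory, so token rate is bounded by bandwidth divided by weight size. A hedged sketch using the 273 GB/s figure (real systems land below this ceiling):

```python
# Rule-of-thumb decode ceiling at batch size 1: each generated token streams
# the full weight set from memory, so generation speed is capped at
# (memory bandwidth) / (bytes of weights). Numbers are illustrative upper
# bounds; real systems land below them.
def decode_tokens_per_sec(num_params: float, bytes_per_param: float,
                          bandwidth_gbs: float) -> float:
    weight_gb = num_params * bytes_per_param / 1e9
    return bandwidth_gbs / weight_gb

# 70B model quantized to ~0.5 bytes/param (4-bit) on 273 GB/s:
print(round(decode_tokens_per_sec(70e9, 0.5, 273), 1))   # 7.8
# A 13B model at the same quantization has a much higher ceiling:
print(round(decode_tokens_per_sec(13e9, 0.5, 273), 1))   # 42.0
```

This is why the review finds 7B–13B models the sweet spot for serving: their ceilings are high enough that the Tensor Cores, rather than the memory bus, become the useful resource.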
NVIDIA also demonstrated several performance comparisons at launch, such as image generation workflows running 8x faster than on an M4 Max MacBook Pro [10].
NVIDIA has positioned the DGX Spark for several user groups, including AI researchers, developers, data scientists, and students.
The device fills a gap between consumer-grade hardware (gaming GPUs with 16-24 GB of VRAM) and data center systems (costing tens of thousands of dollars). Before the DGX Spark, developers who needed to run models larger than roughly 30 billion parameters locally had few affordable options [8].
When first announced at CES 2025 in January, NVIDIA set the price at $3,000 with availability targeted for May 2025. By the time the product launched under the DGX Spark name, the retail price had been adjusted to $3,999. The device became available to the public on October 15, 2025, later than the original May target [3][5].
| Event | Date | Price |
|---|---|---|
| CES 2025 announcement (as Project DIGITS) | January 6, 2025 | $3,000 (announced) |
| GTC 2025 rename to DGX Spark | March 2025 | - |
| Public availability | October 15, 2025 | $3,999 |
| CES 2026 performance update (2.5x boost) | January 2026 | $3,999 |
The $999 price increase between announcement and launch generated some discussion in the developer community, though NVIDIA has not publicly explained the adjustment. The DGX Spark is available directly from NVIDIA and through authorized partners [3][9].
At CES 2026 in January, NVIDIA announced software and firmware updates that delivered up to 2.5x performance improvement over the DGX Spark's launch configuration. These gains came from a combination of driver optimizations, SDK updates released in November 2025, and additional improvements announced at the event. The updates were available to all existing DGX Spark owners at no additional cost [9].
The performance improvements came from software and firmware alone, chiefly driver optimizations and SDK updates, with no hardware changes required.
The most common comparison point for the DGX Spark is Apple's Mac Studio and MacBook Pro with M4 Max or M5 Max chips, since both offer unified memory architectures suited to running large AI models locally.
| Feature | NVIDIA DGX Spark | Apple Mac Studio (M4 Max) | Apple MacBook Pro (M5 Max) |
|---|---|---|---|
| Architecture | Grace Blackwell (Arm CPU + Blackwell GPU) | Apple Silicon (unified CPU/GPU/NPU) | Apple Silicon (unified CPU/GPU/NPU) |
| CPU cores | 20 Arm cores | 16 cores (12P + 4E) | Up to 16 cores |
| GPU | Blackwell, 6,144 CUDA cores, 5th-gen Tensor Cores | 40-core Apple GPU | Up to 40-core Apple GPU |
| Unified memory | 128 GB LPDDR5X | Up to 128 GB | Up to 128 GB |
| Memory bandwidth | 273 GB/s | 546 GB/s | ~614 GB/s |
| AI performance (FP4) | 1 PFLOP (sparse) | Not directly comparable | Not directly comparable |
| Tensor Cores | Yes (dedicated) | No (Neural Engine + GPU) | No (Neural Engine + GPU) |
| CUDA support | Yes (native) | No | No |
| Max model size | 200B params (single), 405B (dual) | Limited by memory | Limited by memory |
| OS | Linux (DGX OS / Ubuntu) | macOS | macOS |
| Price | $3,999 | From $1,999 (128 GB config ~$3,999+) | From $3,499+ |
| Form factor | Desktop cube | Desktop | Laptop |
| General-purpose use | AI-focused only (Linux) | Full desktop computer | Full laptop computer |
The DGX Spark excels at dedicated AI workloads thanks to its Blackwell GPU with native CUDA support and fifth-generation Tensor Cores. NVIDIA demonstrated image generation workflows running 8x faster on the DGX Spark than on an M4 Max MacBook Pro. For standard AI development using PyTorch, TensorFlow, or JAX, the DGX Spark offers broader ecosystem compatibility since the vast majority of AI frameworks are optimized for CUDA first [10].
Apple Silicon machines, on the other hand, offer higher memory bandwidth (546 GB/s for M4 Max, approximately 614 GB/s for M5 Max, versus 273 GB/s for the DGX Spark). This bandwidth advantage benefits inference on quantized models, where the bottleneck is often memory throughput rather than raw compute. Apple's machines also function as general-purpose computers suitable for everyday productivity, while the DGX Spark is purpose-built for AI development [10][11].
For developers working within Apple's MLX framework or Core ML ecosystem, the Mac is the natural choice. For those using standard open-source AI stacks built around CUDA, the DGX Spark provides a more complete and performant solution [10].
A critical nuance in the DGX Spark vs. Apple Silicon comparison is understanding which workloads are bandwidth-bound versus compute-bound:
| Workload type | Bottleneck | Winner |
|---|---|---|
| Large model inference (autoregressive) | Memory bandwidth | Apple Silicon (higher bandwidth) |
| Small model inference (batched) | Compute throughput | DGX Spark (Tensor Cores) |
| Image generation (diffusion models) | Compute throughput | DGX Spark (8x faster demonstrated) |
| Model fine-tuning | Mixed (compute + bandwidth) | DGX Spark (Tensor Cores + CUDA) |
| Quantized model inference (GGUF) | Memory bandwidth | Apple Silicon (higher bandwidth) |
For single-user LLM inference with large quantized models, Apple Silicon's higher memory bandwidth can actually produce faster token generation. But for compute-intensive tasks like training, fine-tuning, and batch inference, the DGX Spark's Tensor Cores provide a decisive advantage.
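The bandwidth-bound versus compute-bound split can be framed as a simple roofline-style check: a workload is bandwidth-bound on a machine when its arithmetic intensity (FLOPs per byte moved) falls below that machine's balance point (peak FLOPS divided by memory bandwidth). The batch size of 512 below is a hypothetical value chosen to illustrate the crossover, not a benchmark result.

```python
# Roofline-style check (illustrative): a workload is bandwidth-bound when
# its arithmetic intensity (FLOPs per byte moved) is below the machine's
# balance point (peak FLOPS / memory bandwidth).
def is_bandwidth_bound(workload_flops_per_byte: float,
                       peak_flops: float,
                       bandwidth_bytes_per_sec: float) -> bool:
    machine_balance = peak_flops / bandwidth_bytes_per_sec
    return workload_flops_per_byte < machine_balance

# Batch-1 decoding reads every weight once and performs ~2 FLOPs per weight;
# at 4-bit quantization (0.5 bytes/param) that is ~4 FLOPs per byte.
decode_intensity = 2.0 / 0.5

# DGX Spark figures from the tables above: 500 TFLOPS dense FP4, 273 GB/s.
print(is_bandwidth_bound(decode_intensity, 500e12, 273e9))        # True
# Batching N requests amortizes the weight reads, multiplying intensity
# by N; at a hypothetical batch of 512 the workload turns compute-bound.
print(is_bandwidth_bound(decode_intensity * 512, 500e12, 273e9))  # False
```

The same check explains the table: single-stream decoding sits far below either machine's balance point (so raw bandwidth decides), while batched inference, diffusion models, and fine-tuning push intensity high enough that Tensor Core throughput dominates.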
AMD offers a competing platform for local AI development through its Ryzen AI Max+ 395 processor, which integrates a large unified memory pool with GPU compute on a single chip.
| Feature | NVIDIA DGX Spark | AMD Ryzen AI Max+ 395 System |
|---|---|---|
| GPU architecture | Blackwell (Tensor Cores) | RDNA 3.5 (no Tensor Cores) |
| Unified memory | 128 GB LPDDR5X | Up to 128 GB LPDDR5X |
| Memory bandwidth | 273 GB/s | ~256 GB/s |
| AI compute framework | CUDA (native) | ROCm |
| Ecosystem compatibility | Full CUDA stack | Limited ROCm support |
| OS support | Linux only (DGX OS) | Windows and Linux |
| Price | $3,999 | Varies by system (~$2,000-$3,000) |
Tom's Hardware benchmarks found the DGX Spark outperformed the Ryzen AI Max+ 395 across AI workloads, primarily due to the Tensor Cores and the maturity of the CUDA software stack. The AMD platform's advantage lies in its lower price point, broader OS support, and the ability to function as a general-purpose computer alongside AI development [6].
Alongside the DGX Spark rename at GTC 2025, NVIDIA also announced the DGX Station, a more powerful desktop AI workstation. While the DGX Spark targets individual developers and researchers, the DGX Station is aimed at teams and enterprises needing more compute. The DGX Station uses a full-sized Blackwell GPU (rather than the GB10's scaled-down version) and offers significantly more memory and compute throughput, at a correspondingly higher price point [4].
The GB10 Superchip represents a notable collaboration between NVIDIA and MediaTek, a Taiwanese semiconductor company best known for smartphone and IoT chipsets. MediaTek contributed its expertise in power-efficient CPU design, memory subsystem engineering, and high-speed interface design. The partnership allowed NVIDIA to produce a chip that delivers data-center-class AI capabilities within the thermal and power constraints of a desktop form factor [5].
This collaboration is significant because NVIDIA has historically designed its own CPUs (the Grace line) and GPUs independently. The MediaTek partnership for the GB10 suggests that NVIDIA sees value in leveraging external expertise for products targeting the edge and desktop markets, where power efficiency is more critical than in data centers [5].
The DGX Spark enables development workflows that were previously impractical on desktop hardware, such as prototyping, fine-tuning, and running models well beyond the roughly 30-billion-parameter ceiling of consumer GPUs.
The DGX Spark has been well received in the AI developer community, with reviewers praising its ability to run models that previously required cloud access or multi-GPU desktop setups. Tom's Hardware called the GB10 Superchip "fast and fun" and noted that it outperformed AMD's Ryzen AI Max+ 395 in AI workloads. ServeTheHome described the machine as "so freaking cool" in its review. The LMSYS Org published an in-depth review with detailed benchmarks, characterizing it as setting "a new standard for local AI inference" [6][12][14].
Criticisms have centered on the delayed launch (five months past the original May 2025 target), the $999 price increase from announcement to retail, and the fact that the device runs only Linux, which limits its appeal to users who also need macOS or Windows for other work. The 273 GB/s memory bandwidth, while adequate, is notably lower than what Apple Silicon offers, which can be a limiting factor for certain inference workloads [10].
The broader significance of the DGX Spark is its role in democratizing access to large-model AI development. Before its launch, running a 200-billion-parameter model locally required hardware costing tens of thousands of dollars. At $3,999, the DGX Spark makes this capability accessible to a much wider audience of researchers and developers, potentially accelerating the pace of AI innovation outside major corporate labs [8].
Reviewers and users have identified several limitations, chief among them the 273 GB/s memory bandwidth, the Linux-only operating system, and FP16 throughput comparable to a mid-range consumer GPU [10][12].
As of early 2026, the DGX Spark is shipping and has received its first major performance update (2.5x improvement at CES 2026). The device runs DGX OS based on Ubuntu 24.04 LTS and has moved to Linux kernel 6.17. NVIDIA continues to release SDK and driver updates that improve performance and expand model compatibility. The product occupies a unique position in the market as the only sub-$5,000 device capable of running 200-billion-parameter models locally with native CUDA support, making it a compelling option for serious AI developers who want to reduce their dependence on cloud computing [7][9].