# NVIDIA Digits

> Source: https://aiwiki.ai/wiki/nvidia_digits
> Updated: 2026-04-26
> Categories: AI Hardware, Artificial Intelligence
> From AI Wiki (https://aiwiki.ai), a free encyclopedia of artificial intelligence. Quote with attribution.

**NVIDIA Digits** (originally announced as **Project DIGITS**, later renamed **DGX Spark**) is a personal AI supercomputer developed by [NVIDIA](/wiki/nvidia) and co-designed with [MediaTek](/wiki/mediatek). Unveiled at [CES](/wiki/consumer_electronics_show) 2025 in January, the compact desktop device is built around the NVIDIA GB10 Grace Blackwell Superchip, delivering up to 1 petaflop of AI performance at FP4 precision with 128 GB of unified memory. It is designed to let AI researchers, developers, data scientists, and students prototype, fine-tune, and run [large language models](/wiki/large_language_model) with up to 200 billion parameters on a single desk-sized machine. Priced at $3,000 when first announced (later adjusted to $3,999 at retail launch), the device began shipping in October 2025 and has since received performance updates that boosted speeds up to 2.5 times over its launch configuration [1][2][3].

## Announcement and Naming

NVIDIA CEO [Jensen Huang](/wiki/jensen_huang) introduced Project DIGITS during his keynote address at CES on January 6, 2025. Huang positioned the device as a way to "put Grace Blackwell on every desk and at every AI developer's fingertips," bringing data-center-class AI compute to a form factor small enough to sit on a desktop [1].

The product was called **Project DIGITS** at its CES reveal. In March 2025, at NVIDIA's GTC conference, the company renamed it **DGX Spark** to align it with the broader DGX product family, which includes the DGX Station (a higher-end desktop AI workstation) and DGX Cloud (NVIDIA's cloud AI platform). Despite the rename, developer communities still refer to the product by all three names: NVIDIA Digits, Project DIGITS, and DGX Spark [4].

## Hardware Specifications

The core of the DGX Spark is the GB10 Grace Blackwell Superchip, a custom system-on-chip co-designed by NVIDIA and MediaTek.

### GB10 Superchip Architecture

The GB10 is a multi-die package that combines two major silicon blocks connected via NVIDIA's NVLink-C2C (chip-to-chip) interconnect over an interposer in a 2.5D integration approach:

- **Blackwell GPU**: Features 6,144 Blackwell CUDA cores and fifth-generation [Tensor Cores](/wiki/tensor_core), optimized for AI training and [inference](/wiki/inference). The GPU is a scaled-down member of the same Blackwell architecture used in NVIDIA's data center B100 and B200 accelerators. In terms of raw AI compute, the GPU's capability falls roughly between an RTX 5070 and RTX 5070 Ti.
- **Grace CPU**: A 20-core processor built on the [Arm](/wiki/arm_architecture) architecture, with 10 high-performance Cortex-X925 cores and 10 power-efficient Cortex-A725 cores. MediaTek contributed its expertise in power-efficient CPU design, memory subsystems, and high-speed interfaces to this component [5].

Both CPU and GPU dies are fabricated on TSMC's 3nm process, making the GB10 technically the most advanced Blackwell product in terms of process node. The C2C interconnect between CPU and GPU provides approximately 600 GB/s of aggregate bandwidth, ensuring that the GPU can access system memory at high speed without being bottlenecked by a PCIe bus [12].

### Memory Architecture

The GB10 features 128 GB of coherent unified system memory using LPDDR5X DRAM. Unlike discrete GPU systems where memory is split between the CPU (system DRAM) and GPU (HBM or GDDR), the GB10's unified architecture allows the entire 128 GB pool to be allocated to GPU workloads. The Blackwell GPU connects to system memory through memory controllers located in the MediaTek CPU die, with the C2C interconnect providing the high-bandwidth bridge.

This unified memory design matters for large AI models, whose parameters must stay resident in memory during inference. A 70-billion-parameter model in FP16 precision needs approximately 140 GB for its weights alone, which exceeds both the VRAM of consumer discrete GPUs and the DGX Spark's 128 GB pool. Quantized to 8-bit (about 70 GB) or 4-bit (about 35 GB), the same model fits in unified memory with room left for the KV cache and activations [1][6].
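
The relationship between parameter count, precision, and weight memory can be sketched with a few lines of arithmetic. This is a rough estimate under stated assumptions, not a sizing tool: it counts weights only (no KV cache, activations, or operating system overhead), uses decimal gigabytes, and the function and constant names are illustrative.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "fp4": 0.5}

UNIFIED_MEMORY_GB = 128  # DGX Spark unified pool, from the specification table


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


for params in (70e9, 200e9):
    for precision in ("fp16", "fp8", "fp4"):
        need = weight_memory_gb(params, precision)
        # Weights-only check; real workloads need extra headroom.
        verdict = "fits" if need < UNIFIED_MEMORY_GB else "does not fit"
        print(f"{params / 1e9:.0f}B @ {precision}: {need:.0f} GB -> {verdict}")
```

Under these assumptions, the 200-billion-parameter ceiling quoted for a single unit corresponds to 4-bit weights (about 100 GB), which leaves some of the 128 GB pool free for runtime state.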

### Full Specifications

| Component | Specification |
|---|---|
| Superchip | NVIDIA GB10 Grace Blackwell |
| GPU | Blackwell architecture, 6,144 CUDA cores, 5th-gen Tensor Cores |
| CPU | 20-core Arm (10x Cortex-X925, 10x Cortex-A725) |
| Process node | TSMC 3nm (both dies) |
| AI performance (sparse FP4) | Up to 1 PFLOP |
| AI performance (dense FP4) | ~500 TFLOPS |
| Unified memory | 128 GB LPDDR5x |
| Memory bandwidth | 273 GB/s |
| C2C interconnect bandwidth | ~600 GB/s |
| Storage | Up to 4 TB NVMe SSD |
| Networking | ConnectX-7 (for linking two units) |
| Operating system | DGX OS (based on Ubuntu 24.04 LTS) |
| Max model size (single unit) | 200 billion parameters |
| Max model size (two linked units) | 405 billion parameters |
| Form factor | Compact desktop cube |
| SoC TDP | 140W |
| System power consumption | ~200-250W estimated |
| Price (at launch) | $3,999 USD |

The headline 1 PFLOP figure applies only to sparse NVFP4 workloads that exploit structured sparsity. Without sparsity, peak NVFP4 compute is approximately 500 TFLOPS, and FP16 throughput is considerably lower, comparable to a mid-range consumer GPU [12].

### Linking Two Units

Using the built-in ConnectX-7 networking, two DGX Spark units can be linked to work as a single system with 256 GB of combined memory. This configuration enables inference on models of up to 405 billion parameters, such as [Meta](/wiki/meta)'s [Llama](/wiki/llama) 3.1 405B, whose weights occupy roughly 203 GB at 4-bit precision [1].

## Software and Operating System

The DGX Spark ships with **DGX OS**, a lightly customized version of Ubuntu 24.04 LTS developed by NVIDIA. The operating system comes preinstalled with NVIDIA's full AI software stack.

### Preinstalled Software Stack

The following components are ready to use from first boot:

| Component | Purpose |
|---|---|
| CUDA Toolkit | GPU programming and compilation |
| cuDNN | Deep learning primitives (convolutions, attention, normalization) |
| [TensorRT](/wiki/tensorrt) | Inference optimization and engine building |
| TensorRT-LLM | LLM-specific inference optimization |
| [PyTorch](/wiki/pytorch) | Deep learning framework |
| [TensorFlow](/wiki/tensorflow) | Deep learning framework |
| JupyterLab | Interactive development environment |
| NVIDIA AI Workbench | Model development and deployment tool |
| NVIDIA Container Runtime | Docker container GPU access |
| [vLLM](/wiki/vllm) | Optimized LLM inference serving |
| NVIDIA NIM microservices | Pre-packaged AI model deployment |

The system runs Linux natively. NVIDIA Sync integration allows users to launch local IDEs like VS Code, [Cursor](/wiki/cursor), and AI Workbench directly from a web UI, even when connecting remotely [6][7].
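
A quick way to confirm that a preinstalled framework can see the GPU is to query it from Python. The sketch below uses standard PyTorch calls (`torch.cuda.is_available` and `torch.cuda.get_device_properties`); the helper name `describe_cuda_device` is illustrative, and the values reported on a DGX Spark are not shown here because they depend on the installed driver and framework versions.

```python
def describe_cuda_device() -> dict:
    """Report whether PyTorch can see a CUDA device, without assuming one exists."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return {"torch_installed": False, "cuda_available": False}

    info = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        props = torch.cuda.get_device_properties(0)
        info["device_name"] = props.name
        # total_memory is reported in bytes; convert to decimal gigabytes.
        info["total_memory_gb"] = round(props.total_memory / 1e9, 1)
    return info


print(describe_cuda_device())
```

The function degrades gracefully on machines without PyTorch or without a GPU, so the same script can run on a laptop and on the target device.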

### Container Support

The NVIDIA Container Runtime comes preinstalled and configured, enabling Docker containers to access GPU resources transparently. Developers can immediately pull and run GPU-accelerated containers from NVIDIA GPU Cloud (NGC) without additional setup. NGC provides a comprehensive registry of GPU-optimized containers, pre-trained models, and AI/ML software specifically designed for the Grace Blackwell architecture.

Docker Model Runner integration allows developers to pull and run AI models as easily as pulling container images, simplifying the workflow for experimenting with different models [13].

### DGX Dashboard

The DGX Spark includes an integrated web-based dashboard for monitoring system utilization, managing JupyterLab sessions, and configuring system settings. The dashboard provides real-time visibility into GPU utilization, memory consumption, and thermal status without requiring SSH access, making it accessible to developers who are not Linux power users.
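
For scripted monitoring, the same kind of metrics can be read from the `nvidia-smi` command-line tool that ships with NVIDIA drivers. The following is a minimal sketch, not a description of how the dashboard itself works: the helper names are illustrative, and it assumes that some fields may be reported as `N/A` on a unified-memory system, which the parser maps to `None`.

```python
import shutil
import subprocess

# Standard nvidia-smi query fields: GPU utilization (%), memory used (MiB), temperature (C).
QUERY_FIELDS = ["utilization.gpu", "memory.used", "temperature.gpu"]


def parse_smi_line(line: str) -> dict:
    """Parse one CSV row from nvidia-smi; non-numeric values such as N/A become None."""
    values = [v.strip() for v in line.split(",")]
    parsed = {}
    for field, value in zip(QUERY_FIELDS, values):
        try:
            parsed[field] = float(value)
        except ValueError:
            parsed[field] = None
    return parsed


def read_gpu_status():
    """Return parsed metrics for the first GPU, or None if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi",
         f"--query-gpu={','.join(QUERY_FIELDS)}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0 or not result.stdout.strip():
        return None
    return parse_smi_line(result.stdout.splitlines()[0])


print(read_gpu_status())
```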

### DGX Cloud Connection

A key part of NVIDIA's vision for the DGX Spark is its integration with **DGX Cloud**, NVIDIA's cloud-based AI development platform. The intended workflow is that developers prototype and experiment locally on their DGX Spark, then seamlessly deploy trained models to DGX Cloud or data center infrastructure for production-scale inference or further training. This local-to-cloud pipeline is designed to reduce development friction and cost, since prototyping on local hardware avoids the per-hour costs of cloud GPU instances [1].
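
The cost argument can be illustrated with a simple break-even estimate. This is a back-of-the-envelope sketch: the hourly rates below are hypothetical placeholders rather than quotes from any provider, and the calculation ignores electricity, depreciation, and performance differences between local and cloud hardware.

```python
DEVICE_PRICE_USD = 3999.0  # retail launch price cited in the article


def break_even_hours(device_price_usd: float, cloud_rate_usd_per_hour: float) -> float:
    """Hours of cloud GPU rental that cost as much as buying the device outright."""
    if cloud_rate_usd_per_hour <= 0:
        raise ValueError("cloud rate must be positive")
    return device_price_usd / cloud_rate_usd_per_hour


# Hypothetical hourly rates, chosen only to show how the estimate scales.
for rate in (1.00, 2.50, 5.00):
    hours = break_even_hours(DEVICE_PRICE_USD, rate)
    print(f"${rate:.2f}/h -> break-even after {hours:,.0f} GPU-hours")
```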

## Benchmark Performance

The DGX Spark has been tested across a variety of AI workloads by independent reviewers and NVIDIA.

### LLM Inference Performance

The DGX Spark can load and run very large models, including gpt-oss-120B and Llama 3.1 70B. According to the LMSYS Org in-depth review, the system performs best when serving smaller models (roughly 7B to 13B parameters), especially with batching. For larger models (70B and above), the 273 GB/s memory bandwidth becomes the limiting factor for token generation speed, which makes the device better suited to prototyping and experimentation than to production serving at scale [12].
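
Why bandwidth caps generation speed can be seen from a simple upper bound: during single-stream decoding, every weight must be read from memory once per generated token, so tokens per second cannot exceed bandwidth divided by the size of the weights. The sketch below applies that bound under simplifying assumptions that are not taken from the review: a dense model, no KV-cache traffic, and perfect bandwidth utilization. Measured speeds will be lower, and mixture-of-experts models read only their active experts per token, so they can exceed the dense estimate.

```python
MEMORY_BANDWIDTH_GB_S = 273.0  # DGX Spark memory bandwidth, from the specification table


def decode_ceiling_tokens_per_s(num_params: float, bits_per_param: float,
                                bandwidth_gb_s: float = MEMORY_BANDWIDTH_GB_S) -> float:
    """Upper bound on single-stream decode speed for a dense model."""
    weight_bytes = num_params * bits_per_param / 8
    return bandwidth_gb_s * 1e9 / weight_bytes


for label, params, bits in (("8B @ 4-bit", 8e9, 4),
                            ("70B @ 4-bit", 70e9, 4),
                            ("70B @ 8-bit", 70e9, 8)):
    ceiling = decode_ceiling_tokens_per_s(params, bits)
    print(f"{label}: at most ~{ceiling:.1f} tokens/s")
```

Batching raises aggregate throughput because one pass over the weights serves several sequences at once, which is consistent with the observation that smaller models with batching make the best use of the hardware.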

### Comparison with Competing Platforms

NVIDIA demonstrated several performance comparisons at launch:

- **Image generation**: 8x faster than an M4 Max MacBook Pro in [Stable Diffusion](/wiki/stable_diffusion) workloads
- **AI workloads vs AMD**: Outperformed AMD's Ryzen AI Max+ 395 across AI benchmarks (per Tom's Hardware testing)
- **Post-update performance**: After the CES 2026 software update, workloads run up to 2.5x faster than the original launch configuration

## Target Audience

NVIDIA has positioned the DGX Spark for several user groups:

- **AI researchers**: For prototyping new model architectures, running experiments, and fine-tuning models without needing data center access.
- **Software developers**: For building and testing AI-powered applications locally before deploying to production.
- **Data scientists**: For working with large datasets and running inference on models too big for consumer GPUs.
- **Students and educators**: For learning AI development on professional-grade hardware at a price point accessible to universities and well-funded individuals.

The device fills a gap between consumer-grade hardware (gaming GPUs with 16-24 GB of VRAM) and data center systems (costing tens of thousands of dollars). Before the DGX Spark, developers who needed to run models larger than roughly 30 billion parameters locally had few affordable options [8].

## Pricing and Availability

When first announced at CES 2025 in January, NVIDIA set the price at $3,000 with availability targeted for May 2025. By the time the product launched under the DGX Spark name, the retail price had been adjusted to **$3,999**. The device became available to the public on **October 15, 2025**, later than the original May target [3][5].

| Event | Date | Price |
|---|---|---|
| CES 2025 announcement (as Project DIGITS) | January 6, 2025 | $3,000 (announced) |
| GTC 2025 rename to DGX Spark | March 2025 | - |
| Public availability | October 15, 2025 | $3,999 |
| CES 2026 performance update (2.5x boost) | January 2026 | $3,999 |

The $999 price increase between announcement and launch generated some discussion in the developer community, though NVIDIA has not publicly explained the adjustment. The DGX Spark is available directly from NVIDIA and through authorized partners [3][9].

## Performance Updates

At CES 2026 in January, NVIDIA announced software and firmware updates that delivered up to **2.5x performance improvement** over the DGX Spark's launch configuration. These gains came from a combination of driver optimizations, SDK updates released in November 2025, and additional improvements announced at the event. The updates were available to all existing DGX Spark owners at no additional cost [9].

The improvements came primarily from:

- Optimized TensorRT engines for the Blackwell iGPU
- Improved CUDA driver scheduling for the unified memory architecture
- Better memory management for large model inference
- Updated cuDNN kernels tuned for the GB10's specific configuration

## Comparison with Apple Silicon for AI

The most common comparison point for the DGX Spark is Apple's Mac Studio and MacBook Pro with M4 Max or M5 Max chips, since both offer unified memory architectures suited to running large AI models locally.

| Feature | NVIDIA DGX Spark | Apple Mac Studio (M4 Max) | Apple MacBook Pro (M5 Max) |
|---|---|---|---|
| Architecture | Grace Blackwell (Arm CPU + Blackwell GPU) | Apple Silicon (unified CPU/GPU/NPU) | Apple Silicon (unified CPU/GPU/NPU) |
| CPU cores | 20 Arm cores | 16 cores (12P + 4E) | Up to 16 cores |
| GPU | Blackwell, 6,144 CUDA cores, 5th-gen Tensor Cores | 40-core Apple GPU | Up to 40-core Apple GPU |
| Unified memory | 128 GB LPDDR5x | Up to 128 GB | Up to 128 GB |
| Memory bandwidth | 273 GB/s | 546 GB/s | ~614 GB/s (M5 Max) |
| AI performance (FP4) | 1 PFLOP (sparse) | Not directly comparable | Not directly comparable |
| Tensor Cores | Yes (dedicated) | No (Neural Engine + GPU) | No (Neural Engine + GPU) |
| CUDA support | Yes (native) | No | No |
| Max model size | 200B params (single), 405B (dual) | Limited by memory | Limited by memory |
| OS | Linux (DGX OS / Ubuntu) | macOS | macOS |
| Price | $3,999 | From $1,999 (128 GB config ~$3,999+) | From $3,499+ |
| Form factor | Desktop cube | Desktop | Laptop |
| General-purpose use | AI-focused only (Linux) | Full desktop computer | Full laptop computer |

### Key Trade-offs

The DGX Spark excels at dedicated AI workloads thanks to its Blackwell GPU with native [CUDA](/wiki/cuda) support and fifth-generation Tensor Cores. NVIDIA demonstrated image generation workflows running 8x faster on the DGX Spark than on an M4 Max MacBook Pro. For standard AI development using [PyTorch](/wiki/pytorch), TensorFlow, or [JAX](/wiki/jax), the DGX Spark offers broader ecosystem compatibility since the vast majority of AI frameworks are optimized for CUDA first [10].

Apple Silicon machines, on the other hand, offer higher memory bandwidth (546 GB/s for M4 Max, approximately 614 GB/s for M5 Max, versus 273 GB/s for the DGX Spark). This bandwidth advantage benefits inference on quantized models, where the bottleneck is often memory throughput rather than raw compute. Apple's machines also function as general-purpose computers suitable for everyday productivity, while the DGX Spark is purpose-built for AI development [10][11].

For developers working within Apple's [MLX](/wiki/mlx) framework or Core ML ecosystem, the Mac is the natural choice. For those using standard open-source AI stacks built around CUDA, the DGX Spark provides a more complete and performant solution [10].

### Bandwidth vs. Compute Analysis

A critical nuance in the DGX Spark vs. Apple Silicon comparison is understanding which workloads are bandwidth-bound versus compute-bound:

| Workload type | Bottleneck | Winner |
|---|---|---|
| Large model inference (autoregressive) | Memory bandwidth | Apple Silicon (higher bandwidth) |
| Small model inference (batched) | Compute throughput | DGX Spark (Tensor Cores) |
| Image generation (diffusion models) | Compute throughput | DGX Spark (8x faster demonstrated) |
| Model fine-tuning | Mixed (compute + bandwidth) | DGX Spark (Tensor Cores + CUDA) |
| Quantized model inference (GGUF) | Memory bandwidth | Apple Silicon (higher bandwidth) |

For single-user LLM inference with large quantized models, Apple Silicon's higher memory bandwidth can yield faster token generation. For compute-intensive tasks such as training, fine-tuning, and batch inference, the DGX Spark's Tensor Cores give it the advantage.
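
The bandwidth-bound versus compute-bound distinction can be made concrete with a roofline-style estimate. The sketch below is illustrative rather than a benchmark. It takes the dense FP4 peak (about 500 TFLOPS) and the 273 GB/s bandwidth from the specification table, and assumes, as a simplification not drawn from the cited sources, that decoding costs about 2 FLOPs per parameter per sequence while each 4-bit weight is read once and reused across the batch. KV-cache traffic is ignored.

```python
PEAK_TFLOPS = 500.0     # dense FP4 peak, from the specification table
BANDWIDTH_GB_S = 273.0  # memory bandwidth, from the specification table


def ridge_point(peak_tflops: float, bandwidth_gb_s: float) -> float:
    """Arithmetic intensity (FLOP/byte) above which compute becomes the limit."""
    return peak_tflops * 1e3 / bandwidth_gb_s


def attainable_tflops(intensity: float, peak_tflops: float, bandwidth_gb_s: float) -> float:
    """Roofline bound: the lesser of the compute peak and intensity x bandwidth."""
    return min(peak_tflops, intensity * bandwidth_gb_s / 1e3)


def decode_intensity(batch_size: int, bytes_per_param: float = 0.5) -> float:
    """Assumed FLOP/byte for decoding: 2 FLOPs per parameter per sequence,
    with each weight read once and shared by every sequence in the batch."""
    return 2.0 * batch_size / bytes_per_param


ridge = ridge_point(PEAK_TFLOPS, BANDWIDTH_GB_S)
for batch in (1, 32, 512):
    intensity = decode_intensity(batch)
    bound = "compute" if intensity >= ridge else "bandwidth"
    tflops = attainable_tflops(intensity, PEAK_TFLOPS, BANDWIDTH_GB_S)
    print(f"batch {batch}: {intensity:.0f} FLOP/byte, {bound}-bound, <= {tflops:.1f} TFLOPS")
```

Under these assumptions, single-stream decoding sits far below the ridge point, so memory bandwidth decides the outcome. Workloads that reuse each loaded weight many times, such as training, prompt processing, and diffusion, move toward the compute limit where Tensor Cores matter.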

## Comparison with AMD Ryzen AI Max+ Platform

AMD offers a competing platform for local AI development through its Ryzen AI Max+ 395 processor, which integrates a large unified memory pool with GPU compute on a single chip.

| Feature | NVIDIA DGX Spark | AMD Ryzen AI Max+ 395 System |
|---|---|---|
| GPU architecture | Blackwell (Tensor Cores) | RDNA 3.5 (no Tensor Cores) |
| Unified memory | 128 GB LPDDR5x | Up to 128 GB LPDDR5x |
| Memory bandwidth | 273 GB/s | ~256 GB/s |
| AI compute framework | CUDA (native) | ROCm |
| Ecosystem compatibility | Full CUDA stack | Limited ROCm support |
| OS support | Linux only (DGX OS) | Windows and Linux |
| Price | $3,999 | Varies by system (~$2,000-$3,000) |

Tom's Hardware benchmarks found the DGX Spark outperformed the Ryzen AI Max+ 395 across AI workloads, primarily due to the Tensor Cores and the maturity of the CUDA software stack. The AMD platform's advantage lies in its lower price point, broader OS support, and the ability to function as a general-purpose computer alongside AI development [6].

## DGX Station: The Higher-End Option

Alongside the DGX Spark rename at GTC 2025, NVIDIA also announced the **DGX Station**, a more powerful desktop AI workstation. While the DGX Spark targets individual developers and researchers, the DGX Station is aimed at teams and enterprises needing more compute. The DGX Station uses a full-sized Blackwell GPU (rather than the GB10's scaled-down version) and offers significantly more memory and compute throughput, at a correspondingly higher price point [4].

## MediaTek Partnership

The GB10 Superchip represents a notable collaboration between NVIDIA and MediaTek, a Taiwanese semiconductor company best known for smartphone and IoT chipsets. MediaTek contributed its expertise in power-efficient CPU design, memory subsystem engineering, and high-speed interface design. The partnership allowed NVIDIA to produce a chip that delivers data-center-class AI capabilities within the thermal and power constraints of a desktop form factor [5].

This collaboration is significant because NVIDIA has historically designed its own CPUs (the Grace line) and GPUs independently. The MediaTek partnership for the GB10 suggests that NVIDIA sees value in leveraging external expertise for products targeting the edge and desktop markets, where power efficiency is more critical than in data centers [5].

## Developer Use Cases

The DGX Spark enables several specific development workflows that were previously impractical on desktop hardware:

- **Local LLM development**: Run and test 70B+ parameter models locally instead of paying for cloud GPU instances. Developers can iterate on prompts, fine-tuning configurations, and deployment settings without incurring per-hour cloud costs.
- **Model fine-tuning**: Fine-tune open-source models on custom datasets using [LoRA](/wiki/lora) or QLoRA techniques, with the full CUDA stack available for maximum compatibility.
- **RAG system prototyping**: Build and test retrieval-augmented generation systems locally with embedding models and vector databases running on the same machine.
- **Multi-modal AI**: Experiment with vision-language models that require both compute for image processing and memory for large language model components.
- **[Edge AI](/wiki/edge_ai) development**: Develop and optimize models intended for deployment on NVIDIA Jetson or other edge devices, using the same CUDA toolchain.
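
To illustrate why LoRA-style fine-tuning suits a memory-constrained desktop, the sketch below counts trainable parameters. LoRA freezes a weight matrix of shape `d_out x d_in` and learns a low-rank update formed by two small matrices of rank `r`, so the trainable count is `r * (d_in + d_out)` instead of `d_in * d_out`. The layer size and rank used here are hypothetical examples, not the configuration of any particular model.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Parameters in a dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the LoRA factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_in + d_out)


# Hypothetical square projection layer and adapter rank.
d_in, d_out, rank = 8192, 8192, 16
dense = full_params(d_in, d_out)
lora = lora_trainable_params(d_in, d_out, rank)
print(f"dense: {dense:,} params, LoRA: {lora:,} params ({100 * lora / dense:.2f}% of dense)")
```

Because only the adapter weights need gradients and optimizer state, the memory cost of training stays close to the cost of holding the frozen base model. QLoRA goes further by keeping that base model in a quantized format.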

## Reception and Impact

The DGX Spark has been well received in the AI developer community, with reviewers praising its ability to run models that previously required cloud access or multi-GPU desktop setups. Tom's Hardware called the GB10 Superchip "fast and fun" and noted that it outperformed AMD's Ryzen AI Max+ 395 in AI workloads. ServeTheHome described the machine as "so freaking cool" in its review. The LMSYS Org published an in-depth review with detailed benchmarks, characterizing it as setting "a new standard for local AI inference" [6][12][14].

Criticisms have centered on the delayed launch (five months past the original May 2025 target), the $999 price increase from announcement to retail, and the fact that the device runs only Linux, which limits its appeal to users who also need macOS or Windows for other work. The 273 GB/s memory bandwidth is also half that of Apple's M4 Max (546 GB/s), which can be a limiting factor for bandwidth-bound inference workloads [10].

The broader significance of the DGX Spark is its role in democratizing access to large-model AI development. Before its launch, running a 200-billion-parameter model locally required hardware costing tens of thousands of dollars. At $3,999, the DGX Spark makes this capability accessible to a much wider audience of researchers and developers, potentially accelerating the pace of AI innovation outside major corporate labs [8].

## Limitations and Known Issues

Several limitations have been identified by reviewers and users:

- **Linux only**: The DGX Spark runs DGX OS (Ubuntu-based Linux) exclusively. There is no Windows or macOS support, which limits its utility for developers who need a dual-purpose machine.
- **Memory bandwidth**: At 273 GB/s, the memory bandwidth is adequate but not exceptional. For bandwidth-bound inference workloads (especially autoregressive LLM generation), this can be the limiting factor, and Apple Silicon devices with higher bandwidth may generate tokens faster for quantized models.
- **No discrete GPU upgradeability**: The GB10 is a system-on-chip; the GPU cannot be upgraded separately. Users who need more compute must purchase a second unit or move to cloud infrastructure.
- **Limited batch inference throughput**: The 128 GB memory pool lets a single user load large models, but throughput for serving many concurrent users is limited compared to dedicated data center GPUs.
- **Sparse FP4 specificity**: The headline 1 PFLOP figure applies only to sparse NVFP4 workloads. Real-world AI workloads that use FP16 or BF16 precision will see significantly lower effective compute throughput.

## Current State

As of early 2026, the DGX Spark is shipping and has received its first major performance update (up to 2.5x, announced at CES 2026). The device runs DGX OS based on Ubuntu 24.04 LTS and has moved to Linux kernel 6.17. NVIDIA continues to release SDK and driver updates that improve performance and expand model compatibility. The product occupies a unique position in the market as the only sub-$5,000 device capable of running 200-billion-parameter models locally with native CUDA support, making it a compelling option for AI developers who want to reduce their dependence on cloud computing [7][9].

## References

1. [NVIDIA Puts Grace Blackwell on Every Desk and at Every AI Developer's Fingertips - NVIDIA Newsroom](https://nvidianews.nvidia.com/news/nvidia-puts-grace-blackwell-on-every-desk-and-at-every-ai-developers-fingertips)
2. [NVIDIA Project DIGITS: Inside Grace Blackwell Supercomputing - Hyperstack](https://www.hyperstack.cloud/blog/thought-leadership/nvidia-project-digits-all-you-need-to-know-about-the-blackwell-ai-supercomputer)
3. [NVIDIA renames Project DIGITS to DGX Spark, GB10 Grace Blackwell Superchip to launch at $3999 - VideoCardz](https://videocardz.com/newz/nvidia-renames-project-digits-to-dgx-spark-mini-pc-with-gb10-grace-blackwell-superchip-to-launch-at-3999)
4. [NVIDIA Announces DGX Spark and DGX Station Personal AI Computers - NVIDIA Newsroom](https://nvidianews.nvidia.com/news/nvidia-announces-dgx-spark-and-dgx-station-personal-ai-computers)
5. [Newly-Launched NVIDIA DGX Spark Features GB10 Superchip Co-Designed by MediaTek](https://www.mediatek.com/press-room/newly-launched-nvidia-dgx-spark-features-gb10-superchip-co-designed-by-mediatek)
6. [Nvidia DGX Spark review: the GB10 Superchip powers a fast and fun AI toolbox - Tom's Hardware](https://www.tomshardware.com/pc-components/gpus/nvidia-dgx-spark-review)
7. [DGX Spark User Guide and Release Notes - NVIDIA](https://docs.nvidia.com/dgx/dgx-spark/release-notes.html)
8. [Nvidia's Project Digits is a personal AI supercomputer - TechCrunch](https://techcrunch.com/2025/01/06/nvidias-project-digits-is-a-personal-ai-computer/)
9. [NVIDIA Boosts DGX Spark Performance And Pushes New Developer Tools at CES 2026 - HotHardware](https://hothardware.com/news/nvidia-dgx-spark-performance-and-sdk-updates-ces2026)
10. [DGX Spark vs Mac Studio: Benchmarks and Alternatives - AIMultiple](https://aimultiple.com/dgx-spark-alternatives)
11. [NVIDIA DGX Spark vs. Apple M4 Max comparison - GitHub](https://gist.github.com/EngineerDogIta/4a637031722ff0120e7f901f546ea4ce)
12. [NVIDIA DGX Spark In-Depth Review: A New Standard for Local AI Inference - LMSYS Org](https://lmsys.org/blog/2025-10-13-nvidia-dgx-spark/)
13. [Docker Model Runner on the new NVIDIA DGX Spark - Docker Blog](https://www.docker.com/blog/new-nvidia-dgx-spark-docker-model-runner/)
14. [NVIDIA DGX Spark Review: The GB10 Machine is so Freaking Cool - ServeTheHome](https://www.servethehome.com/nvidia-dgx-spark-review-the-gb10-machine-is-so-freaking-cool/2/)
15. [NVIDIA Outlines GB10 SoC Architecture at Hot Chips 2025 - ServeTheHome](https://www.servethehome.com/nvidia-outlines-gb10-soc-architecture-at-hot-chips-2025/)
16. [Analysis of NVIDIA DGX Spark's GB10 SoC - Chip Log](https://www.chiplog.io/p/analysis-of-nvidia-dgx-sparks-gb10)
