RAPIDS

Updated Jul 16, 2026

RAPIDS is an open-source suite of GPU-accelerated software libraries for data science, analytics, and machine learning, developed and maintained by Nvidia. It lets data scientists run common end-to-end workflows (loading and wrangling tabular data, training classical machine-learning models, analyzing graphs, and visualizing results) entirely on graphics processing units (GPUs) rather than on the central processing unit (CPU), often delivering order-of-magnitude speedups on large datasets. RAPIDS libraries mirror the application programming interfaces (APIs) of the most widely used Python data tools, such as pandas, scikit-learn, and NetworkX, so practitioners can keep their existing code and habits while moving the computation to the GPU. It is built on CUDA, Nvidia's parallel-computing platform, and uses the Apache Arrow columnar memory format for efficient in-memory data representation. ^[1]^[2]

Launch and history

Nvidia announced RAPIDS on October 10, 2018, at the GPU Technology Conference (GTC) Europe in Munich, where chief executive Jensen Huang introduced it as a platform to bring GPU acceleration to the parts of high-performance computing that had so far run only on CPUs. In the announcement Huang argued that "data analytics and machine learning are the largest segments of the high performance computing market that have not been accelerated, until now." The software was developed over roughly the preceding two years by Nvidia engineers working with open-source contributors, and it was released under the Apache 2.0 license with the code immediately available on the project site at rapids.ai. ^[1]^[2]

From the outset Nvidia positioned RAPIDS as an open-source project rather than a proprietary product, building on and contributing back to established projects including Apache Arrow, pandas, and scikit-learn. The launch was accompanied by a broad list of ecosystem and hardware partners, among them Hewlett Packard Enterprise, IBM, Oracle, Databricks, Anaconda, Cisco, Dell EMC, Lenovo, NetApp, Pure Storage, SAP, and SAS, signaling that the suite was intended to plug into existing enterprise data-science and infrastructure stacks. An early headline benchmark, training the XGBoost gradient-boosting algorithm on an Nvidia DGX-2 system, showed roughly a 50-times speedup over CPU-only systems, reducing training that took days to hours, or hours to minutes, depending on dataset size. ^[1]^[3]

Since launch, RAPIDS has followed a frequent, calendar-versioned release cadence, with releases numbered by year and month (for example 24.02 or 25.02), steadily adding libraries, broadening hardware support, and introducing the zero-code-change accelerators described below. ^[4]

The library suite

RAPIDS is not a single library but a collection of interoperable components, most of them prefixed "cu" to denote their CUDA foundation. The core design principle is API compatibility: each GPU library is meant to behave like a familiar CPU counterpart from the PyData ecosystem, so that adopting RAPIDS does not require learning an entirely new programming model. The following table maps the principal RAPIDS libraries to their CPU-side equivalents.

RAPIDS library	Purpose	CPU / PyData equivalent
cuDF	GPU DataFrames for loading, joining, filtering, and aggregating tabular data	pandas (and Polars, Apache Spark)
cuML	Classical machine-learning algorithms (regression, clustering, trees, dimensionality reduction)	scikit-learn
cuGraph	Graph analytics on millions of nodes and edges	NetworkX
cuSpatial	Geospatial and vector GIS operations (point-in-polygon, spatial joins, distances, trajectories)	GeoPandas / shapely
cuxfilter	Cross-filtered interactive dashboards over large tabular datasets	Datashader / Bokeh dashboards
cuCIM	I/O and image-processing primitives for n-dimensional images, with a biomedical-imaging focus	scikit-image
cuVS	Vector search and clustering, including the CAGRA index	FAISS / vector-search libraries
RMM	The RAPIDS Memory Manager, for fast device-memory allocation and pooling	(CUDA memory allocators)
RAFT	Reusable algorithmic primitives shared across the other libraries	(internal building blocks)

cuDF is the cornerstone of the suite: a GPU DataFrame library that provides a pandas-like interface for reading data onto the GPU and performing operations such as filters, joins, group-bys, and column arithmetic, with its on-GPU data laid out in the Apache Arrow columnar format. cuML implements machine-learning algorithms, including linear and logistic regression, random forests, k-means clustering, k-nearest neighbors, principal component analysis (PCA), UMAP, and HDBSCAN, behind an API that matches scikit-learn. cuGraph brings graph analytics to the GPU with algorithms for centrality, community detection, traversal, and link analysis, and integrates with NetworkX and with graph neural network frameworks. cuSpatial accelerates geospatial workloads, while cuxfilter connects GPU DataFrames to charting libraries to build interactive dashboards over datasets of 100 million rows or more, and cuCIM targets image and signal processing, notably in biomedical imaging. Supporting these are lower-level components such as RMM for memory management, RAFT for shared algorithmic primitives, KvikIO for high-throughput file input/output using GPUDirect Storage, and cuVS for vector search. An older signal-processing library, cuSignal, has since been folded into the CuPy project. ^[2]^[4]^[5]

Foundations: CUDA and Apache Arrow

Two technologies underpin RAPIDS. The first is CUDA, the parallel-computing platform and programming model that exposes the thousands of cores on an Nvidia GPU; every RAPIDS library ultimately compiles its operations down to CUDA kernels, and the suite is frequently described as part of Nvidia's broader "CUDA-X" family of domain libraries. The second is the Apache Arrow columnar memory format, a language-independent standard for representing structured data in memory as contiguous columns. By adopting Arrow, RAPIDS gives its DataFrames a layout that is both efficient for the data-parallel access patterns GPUs excel at and interoperable with other Arrow-based tools, reducing the cost of moving data between libraries. Together, CUDA provides the compute and Arrow provides the common data representation that lets cuDF, cuML, cuGraph, and the rest share data on the GPU without expensive format conversions. ^[2]^[6]
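The idea of a columnar layout can be illustrated with a minimal sketch using only the Python standard library. It is not the Arrow API and omits much of what the Arrow format specifies (validity bitmaps, offsets for variable-length data, and so on); the column names `id` and `score` are made up for the example. It shows two properties described above: each column is one contiguous typed buffer, and a second consumer can view that buffer without copying or converting it.

```python
from array import array

# Row-oriented layout: one tuple per record, fields interleaved.
rows = [
    (1, 9.5),
    (2, 7.25),
    (3, 8.0),
]

# Column-oriented layout: one contiguous, typed buffer per column.
# (Hypothetical column names, chosen only for illustration.)
columns = {
    "id": array("q", (r[0] for r in rows)),     # 64-bit signed integers
    "score": array("d", (r[1] for r in rows)),  # 64-bit floats
}

# A column-wise reduction reads only the buffer it needs.
total_score = sum(columns["score"])

# A memoryview shares the column's memory instead of copying it, so a
# write through the view is visible in the original column.
score_view = memoryview(columns["score"])
score_view[0] = 10.0
updated_first = columns["score"][0]
```

In this sketch the standard buffer protocol plays the role that a shared format plays between libraries: both sides agree on how the bytes are laid out, so no conversion step is needed.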

Zero-code-change accelerators

A defining feature of modern RAPIDS is a set of "zero-code-change" accelerators that let existing CPU-based scripts run on the GPU without rewriting them. The flagship is cudf.pandas, a pandas accelerator mode that reached general availability in RAPIDS version 24.02, announced at GTC in March 2024. Once enabled, pandas operations execute on the GPU through cuDF when possible and fall back transparently to ordinary pandas on the CPU when an operation is unsupported, synchronizing data between the two as needed so that results are identical to running pandas alone. It is turned on either with the Jupyter notebook magic `%load_ext cudf.pandas` or by running a script with `python -m cudf.pandas`, after which unmodified `import pandas` code is accelerated. On a five-gigabyte workload from the DuckDB database-like operations benchmark, Nvidia reported cudf.pandas running roughly 150 times faster than pandas v2.2. ^[7]^[8]
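The try-the-GPU-then-fall-back behavior can be sketched in plain Python. This is an illustration of the dispatch pattern only, not how cudf.pandas is implemented: the classes `FallbackProxy`, `FastBackend`, and `SlowBackend` are hypothetical stand-ins, and the real accelerator also has to move data between GPU and CPU memory, which this sketch ignores.

```python
class FallbackProxy:
    """Try a fast backend first; fall back to a slow one if unsupported."""

    def __init__(self, fast, slow):
        self._fast = fast
        self._slow = slow
        self.calls = []  # records which backend served each call

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself.
        def dispatch(*args, **kwargs):
            fast_fn = getattr(self._fast, name, None)
            if fast_fn is not None:
                try:
                    result = fast_fn(*args, **kwargs)
                    self.calls.append((name, "fast"))
                    return result
                except NotImplementedError:
                    pass  # unsupported on the fast path: fall through
            result = getattr(self._slow, name)(*args, **kwargs)
            self.calls.append((name, "slow"))
            return result

        return dispatch


class SlowBackend:
    """Stand-in for the complete CPU library."""

    def total(self, values):
        return sum(values)

    def median(self, values):
        ordered = sorted(values)
        mid = len(ordered) // 2
        if len(ordered) % 2:
            return ordered[mid]
        return (ordered[mid - 1] + ordered[mid]) / 2


class FastBackend:
    """Stand-in for an accelerated library that covers part of the API."""

    def total(self, values):
        return sum(values)

    def median(self, values):
        raise NotImplementedError("not accelerated in this sketch")


stats = FallbackProxy(FastBackend(), SlowBackend())
data = [4, 1, 3, 2]
total = stats.total(data)    # served by the fast backend
median = stats.median(data)  # falls back to the slow backend
```

The caller uses `stats` the same way in both cases, which is the point of the pattern: the choice of backend is made per operation and is invisible to the calling code.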

The same pattern has been extended across the suite. cuML added a zero-code-change accelerator, cuml.accel, released as an open beta with cuML 25.02 in March 2025. Enabled via `%load_ext cuml.accel` or `python -m cuml.accel`, it transparently speeds up popular scikit-learn algorithms (such as random forests, k-nearest neighbors, PCA, and k-means) as well as UMAP and HDBSCAN, falling back to CPU execution for anything unsupported. Nvidia reported representative speedups including roughly 25 times for random-forest training, about 50 times for t-SNE, up to 60 times for UMAP, and up to 175 times for HDBSCAN. cuGraph provides a zero-code-change backend for NetworkX (nx-cugraph), which was a headline feature of the RAPIDS 24.10 release in October 2024 and lets NetworkX programs run graph algorithms on the GPU unchanged. RAPIDS also powers the Polars GPU engine: developed jointly with the Polars project and built on cuDF, it launched in open beta in September 2024 and accelerates Polars queries when the GPU engine is selected, with Nvidia citing workflows up to roughly 13 times faster than CPU execution. ^[9]^[10]^[11]

Accelerator	Library accelerated	How it is enabled	Status / first availability
cudf.pandas	pandas	`%load_ext cudf.pandas` or `python -m cudf.pandas`	Generally available, RAPIDS 24.02 (March 2024)
cuml.accel	scikit-learn, UMAP, HDBSCAN	`%load_ext cuml.accel` or `python -m cuml.accel`	Open beta, cuML 25.02 (March 2025)
nx-cugraph	NetworkX	set the cuGraph backend for NetworkX	RAPIDS 24.10 (October 2024)
Polars GPU engine	Polars	select the GPU engine when collecting a query	Open beta (September 2024)

Multi-GPU and multi-node scaling with Dask

A single GPU has a fixed amount of high-bandwidth memory, so RAPIDS relies on Dask, a Python library for parallel and distributed computing, to scale beyond it. The dask-cudf library extends cuDF DataFrames across many GPUs and across multiple machines, partitioning the data and coordinating computation, while dask-cuda provides utilities for launching and managing Dask worker processes on multi-GPU systems. This combination lets a RAPIDS workflow grow from a single desktop GPU to a multi-node, multi-GPU (commonly abbreviated MNMG) cluster while keeping a familiar DataFrame-style interface. cuML and cuGraph likewise offer Dask-backed, distributed implementations of many algorithms. For high-speed communication between workers, RAPIDS can use UCX-based transports over interconnects such as InfiniBand and NVLink. ^[4]^[5]
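The partition-then-combine idea behind this kind of scaling can be sketched with the standard library alone. This is not the dask-cudf or dask-cuda API: threads stand in for GPU workers, and the helper names (`partition`, `partial_sum`, `grouped_sum`) are invented for the example. Each worker aggregates its own partition, and the partial results are then merged into the final answer.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, n_parts):
    """Split records into up to n_parts contiguous chunks."""
    size = max(1, -(-len(records) // n_parts))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def partial_sum(chunk):
    """Per-worker step: sum values by key within one partition."""
    acc = Counter()
    for key, value in chunk:
        acc[key] += value
    return acc


def grouped_sum(records, n_workers=2):
    """Group-by-sum computed per partition, then merged."""
    chunks = partition(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, chunks))
    merged = Counter()
    for part in partials:
        merged.update(part)  # Counter.update adds values key by key
    return dict(merged)


sales = [("a", 3), ("b", 5), ("a", 4), ("c", 1), ("b", 2)]
result = grouped_sum(sales, n_workers=2)
```

A sum works here because partial sums can be merged by adding them. Aggregations without that property, such as an exact median, need more coordination between partitions, which is part of what a distributed DataFrame library handles on the user's behalf.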

Open source, governance, and ecosystem

RAPIDS is open source under the Apache 2.0 license and is developed in the open across many repositories under the rapidsai organization on GitHub, where Nvidia engineers and external contributors file issues, submit pull requests, and discuss design. The project deliberately tracks and integrates with the wider PyData ecosystem rather than replacing it: it mirrors the pandas, scikit-learn, and NetworkX APIs, interoperates through Apache Arrow, plugs into Dask for scaling, and provides a plug-in, the RAPIDS Accelerator for Apache Spark, that offloads Spark SQL and DataFrame operations to GPUs. RAPIDS is distributed through conda and pip packages and as prebuilt Docker container images, and it is included in NVIDIA AI Enterprise, the company's enterprise software platform, which adds certified configurations and enterprise support on top of the open-source libraries. ^[4]^[12]^[13]

Significance

RAPIDS represents a deliberate effort to extend GPU acceleration, long established in deep-learning training, into the broader and historically CPU-bound world of data preparation, classical machine learning, and analytics, which together make up a large share of practical data-science work. By matching the APIs of the tools data scientists already use and, increasingly, by removing the need to change any code at all, it lowers the barrier to GPU adoption for that audience. Its open-source model and tight integration with the PyData stack have made it a reference point for GPU-accelerated data science and a building block within Nvidia's wider CUDA-X and AI Enterprise software strategy, where fast, end-to-end data pipelines feed the model-training and inference systems that run on the same GPUs. ^[2]^[13]

References

[1] NVIDIA Newsroom, "NVIDIA Introduces RAPIDS Open-Source GPU-Acceleration Platform for Large-Scale Data Analytics and Machine Learning." https://nvidianews.nvidia.com/news/nvidia-introduces-rapids-open-source-gpu-acceleration-platform-for-large-scale-data-analytics-and-machine-learning
[2] RAPIDS, "GPU Accelerated Data Science." https://rapids.ai/
[3] NVIDIA Blog, "Getting Answers Faster: NVIDIA and Open-Source Ecosystem Come Together to Accelerate Data Science." https://blogs.nvidia.com/blog/2018/10/10/rapids-data-science-open-source-community/
[4] RAPIDS Docs, "API Documentation." https://docs.rapids.ai/api/
[5] NVIDIA Developer, "CUDA-X Data Science Libraries." https://developer.nvidia.com/topics/ai/data-science/cuda-x-data-science-libraries
[6] Apache Arrow, "Apache Arrow Format." https://arrow.apache.org/overview/
[7] NVIDIA Technical Blog, "RAPIDS cuDF Accelerates pandas Nearly 150x with Zero Code Changes." https://developer.nvidia.com/blog/rapids-cudf-accelerates-pandas-nearly-150x-with-zero-code-changes/
[8] RAPIDS Docs, "cudf.pandas." https://docs.rapids.ai/api/cudf/stable/cudf_pandas/
[9] NVIDIA Technical Blog, "NVIDIA cuML Brings Zero Code Change Acceleration to scikit-learn." https://developer.nvidia.com/blog/nvidia-cuml-brings-zero-code-change-acceleration-to-scikit-learn/
[10] NVIDIA Technical Blog, "NVIDIA RAPIDS 24.10 Introduces Accelerated NetworkX with Zero Code Change, Updates for UMAP and cuDF-Pandas." https://developer.nvidia.com/blog/nvidia-rapids-24-10-introduces-accelerated-networkx-with-zero-code-change-updates-for-umap-and-cudf-pandas/
[11] NVIDIA Technical Blog, "Polars GPU Engine Powered by RAPIDS cuDF Now Available in Open Beta." https://developer.nvidia.com/blog/polars-gpu-engine-powered-by-rapids-cudf-now-available-in-open-beta/
[12] RAPIDS, GitHub organization. https://github.com/rapidsai
[13] NVIDIA Developer, "RAPIDS (CUDA-X Data Science)." https://developer.nvidia.com/rapids
