R (programming language)

Updated Jun 27, 2026

R is a free, open-source programming language and software environment for statistical computing and graphics. It was created by Ross Ihaka and Robert Gentleman at the University of Auckland in New Zealand, who began the project as a research experiment around 1991 and posted the first public binary in August 1993 ^[1]^[27]. R is a GNU project and a free implementation of the S language developed at Bell Laboratories in the 1970s by John Chambers and colleagues; it is distributed under the GNU General Public License and maintained by the R Core Team and the R Foundation for Statistical Computing ^[2]^[4]. Version 1.0.0, the first stable release, shipped on February 29, 2000 ^[27].

The defining feature of R is not the language itself, which is small and a bit eccentric, but the package ecosystem around it. The Comprehensive R Archive Network (CRAN), founded in 1997, hosts more than 24,000 contributed packages covering most of applied statistics, and a separate ecosystem called Bioconductor (over 2,400 packages) serves genomics and bioinformatics ^[3]^[5]. The tidyverse, a set of packages built around tidy-data principles by Hadley Wickham and collaborators at Posit (formerly RStudio), has become the de facto modern dialect for data manipulation and visualization ^[7]^[23]. R sits next to Python as one of the two dominant languages in data science. Python has clearly won the deep-learning and production-ML race, but R remains the first choice in academic statistics, biostatistics, clinical trials, official statistics, and much social-science work.

For machine learning work, R has bindings to most of the same numerical engines Python users rely on. The xgboost and lightgbm packages share C++ cores with their Python siblings, and glmnet likewise wraps compiled code; caret and tidymodels give a unified interface to dozens of model families; keras, tensorflow, and torch provide deep learning bindings against TensorFlow and PyTorch; and reticulate lets R and Python share data and call each other in the same session ^[13]^[15].

ELI5: explain like I'm 5

Imagine a programmable calculator. You can ask it things like "draw me a graph of how ice-cream sales change with temperature" or "is the difference between these two groups bigger than I would expect by chance?" and it gives you back a chart or a number. R is that calculator, with tens of thousands of free add-ons for almost any kind of math, especially the kind statisticians use.

R is a sibling of Python. Python is a general-purpose language that is also good at math; R is a stats language that is also good at general-purpose work if you push it. Biologists measuring gene expression, economists forecasting inflation, and grad students fitting mixed-effects models on Tuesday usually reach for R. People training huge neural networks or shipping web services usually reach for Python. The two worlds talk to each other, and most working analysts use both.

What is R, and where did it come from?

R's lineage starts with the S language, developed at Bell Labs by John Chambers, Rick Becker, Allan Wilks, and others starting around 1976. S was an interactive layer over existing Fortran statistical libraries, a way to do exploratory data analysis without recompiling a Fortran program every time you wanted a histogram. Chambers's stated aim was "to turn ideas into software, quickly and faithfully," and that line still describes the ambition of R better than any tagline R itself ever had ^[6].

S evolved through the 1980s and 1990s: the third version ("new S," 1988) rebuilt the system, the statistical-modeling extensions of the early 1990s introduced the formula notation (y ~ x1 + x2) that survives in R today, and the fourth version (1998) brought the S4 object system. S was sold commercially as S-PLUS by StatSci (later Insightful, then TIBCO). Chambers received the 1998 ACM Software System Award and later joined the R Core Team, which is one of those rare cases where the creator of a language follows it into the open-source clone ^[6].

R's own designers were explicit that it is an S dialect built on a different engine. In their 1996 paper, Ihaka and Gentleman wrote that "in developing this new language, we sought to combine what we felt were useful features from two existing computer languages": Becker, Chambers, and Wilks's S and Steele and Sussman's Scheme ^[1]. They added that "we implemented the language by first writing an interpreter for a Scheme subset and then progressively mutating it to resemble S" ^[1]. That is why R looks like S on the surface but has Scheme-style lexical scoping and first-class functions underneath.

When was R created, and who made it?

Ross Ihaka and Robert Gentleman were junior staff in the Statistics Department at the University of Auckland in the early 1990s, both teaching introductory statistics, both unhappy with the available tools. They wanted something free and small enough to run on the Macintosh computers in the teaching lab. They began writing an interpreter inspired by S but with a Scheme-like internal model (lexical scoping, first-class functions), describing the build as roughly "two years of part-time effort" ^[1]. The first binary was posted to the StatLib server in August 1993 with an announcement on the s-news mailing list ^[27]. Ihaka and Gentleman published the design rationale in 1996 in the Journal of Computational and Graphical Statistics under the title "R: A Language for Data Analysis and Graphics" ^[1]. The name R was chosen, in their words, "in part to acknowledge the influence of S and in part to celebrate our own efforts" (the name is also a play on the authors' shared first initial) ^[1].

R became a GNU project in December 1997, the same year CRAN was set up by Kurt Hornik and Friedrich Leisch at TU Wien ^[3]^[27]. Version 1.0 was released on February 29, 2000, a leap day chosen so the developers would not have to commit to a yearly anniversary ^[27]. Notable later versions include R 2.0.0 (October 2004), R 3.0.0 (April 2013, which broke binary backward compatibility for compiled packages), and R 4.0.0 (April 2020). Minor versions ship roughly once a year. R 4.5.0 "How About a Twenty-Six" was released April 11, 2025, and the current stable release is R 4.6.0 "Because it was There," released April 24, 2026; R release names are running jokes drawn from Peanuts strips ^[27].

Who governs R?

The R Core Team, formed informally in 1997, holds commit rights to the language. As of the mid-2020s the team has roughly twenty members, including Ihaka, Gentleman, John Chambers, Brian Ripley, Kurt Hornik, Peter Dalgaard, Martin Maechler, Luke Tierney, Duncan Murdoch, and Tomas Kalibera ^[2]. The R Foundation for Statistical Computing, registered as a non-profit association under Austrian law in Vienna in April 2003, holds the copyrights and runs finances ^[4]. The R Consortium, a Linux Foundation project launched in 2015 with corporate members including Microsoft, Google, RStudio/Posit, and IBM, funds infrastructure and outreach but does not control the language ^[26].

How is R designed as a language?

R is a multi-paradigm language: procedural, functional, object-oriented, with strong vector semantics and a heavy emphasis on interactive use. Most of the surface idioms come from S, the internals borrow from Scheme, and a handful of design choices come from APL ^[1].

Vectors and recycling

The atomic unit in R is the vector. There are no scalars in the strict sense; a single number is a length-1 vector, and arithmetic is vectorized by default:

x <- c(1, 2, 3, 4)
y <- c(10, 20, 30, 40)
x + y  # returns 11 22 33 44

When vectors of different lengths interact, R recycles the shorter one to match. Most numerical work is done with vectors and matrices, and tight for loops are generally slower than vectorized expressions.
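A minimal sketch of recycling in base R follows; the variable names are illustrative only. It shows a length-1 vector and a length-2 vector being repeated to match a length-4 vector, and the warning case where the lengths do not divide evenly.

```r
# Recycling: the shorter vector is repeated to the length of the longer one.
x <- c(1, 2, 3, 4)

doubled <- x * 2          # the length-1 vector 2 is recycled four times
paired  <- x + c(10, 20)  # c(10, 20) is recycled to c(10, 20, 10, 20)

print(doubled)  # 2 4 6 8
print(paired)   # 11 22 13 24

# When the longer length is not a multiple of the shorter one, R still
# recycles but emits a warning. suppressWarnings() is used here only to
# keep the example output quiet.
uneven <- suppressWarnings(c(1, 2, 3) + c(10, 20))
print(uneven)   # 11 22 13
```

Silent recycling is convenient for scalars but can hide length mismatches, which is one reason stricter structures such as tibbles only recycle length-1 inputs.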

Data frames and functional features

The core abstraction for tabular data is the data frame, a named list of equal-length vectors introduced in S in the early 1990s. It is what makes R feel like a statistics environment rather than a numeric one. The tidyverse tibble and Matt Dowle's data.table are alternative implementations with different performance and syntax trade-offs. Modern R also supports Arrow-backed data frames through the arrow package, which allows low-overhead, often zero-copy, data exchange with Python, pandas, and Apache Spark.
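The "named list of equal-length vectors" description can be checked directly in base R. The sketch below uses made-up city temperatures purely for illustration.

```r
# A data frame is a list of equal-length column vectors plus a few attributes.
# The values below are invented for illustration.
df <- data.frame(
  city   = c("Auckland", "Vienna", "Seattle"),
  temp_c = c(18.5, 11.2, 12.9)
)

is.list(df)  # TRUE: each column is one list element
length(df)   # 2 columns
nrow(df)     # 3 rows

# Columns are extracted like list elements; rows are filtered with a
# logical vector produced by a vectorized comparison.
warm_cities <- df[df$temp_c > 12, "city"]
print(warm_cities)  # "Auckland" "Seattle"

# Adding a column is vectorized arithmetic on an existing one.
df$temp_f <- df$temp_c * 9 / 5 + 32
print(df)
```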

Functions in R are first-class objects. Closures capture their enclosing environment (lexical scoping, inherited from Scheme), ... collects extra arguments to forward, and lazy evaluation defers argument evaluation until use ^[1]. Lazy evaluation, combined with the ability to capture an argument's unevaluated expression, enables some of R's most distinctive idioms. The formula notation is a close relative: lm(y ~ x, data = df) works because the ~ operator stores its operands unevaluated, and lm() then looks the variables up in df.
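The following base-R sketch illustrates the four features named above: closures, lazy evaluation, argument forwarding with `...`, and unevaluated expressions in formulas. All function and variable names here are hypothetical, chosen for the example.

```r
# Closure: the inner function keeps access to `count` in its enclosing
# environment after make_counter() has returned.
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1  # modifies `count` in the enclosing environment
    count
  }
}
counter <- make_counter()
counter()
second_call <- counter()
print(second_call)  # 2

# Lazy evaluation: `y` is never used, so its expression is never evaluated
# and the stop() call does not fire.
first_only <- function(x, y) x
lazy_result <- first_only(1, stop("this argument is never evaluated"))
print(lazy_result)  # 1

# `...` forwards extra arguments (here na.rm) to another function.
rounded_mean <- function(x, ...) round(mean(x, ...), 1)
forwarded <- rounded_mean(c(1, NA, 3), na.rm = TRUE)
print(forwarded)  # 2

# A function can capture the expression behind an argument instead of its value.
show_expr <- function(arg) deparse(substitute(arg))
captured <- show_expr(a + b)
print(captured)  # "a + b"

# A formula stores its terms unevaluated, so y, x1 and x2 need not exist yet.
f <- y ~ x1 + x2
formula_vars <- all.vars(f)
print(formula_vars)  # "y" "x1" "x2"
```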

What object systems does R have?

R has not one but several object systems, layered on top of each other for historical reasons:

System	Year	Style	Used in
S3	1988 (S), inherited by R	Generic-function dispatch on a `class` attribute, very lightweight	Most base R, `lm`, `glm`, `print`, `summary`
S4	1998 (S), 2001 (R)	Formal multiple-dispatch generics, slot-based	Bioconductor, lme4, sp
Reference classes (R5)	2010	Mutable Java-style classes, methods bound to objects	Some packages, mostly superseded
R6	2014	Lightweight reference classes by Winston Chang	Shiny, plumber, modern object-oriented packages
S7	2023	Joint successor to S3 and S4, designed by R Core and Posit	Experimental, slowly being adopted

Most everyday R code uses S3, which is so minimal it almost feels like a convention rather than a system. S4 shows up in Bioconductor, where multiple dispatch is useful for biological data structures. R6 is the typical choice for stateful objects in modern packages ^[22].
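To make the contrast concrete, here is a small sketch of S3 next to S4 using only base R and the bundled methods package. The class names `reading` and `Interval` and the generic `span_width` are hypothetical, invented for this example.

```r
# S3: a class is just an attribute, and a method is a function named
# generic.class that the generic dispatches to.
new_reading <- function(value, unit) {
  structure(list(value = value, unit = unit), class = "reading")
}
format.reading <- function(x, ...) paste(x$value, x$unit)
print.reading <- function(x, ...) {
  cat("<reading>", format(x), "\n")
  invisible(x)
}

r <- new_reading(21.5, "degC")
formatted <- format(r)  # dispatches to format.reading
print(formatted)        # "21.5 degC"
print(r)                # dispatches to print.reading

# S4: classes, slots and generics are declared formally and slot types
# are checked when an object is created.
setClass("Interval", slots = c(lower = "numeric", upper = "numeric"))
setGeneric("span_width", function(obj) standardGeneric("span_width"))
setMethod("span_width", "Interval", function(obj) obj@upper - obj@lower)

iv <- new("Interval", lower = 2, upper = 5)
interval_width <- span_width(iv)
print(interval_width)  # 3
```

The S3 half needs no declarations at all, which is why it reads more like a naming convention; the S4 half would reject `new("Interval", lower = "a", upper = 5)` because the slot type does not match.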

Pipes

The magrittr package by Stefan Milton Bache (2014) introduced the %>% pipe, modelled on F#'s forward pipe. It quickly became the standard in tidyverse code: data %>% filter(x > 0) %>% mutate(y = x^2) %>% summarize(mean(y)) reads left to right rather than inside out. R Core added a native pipe |> in version 4.1.0 (May 2021), with slightly different semantics ^[27]. Most new code uses |>, but %>% is still everywhere because old code still works.
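A short sketch of the native pipe, assuming R 4.1.0 or later and using only base R and the built-in mtcars data set. The pipe inserts its left-hand side as the first argument of the call on its right.

```r
# Requires R >= 4.1.0 for the native pipe.
values <- c(1, 4, 9, 16)

nested <- sum(sqrt(values))          # inside-out
piped  <- values |> sqrt() |> sum()  # left-to-right, same computation

print(piped)              # 10
identical(nested, piped)  # TRUE

# The same style works on data frames with base functions:
# subset(mtcars, wt > 3.5) is passed on as the first argument of with().
heavy_mean_mpg <- mtcars |>
  subset(wt > 3.5) |>
  with(mean(mpg))
print(heavy_mean_mpg)
```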

What is CRAN, and how big is the R package ecosystem?

CRAN is R's distinguishing institution, mirrored on around 90 sites worldwide. Every package goes through automated checks (the R CMD check machinery) on multiple operating systems before being accepted, and packages that later fail those checks or break the packages that depend on them are flagged to their maintainers and eventually archived if not fixed. As of 2026, CRAN hosts more than 24,000 contributed packages ^[3].

The quality bar is uneven by design. CRAN does not vet science or statistical correctness; it vets that the package builds, passes its own tests, and does not break other packages. The long tail includes both careful, peer-reviewed work and weekend hobby projects. The tidyverse, Bioconductor, and rOpenSci sit on top as quality filters ^[5].

Major package families

Family	Maintained by	Focus
base R	R Core	Language plus core stats: `lm`, `glm`, `aov`, `t.test`, etc.
Recommended packages	R Core	Ships with R: `MASS`, `Matrix`, `survival`, `nlme`, `lattice`
tidyverse	Posit (Wickham et al.)	Data wrangling: `dplyr`, `tidyr`, `ggplot2`, `purrr`, `readr`, `tibble`, `stringr`, `forcats`
data.table	Matt Dowle, Arun Srinivasan	Fast in-memory data tables with their own DSL
Bioconductor	Bioconductor Core	Genomics, microarrays, sequencing
tidymodels	Posit (Max Kuhn et al.)	Modern ML wrappers (`parsnip`, `recipes`, `rsample`, `yardstick`)
mlr3	Bernd Bischl's group	Object-oriented ML framework
rOpenSci	rOpenSci collective	Peer-reviewed scientific packages
spatial	r-spatial team	`sf`, `terra`, `stars` for geospatial

What is Bioconductor?

Bioconductor is a parallel ecosystem to CRAN focused on bioinformatics, started in 2001 by Robert Gentleman and colleagues, with a core team that was long based at the Fred Hutchinson Cancer Center ^[5]. It uses S4 heavily, has its own twice-yearly release cycle aligned with R minor versions, and hosts more than 2,400 packages for genomics, proteomics, flow cytometry, and single-cell analysis ^[5]. Papers in Nature Biotechnology that say "analysis was performed in R using DESeq2 / limma / edgeR / Seurat" are usually citing Bioconductor (Seurat is on CRAN, but the methodological neighborhood is the same).

What is the tidyverse?

The tidyverse is the most influential thing to happen to R in the last fifteen years. The name was coined by Hadley Wickham around 2016, and the meta-package tidyverse was published to CRAN on September 15, 2016 ^[23]^[28]. The idea is that data should be in tidy form (one observation per row, one variable per column, one value per cell, per Wickham's 2014 paper) and that a small grammar of verbs should be enough to manipulate it ^[7].

The core packages are:

Package	Purpose	First release
`ggplot2`	Layered grammar of graphics	2007
`dplyr`	Data manipulation verbs (`filter`, `mutate`, `summarize`, `group_by`, `arrange`, `select`)	2014
`tidyr`	Reshaping (wide-to-long, long-to-wide, pivoting, unnesting)	2014
`readr`	Fast text-file reading with type guessing	2015
`purrr`	Functional iteration (`map`, `walk`, `pmap`)	2015
`tibble`	Modern data frame with cleaner printing and stricter rules	2016
`stringr`	String manipulation wrappers around `stringi`	2009
`forcats`	Factor (categorical) manipulation	2016

Around the core sit broom (turn model objects into tidy tibbles), lubridate (dates and times, itself attached as a core package since tidyverse 2.0.0), rvest (web scraping), httr2 (HTTP), dbplyr (translate dplyr to SQL), and many others. The book that taught a generation of working data scientists tidyverse-style R is Wickham and Grolemund's R for Data Science, free online; the second edition (2023) is co-authored with Mine Cetinkaya-Rundel ^[8]^[9].

What is ggplot2?

Wickham wrote ggplot2 as part of his 2008 PhD dissertation at Iowa State, building on Leland Wilkinson's 1999 book The Grammar of Graphics, and first released it on CRAN in June 2007 ^[10]^[11]^[29]. A plot is built up from layers (data, geometric mark, statistical transform, scale, coordinate system, facet) rather than chosen from a menu of plot types. A faceted scatter plot with a per-group smoother is a few lines:

library(ggplot2)
ggplot(mpg, aes(displ, hwy, color = class)) +
  geom_point() +
  geom_smooth(method = "loess") +
  facet_wrap(~ year)

The aesthetic is so distinctive that ggplot2's gray-grid panels became almost a marker of "a chart from a data scientist" in 2010s journalism and academic papers. Extension packages include patchwork for composition, gganimate for animation, ggrepel for label placement, and ggdist for visualizing uncertainty.

What are RStudio and Posit?

RStudio the integrated development environment was first released in February 2011 by RStudio Inc., a company founded by JJ Allaire (creator of ColdFusion and Open Live Writer) ^[30]. The IDE was an immediate hit. R had previously been used through plain editors plus a console, and RStudio bundled a code editor, console, plot pane, environment browser, package manager, and Sweave/knitr integration into one window. The free edition is open-source under the AGPL; commercial editions add team features ^[30].

In July 2022 RStudio announced that it was renaming itself Posit, a change completed later that year; the company is a registered Public Benefit Corporation (Posit PBC). The new name signaled that the company was no longer R-only and was investing seriously in Python ^[23]. Posit now ships Posit Workbench (multi-language IDE), Posit Connect (publishing platform), and Posit Cloud (hosted environments), and it maintains the tidyverse, the keras and tensorflow R bindings, the torch R port, reticulate, Quarto, Shiny, and many other open-source projects ^[23]^[24]^[25].

What tools does R provide for reports and web apps (R Markdown, Quarto, Shiny)?

Reproducible reporting in R has gone through several generations. Friedrich Leisch's Sweave (2002) wove R code into LaTeX. Yihui Xie's knitr (2011) generalized it to Markdown, HTML, and other formats. R Markdown, built on knitr and Pandoc by JJ Allaire, Yihui Xie, and the RStudio team starting in 2014, made literate programming the default for analyses, papers, books, and slides. R for Data Science and Forecasting: Principles and Practice are both written in R Markdown; the bookdown extension by Xie made book-length projects practical ^[8]^[20].

Quarto, released by Posit in 2022, is the next-generation rewrite. It is language-agnostic by design (R, Python, Julia, Observable JavaScript) and runs as a separate command-line tool rather than an R package ^[24]. The same .qmd file can render to HTML, PDF, Word, websites, books, slides, or dashboards. Quarto is one of the clearest artifacts of Posit's broader-than-R turn; scientific Python users have adopted it alongside Jupyter.

Shiny, released in 2012 by Joe Cheng at RStudio, is R's web application framework ^[25]. A Shiny app is an R script that defines a UI (HTML widgets) and a server function (reactive R code), with the framework keeping them in sync over a websocket. An analyst can build an interactive dashboard or modeling tool in R alone, no JavaScript required. Shiny is used heavily in pharma, finance, government, and academia. A companion Shiny for Python launched in 2022 ^[25].

What statistical and machine-learning packages does R have?

R's classical statistics coverage is essentially exhaustive. It spans linear and generalized linear models (lm, glm in base), mixed-effects models (lme4 by Doug Bates, Martin Maechler, Ben Bolker), survival analysis (survival by Terry Therneau), generalized additive models (mgcv by Simon Wood), Bayesian inference (rstan, brms, cmdstanr, rjags), and thousands more packages in econometrics, psychometrics, and ecology ^[2]. If a paper proposes a statistical method, there is a good chance one of the authors put a working R package on CRAN.
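As a small illustration of the base modeling functions mentioned above, the sketch below fits a linear model and a logistic regression to the built-in mtcars data set. The choice of predictors is arbitrary and only meant to show the formula interface, not a recommended analysis.

```r
# Linear model: fuel economy as a function of weight and horsepower.
fit <- lm(mpg ~ wt + hp, data = mtcars)
coefs <- coef(fit)           # named vector: (Intercept), wt, hp
print(coefs)

# summary() is an S3 generic; for an "lm" object it dispatches to summary.lm.
r_squared <- summary(fit)$r.squared
print(r_squared)

# Generalized linear model: logistic regression of transmission type
# (am: 0 = automatic, 1 = manual) on weight.
logit_fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Predicted probability of a manual transmission for a hypothetical car
# weighing 3,000 lb (wt is recorded in units of 1,000 lb).
prob_manual <- predict(logit_fit, newdata = data.frame(wt = 3), type = "response")
print(prob_manual)
```

The same `formula, data` calling convention carries over to most contributed modeling packages, which is a large part of why new methods feel familiar to R users.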

Classical ML packages

For machine learning specifically, R has long had wrappers around the standard algorithm families:

Package	Author / year	Method
`randomForest`	Andy Liaw & Matthew Wiener (2002), wrapping Breiman & Cutler's Fortran code	Random forest
`ranger`	Marvin N. Wright (2015)	Faster random forests, especially for high dimensions
`xgboost`	Tianqi Chen et al. (2014)	XGBoost gradient boosting, R binding
`lightgbm`	Microsoft (2017)	LightGBM, R binding
`glmnet`	Friedman, Hastie, Tibshirani (2010)	Lasso and elastic-net regularized regression
`e1071`	David Meyer et al.	LIBSVM bindings, naive Bayes, fuzzy clustering
`kernlab`	Alexandros Karatzoglou et al.	Kernel methods, SVM, kernel PCA
`nnet`	Brian Ripley	Single-hidden-layer neural networks (a recommended package shipped with R for decades)
`mboost`	Hothorn et al.	Model-based boosting
`MASS`	Venables & Ripley	LDA, QDA, ordinal logistic regression (`polr`), other classic methods

The caret package (Classification And REgression Training) was built by Max Kuhn starting in 2007 to wrap the dozens of inconsistent ML packages behind a single API: a unified train() function that handles preprocessing, resampling, hyperparameter tuning, and prediction across hundreds of models ^[12]. For about a decade, caret was how most R users did ML.

Tidymodels and mlr3

Kuhn later rebuilt the framework with the same philosophy but tidyverse-native idioms; the result is tidymodels, a meta-package and collection of components (parsnip for models, recipes for preprocessing, rsample for resampling, yardstick for metrics, tune for hyperparameter search, workflows for combining them) released to CRAN in 2020 and now the default for new code ^[13]. mlr3, by Bernd Bischl's group at LMU Munich, is the third generation of the mlr framework and a more object-oriented alternative to tidymodels ^[14]. It uses R6 classes throughout, has good support for benchmarking and pipelines, and is favored in some research labs for experiment management.

Can you do deep learning in R?

R is not where most deep learning research happens. The community of paper authors writing in R is small relative to the Python/PyTorch crowd, and the papers-with-code culture grew up around Python notebooks. But R has good bindings for the major frameworks.

The tensorflow R package exposes TensorFlow's Python API to R through reticulate. The companion keras package wraps the Keras high-level API. Both were released by RStudio starting in 2017, with JJ Allaire as lead author. Francois Chollet co-wrote Deep Learning with R with Allaire (2018; Tomasz Kalinowski joined as co-author for the 2022 second edition), an R port of his Deep Learning with Python ^[15].

The torch R package, written by Daniel Falbel and the mlverse team and first released to CRAN in October 2020, is a different beast ^[16]. It does not bind PyTorch through Python; it links directly against libtorch, the C++ library underlying PyTorch. That means torch for R has no Python dependency, runs in a single process, and gets near-native speed. The API mirrors PyTorch closely (autograd, nn.Module-equivalent, datasets and dataloaders), and the mlverse organization on GitHub maintains extensions including torchvision, torchaudio, tabnet, and luz (a high-level training loop akin to PyTorch Lightning).

R's deep-learning footprint is biggest in research areas where R was already entrenched: epidemiological modeling, ecology, psychometrics, single-cell genomics. Most production deep-learning systems still run in Python, but a researcher who needs to fit a CNN as part of a larger statistical analysis can stay inside R if they want to.

How does R interoperate with Python?

The reticulate package, released by RStudio in 2017, embeds a Python interpreter inside an R session and translates objects between the two ^[13]. You can call Python functions from R, source .py files, use NumPy arrays and pandas DataFrames as if they were R vectors and data frames, and run a Python REPL inside the R console. Reticulate is what makes the R-side Keras and TensorFlow bindings work, and it is what makes mixed-language Quarto documents practical. The other direction, calling R from Python, is handled by rpy2 (Laurent Gautier, 2008). Apache Arrow's R and Python bindings share an in-memory format, so large data frames can move between languages with little or no serialization cost. In practice many data teams write data pipelines and ML training in Python and reach for R for the final modeling and reporting steps.

R vs Python: which is better for data science?

The rivalry between R and Python has cooled. Both communities mostly accept that the two languages overlap heavily and complement each other where they do not. Neither is universally "better": R is typically stronger for classical statistics, static visualization, and bioinformatics, while Python dominates deep learning, production engineering, and general-purpose programming. A rough division of labor:

Task	R is typically stronger	Python is typically stronger
Classical statistics (mixed models, survival, GAMs, Bayesian inference)	Yes, by a wide margin	Catching up via PyMC and statsmodels but still behind
Data wrangling	Tidyverse, `data.table`, `dplyr` are excellent	pandas is excellent; Polars is excellent
Static plotting	ggplot2 is widely considered best in class	matplotlib, plotly, seaborn are all good
Interactive dashboards	Shiny	Streamlit, Dash, Gradio, Shiny for Python
Reproducible reports	R Markdown, Quarto	Jupyter, Quarto
ML pipelines	tidymodels, mlr3	scikit-learn
Deep learning	torch, keras (R bindings to the same engines)	PyTorch, TensorFlow, JAX (the actual research happens here)
LLM tooling	Some, mainly through `ellmer` and Ollama bindings	Vast: LangChain, LlamaIndex, transformers, vLLM, etc.
Bioinformatics / genomics	Bioconductor is the standard	Biopython exists but Bioconductor is bigger
Production web services	Plumber, Shiny	FastAPI, Django, Flask
General-purpose programming	Possible but awkward	Designed for it

Most serious data teams now use both. There is no Python equivalent of Bioconductor for genomics, and no R equivalent of the PyTorch / Hugging Face ecosystem for deep learning. Pick the language that matches where the rest of the work in your subfield already lives.

What is R used for?

R is the working language of a sizable chunk of academic and applied statistics. Specific strongholds: pharma and clinical trials (the FDA accepts R for regulatory submissions, and frameworks like the R Validation Hub are part of the toolchain); official statistics (Eurostat, the U.S. Bureau of Labor Statistics, Statistics Canada, Statistics NZ); genomics and biostatistics (Bioconductor and single-cell packages like Seurat are the default); econometrics and quantitative finance; ecology and geospatial analysis through vegan, sf, and terra; data journalism at outlets like the BBC, FiveThirtyEight, and the Financial Times; and much public NBA/NFL analytics work ^[5].

The textbook An Introduction to Statistical Learning by James, Witten, Hastie, and Tibshirani (2013, second edition 2021) was originally written with R examples and has become one of the most widely used ML textbooks in undergraduate statistics ^[19]. A companion Python edition appeared in 2023, which is itself a small data point about the shifting balance.

Is R open source? Versioning and licensing

R is free software released under the GNU General Public License version 2 or 3, at the user's option ^[2]. CRAN does not require GPL for contributed packages; common alternatives include MIT, Apache 2.0, and BSD.

Major version timeline:

Version	Release	Notes
R 0.16	1995	First public version
R 0.49	April 1997	First on CRAN
R 0.60	December 1997	Became a GNU project
R 1.0.0	February 29, 2000	First stable release
R 2.0.0	October 4, 2004	Lazy-loading package data
R 3.0.0	April 3, 2013	Long vectors; broke binary compatibility
R 3.4.0	April 2017	JIT compilation by default
R 3.5.0	April 2018	ALTREP framework
R 4.0.0	April 24, 2020	`stringsAsFactors = FALSE` default
R 4.1.0	May 2021	Native pipe `|>`
R 4.2.0	April 2022	`_` placeholder for the native pipe; UTF-8 as native encoding on Windows
R 4.3.0	April 2023	`_` placeholder extended to extraction calls (e.g. `_$coef`)
R 4.4.0	April 2024	Language-level enhancements
R 4.5.0	April 11, 2025	"How About a Twenty-Six"
R 4.6.0	April 24, 2026	"Because it was There"; current stable

What are R's limitations and criticisms?

R is productive for what it was designed to do, but it has rough edges that practitioners have lived with for decades.

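Memory model. Before R 4.0.0, R tracked object sharing with a coarse counter (the NAMED mechanism) rather than true reference counting, so many operations made silent copies. R 4.0.0 switched to reference counting, which avoids some of them, but the language still copies on modification readily, and large in-memory datasets are easier in data.table or Arrow than in base R.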
Speed. Vectorized R is comparable to NumPy. Loop-heavy R is slow. The standard remedy is to call C++ via the Rcpp package by Dirk Eddelbuettel and Romain Francois (2008), which has its own thriving ecosystem ^[21].
Surface inconsistency. Different packages disagree about NA handling, factor defaults, base graphics versus ggplot2, and a dozen other small things. The tidyverse cleaned up a lot of this, but old code still mixes idioms.
Production deployment. Putting an R model behind a web service is harder than in Python. Plumber and Shiny work, but the broader microservice infrastructure (containers, observability, schedulers) is more Python-shaped.
General-purpose programming. R is not a great fit for non-statistical work. Writing a parser, a game, or a queueing system in R is possible but unusual.
Concurrency. R's threading is limited; the parallel, future, and mirai packages provide multi-process backends but there is nothing like Python's asyncio built into the language.

Most R users would say none of these are dealbreakers for the work R is actually used for, and the ones that matter (Rcpp for speed, Arrow for memory, Plumber for serving) have working answers.
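The speed point is easiest to see by writing the same computation both ways. The sketch below uses only base R; the function names are hypothetical, and actual timings depend on the machine and R version, so none are quoted.

```r
# Loop version: one interpreted iteration per element.
sum_squares_loop <- function(x) {
  total <- 0
  for (value in x) {
    total <- total + value^2
  }
  total
}

# Vectorized version: x^2 and sum() each run as a single call into compiled code.
sum_squares_vec <- function(x) sum(x^2)

x <- as.numeric(1:1000)
loop_result <- sum_squares_loop(x)
vec_result  <- sum_squares_vec(x)

print(vec_result)                    # 333833500
all.equal(loop_result, vec_result)   # TRUE: same answer, different cost

# To compare cost on a larger input, wrap each call in system.time(), e.g.
# big <- as.numeric(1:1e7); system.time(sum_squares_loop(big))
```

When a loop cannot be vectorized, moving just that loop into C++ with Rcpp is the usual next step.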

References

Ihaka, R., & Gentleman, R. (1996). R: A language for data analysis and graphics. *Journal of Computational and Graphical Statistics*, 5(3), 299-314.
R Core Team (2026). *R: A Language and Environment for Statistical Computing*. R Foundation for Statistical Computing, Vienna, Austria. https://www.r-project.org/
Comprehensive R Archive Network (CRAN). https://cran.r-project.org/
R Foundation for Statistical Computing. https://www.r-project.org/foundation/
Bioconductor Project. https://www.bioconductor.org/
Chambers, J. M. (2008). *Software for Data Analysis: Programming with R*. Springer.
Wickham, H. (2014). Tidy data. *Journal of Statistical Software*, 59(10), 1-23.
Wickham, H., & Grolemund, G. (2017). *R for Data Science*. O'Reilly Media. https://r4ds.had.co.nz/
Wickham, H., Cetinkaya-Rundel, M., & Grolemund, G. (2023). *R for Data Science* (2nd ed.). O'Reilly. https://r4ds.hadley.nz/
Wickham, H. (2016). *ggplot2: Elegant Graphics for Data Analysis* (2nd ed.). Springer.
Wilkinson, L. (1999). *The Grammar of Graphics*. Springer.
Kuhn, M. (2008). Building predictive models in R using the caret package. *Journal of Statistical Software*, 28(5), 1-26.
Kuhn, M., & Silge, J. (2022). *Tidy Modeling with R*. O'Reilly. https://www.tmwr.org/
Bischl, B., Sonabend, R., Kotthoff, L., & Lang, M. (eds.) (2024). *Applied Machine Learning Using mlr3 in R*. Chapman & Hall/CRC. https://mlr3book.mlr-org.com/
Chollet, F., Kalinowski, T., & Allaire, J. J. (2022). *Deep Learning with R* (2nd ed.). Manning.
Falbel, D. (2020). torch for R. https://torch.mlverse.org/
Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. *Proceedings of KDD 2016*.
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. *Journal of Statistical Software*, 33(1), 1-22.
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). *An Introduction to Statistical Learning with Applications in R* (2nd ed.). Springer. https://www.statlearning.com/
Hyndman, R. J., & Athanasopoulos, G. (2021). *Forecasting: Principles and Practice* (3rd ed.). OTexts. https://otexts.com/fpp3/
Eddelbuettel, D., & Francois, R. (2011). Rcpp: Seamless R and C++ integration. *Journal of Statistical Software*, 40(8), 1-18.
Chang, W. (2014). R6: Encapsulated classes with reference semantics. https://r6.r-lib.org/
Posit PBC (formerly RStudio). https://posit.co/
Quarto. https://quarto.org/
Shiny. https://shiny.posit.co/
R Consortium. https://www.r-consortium.org/
Wikipedia: R (programming language). https://en.wikipedia.org/wiki/R_(programming_language)
Wikipedia: Tidyverse. https://en.wikipedia.org/wiki/Tidyverse
Wikipedia: ggplot2. https://en.wikipedia.org/wiki/Ggplot2
Wikipedia: RStudio. https://en.wikipedia.org/wiki/RStudio
Wikipedia: Bioconductor. https://en.wikipedia.org/wiki/Bioconductor
Wikipedia: S (programming language). https://en.wikipedia.org/wiki/S_(programming_language)
