Apache MXNet
Last reviewed
May 2, 2026
Sources
14 citations
Review status
Source-backed
Revision
v1 · 2,807 words
Apache MXNet (pronounced "mix-net") was an open-source deep learning framework supporting both imperative and symbolic execution, originally developed in 2015 by researchers in the DMLC (Distributed Machine Learning Community) group at CMU, NYU, MIT, and the University of Washington. The project was donated to the Apache Software Foundation in January 2017, graduated to a Top-Level Project in September 2019, and was retired to the Apache Attic on September 26, 2023, after maintainer activity declined.
From roughly 2016 through 2022, MXNet was the deep learning framework most actively promoted by Amazon Web Services, and Amazon CTO Werner Vogels announced in November 2016 that MXNet would be Amazon's "deep learning framework of choice." The framework was used internally at Amazon for product recommendations, search, and computer vision workloads, and shipped as a default option in Amazon SageMaker and the AWS Deep Learning AMIs. Despite that backing, PyTorch and TensorFlow consolidated the research and production market over the late 2010s, and contributor activity to MXNet fell sharply after 2021.
| Field | Detail |
|---|---|
| Original authors | Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, Zheng Zhang |
| Developer | DMLC group; Apache Software Foundation (from 2017) |
| Initial public release | September 2015 |
| First paper | December 3, 2015 (arXiv:1512.01274) |
| Apache Incubator entry | January 30, 2017 |
| Apache Top-Level Project | September 4, 2019 |
| Final stable release | 1.9.1 (May 26, 2022) |
| Retirement | September 26, 2023 (moved to Apache Attic) |
| License | Apache License 2.0 |
| Written in | C++, Python, CUDA |
| Repository | github.com/apache/mxnet (archived) |
| Website | mxnet.apache.org (archived) |
MXNet grew out of the work of the DMLC collective, an informal group of graduate students and researchers spread across several universities who collaborated on machine learning systems starting around 2014. The members included Tianqi Chen (then at the University of Washington), Mu Li (Carnegie Mellon University), Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang (then at NYU Shanghai).
Before MXNet, several DMLC members had been working on separate deep learning systems with overlapping aims:
In 2015 the authors agreed to merge the lessons from these projects into a single library that combined a Python-style imperative NDArray API with a symbolic graph compiler. The first commit to the public dmlc/mxnet GitHub repository was made in mid-2015, and a beta release appeared in September of that year.
The original technical paper, "MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems," was posted to arXiv on December 3, 2015 (arXiv:1512.01274). The paper described MXNet's mixed programming interface, its dependency engine that scheduled operations across CPU and GPU devices, and its parameter server design for distributed training. It listed Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang as co-authors.
The name "MXNet" stood for "mix net," a reference to the mix of imperative and declarative programming styles supported by the system.
MXNet's design centered on two complementary programming models that shared the same underlying runtime.
The NDArray API was an imperative interface similar to NumPy, with operations dispatched eagerly across CPU or GPU devices. NDArrays could be allocated on specific contexts (mx.cpu() or mx.gpu(i)), and arithmetic was executed asynchronously by an internal dependency engine.
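The asynchronous behaviour described above can be illustrated with a toy scheduler. This is only a sketch built on Python's standard thread pool, not MXNet's dependency engine: `ToyEngine` and `push` are hypothetical names, and the "arrays" are plain Python lists. Each pushed operation returns immediately with a handle, and the result is computed only after the handles it depends on have resolved.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyEngine:
    """Hypothetical stand-in for an asynchronous dependency engine."""

    def __init__(self, workers=2):
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def push(self, fn, *inputs):
        # Schedule fn; it runs once every input future has a result.
        def run():
            return fn(*(f.result() for f in inputs))

        return self._pool.submit(run)

    def shutdown(self):
        self._pool.shutdown(wait=True)


engine = ToyEngine()
a = engine.push(lambda: [1.0, 2.0, 3.0])                   # no dependencies
b = engine.push(lambda xs: [2.0 * x for x in xs], a)       # depends on a
c = engine.push(lambda xs, ys: [x + y for x, y in zip(xs, ys)], a, b)

values = c.result()  # blocks until the whole chain has finished
engine.shutdown()
```

The calls to `push` return before any arithmetic is guaranteed to have run; only `c.result()` forces the caller to wait for the computed values.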
The Symbol API, in contrast, let users construct a symbolic computation graph, bind it to data, and then optimize and execute it as a single compiled unit. Symbol was the substrate that made MXNet competitive on memory and throughput, because the executor could fuse operators, allocate buffers in advance, and exploit sublinear memory tricks during backpropagation.
A distinctive feature was that NDArray and Symbol shared the same operator registry and execution backend. A model built imperatively with NDArrays could be "hybridized" into a symbolic graph for deployment, an idea that later influenced TorchScript and torch.compile in PyTorch.
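To make the imperative-versus-symbolic distinction concrete, here is a minimal sketch of deferred evaluation in plain Python. It does not use or reproduce MXNet's Symbol API: `Sym`, `var`, and `evaluate` are hypothetical names chosen for illustration. The expression is first recorded as a graph, and values are supplied only when the graph is evaluated.

```python
class Sym:
    """Hypothetical graph node: records an operation instead of running it."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op
        self.inputs = inputs
        self.name = name

    def __add__(self, other):
        return Sym("add", (self, other))

    def __mul__(self, other):
        return Sym("mul", (self, other))


def var(name):
    # A placeholder that receives its value at evaluation time.
    return Sym("var", name=name)


def evaluate(node, bindings):
    # Walk the recorded graph and compute its value from the bound inputs.
    if node.op == "var":
        return bindings[node.name]
    left, right = (evaluate(i, bindings) for i in node.inputs)
    return left + right if node.op == "add" else left * right


x, y = var("x"), var("y")
graph = x * y + x                                # nothing is computed here
result = evaluate(graph, {"x": 3.0, "y": 4.0})   # computation happens here
```

Because the whole graph exists before any value is computed, a real executor has the opportunity to plan memory and fuse operations ahead of time, which an eager interface cannot do without first tracing the program.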
MXNet aimed for unusually broad language coverage. Official front ends existed for:
| Language | Notes |
|---|---|
| Python | Primary interface, included Gluon |
| C++ | Native API and the cpp-package higher-level wrapper |
| Scala | Used in JVM-based production systems |
| Java | Wrapper over the Scala API |
| R | Maintained for statisticians and academic users |
| Julia | MXNet.jl package |
| Perl | AI::MXNet on CPAN |
| JavaScript | Browser inference via mxnet.js |
| Clojure | org.apache.clojure-mxnet |
| Go | Community-maintained Go bindings |
The sheer number of bindings reflected the DMLC group's roots in machine learning research, but it also fragmented documentation effort, and few of the bindings reached the maturity of the Python interface.
Distributed training was built on a key-value store abstraction called KVStore, which combined a parameter server architecture with synchronous and asynchronous gradient updates. KVStore could shard parameters across multiple servers and supported local, device, dist_sync, and dist_async modes. This design drew directly on Mu Li's earlier OSDI 2014 paper on parameter servers, and it allowed MXNet to scale to clusters with hundreds of GPUs.
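The push/pull pattern behind a synchronous parameter server can be sketched in a few lines. The class below is a single-process toy, not MXNet's KVStore: `ToyKVStore` is a hypothetical name, weights are plain lists, and the update rule is assumed to be plain SGD applied to the sum of all workers' gradients.

```python
class ToyKVStore:
    """Hypothetical in-process key-value store with synchronous updates."""

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate
        self._store = {}

    def init(self, key, value):
        self._store[key] = list(value)

    def push(self, key, grads_per_worker):
        # Synchronous mode: wait for every worker, sum their gradients,
        # then apply one SGD step to the stored weights.
        summed = [sum(parts) for parts in zip(*grads_per_worker)]
        self._store[key] = [
            w - self.learning_rate * g
            for w, g in zip(self._store[key], summed)
        ]

    def pull(self, key):
        # Workers fetch the latest weights before their next step.
        return list(self._store[key])


kv = ToyKVStore(learning_rate=0.5)
kv.init("w", [1.0, 2.0])
kv.push("w", [[0.5, 1.0], [1.5, 0.0]])  # gradients from two workers
weights = kv.pull("w")
```

An asynchronous variant would apply each worker's gradient as soon as it arrives instead of waiting for all of them, trading consistency for throughput.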
MXNet was known for aggressive memory optimization. The symbolic executor performed in-place operation fusion when safe, and it implemented a sublinear memory technique for training deep networks (described by Chen, Xu, Zhang, and Guestrin in arXiv:1604.06174, April 2016) that traded extra recomputation during the backward pass for reduced peak memory. This made it possible to train networks with hundreds of layers on commodity GPUs.
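The recomputation trade-off can be shown on a toy chain of scalar layers. This sketch is not MXNet's implementation: the layer function, the fixed segment length, and the activation counting are assumptions made for illustration. The checkpointed version keeps only the input to each segment during the forward pass and recomputes the rest of each segment during the backward pass.

```python
import math


def layer(w, h):
    return math.tanh(w * h)


def layer_grad(w, h_in):
    # Derivative of layer(w, h) with respect to h, evaluated at h_in.
    out = math.tanh(w * h_in)
    return w * (1.0 - out * out)


def grad_full(weights, x):
    # Baseline: store every activation, then backpropagate.
    acts = [x]
    for w in weights:
        acts.append(layer(w, acts[-1]))
    g = 1.0
    for w, h_in in zip(reversed(weights), reversed(acts[:-1])):
        g *= layer_grad(w, h_in)
    return g, len(acts)


def grad_checkpointed(weights, x, segment):
    # Forward pass: keep only the input of each segment.
    checkpoints = {}
    h = x
    for i, w in enumerate(weights):
        if i % segment == 0:
            checkpoints[i] = h
        h = layer(w, h)
    # Backward pass: recompute one segment at a time, last segment first.
    g = 1.0
    peak = len(checkpoints)
    for start in sorted(checkpoints, reverse=True):
        seg_w = weights[start:start + segment]
        acts = [checkpoints[start]]
        for w in seg_w[:-1]:
            acts.append(layer(w, acts[-1]))
        peak = max(peak, len(checkpoints) + len(acts))
        for w, h_in in zip(reversed(seg_w), reversed(acts)):
            g *= layer_grad(w, h_in)
    return g, peak


weights = [0.5 + 0.1 * i for i in range(16)]
g_full, stored_full = grad_full(weights, 0.3)
g_ckpt, stored_ckpt = grad_checkpointed(weights, 0.3, segment=4)
```

Both functions return the same gradient, but the checkpointed version holds fewer activations at its peak. Choosing a segment length near the square root of the layer count is what gives the roughly square-root memory growth that the paper's title refers to.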
MXNet supported automatic differentiation through mx.autograd for the imperative path and through the symbolic graph compiler for the declarative path. Both shared the same set of registered backward functions for operators, so a user could write training code in NumPy-like style and still benefit from optimized gradient computation.
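The idea of per-operator backward functions can be sketched with a tiny reverse-mode differentiator. This is a plain-Python illustration, not `mx.autograd`: the `Var` class and its methods are hypothetical, and only addition and multiplication are supported. Each operation records its local derivatives, and `backward` replays them in reverse order.

```python
class Var:
    """Hypothetical scalar that records how it was computed."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (parent, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(
            self.value * other.value,
            ((self, other.value), (other, self.value)),
        )

    def backward(self):
        # Order nodes so that each is processed after everything that uses it.
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in node.parents:
                parent.grad += local * node.grad


x, y = Var(3.0), Var(4.0)
z = x * y + x
z.backward()
```

After the call, `x.grad` holds the derivative of `z` with respect to `x` (here `y + 1`) and `y.grad` holds the derivative with respect to `y` (here `x`).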
On October 12, 2017, AWS and Microsoft jointly announced Gluon, a high-level imperative API for MXNet that resembled PyTorch in feel. Gluon let users build neural networks using object-oriented Python with eager execution by default, then call hybridize() to compile the model into a static graph for deployment. The API became the recommended entry point for new MXNet users from late 2017 onward.
A family of toolkits built on Gluon followed:
| Toolkit | Domain | First release |
|---|---|---|
| GluonCV | Computer vision (classification, detection, segmentation, pose) | 2018 |
| GluonNLP | Natural language processing (BERT, transformers, embeddings) | 2018 |
| Gluon-TS | Probabilistic time-series forecasting | 2019 |
| GluonFR | Face recognition | 2019 |
GluonCV shipped pretrained ResNet, MobileNet, YOLOv3, Mask R-CNN, and SimplePose models that were widely used in research benchmarks. Gluon-TS, developed at Amazon, included implementations of DeepAR, DeepState, and Transformer forecasters, and it later survived MXNet's retirement by adding a PyTorch backend.
On January 30, 2017, MXNet entered the Apache Incubator. The proposal listed AWS as the primary sponsor, with Carnegie Mellon University, the University of Washington, NYU Shanghai, Intel, and other groups as supporting champions. The codebase moved from dmlc/mxnet to apache/incubator-mxnet, governance shifted to an Apache-style Project Management Committee, and contributions began to flow under an Apache 2.0 Individual Contributor License Agreement.
During incubation the project added a formal release process, voting procedures for binary artifacts, and a podling community that included contributors from Amazon, Microsoft, Intel, NVIDIA, Samsung, and a long tail of academic institutions.
On September 4, 2019, the Apache Software Foundation announced that MXNet had graduated from the Incubator to a Top-Level Project. The repository moved to apache/mxnet, and the project dropped the "incubating" qualifier. The graduation announcement, posted on the Apache blog, listed eight major releases produced during the incubation period and credited contributions from more than 800 developers.
Amazon was the most visible MXNet user. In November 2016, Werner Vogels published a blog post titled "MXNet: Deep Learning Framework of Choice at AWS," announcing internal adoption and laying out a multi-year investment plan. AWS shipped MXNet as a default option in Amazon SageMaker and the AWS Deep Learning AMIs.
Microsoft co-announced Gluon in October 2017 and contributed integration work to bring MXNet into Azure Machine Learning. NVIDIA maintained CUDA and cuDNN integration and shipped MXNet binaries inside the NGC container catalog. Intel contributed the MKL-DNN (later renamed oneDNN) backend, which accelerated MXNet on Xeon and Xeon Phi CPUs.
Other organizations that adopted or integrated MXNet included Wolfram Research (the Mathematica neural network framework used MXNet as a backend), Cisco, Apple, Indeed, and the Vector Institute. The TuSimple autonomous trucking team published several models trained in MXNet during 2017 and 2018.
MXNet ran on a wider range of hardware than most contemporary frameworks. The maintained backends included:
| Hardware | Library | Notes |
|---|---|---|
| NVIDIA GPUs | CUDA, cuDNN, NCCL | Primary GPU target |
| AMD GPUs | ROCm, MIOpen | Community port, partial coverage |
| Intel CPUs | MKL-DNN / oneDNN | Default for x86 inference |
| ARM CPUs | NNPACK, ARM Compute Library | Mobile and edge targets |
| Apple Silicon | Accelerate framework (limited) | Pre-M1 support |
| Custom accelerators | TVM, Treelite | Through compilation rather than direct kernels |
The ARM and edge support, combined with the small size of MXNet's runtime, made it an attractive option for on-device inference between roughly 2017 and 2020. AWS used MXNet inside TVM compilation pipelines for inference at the edge through DeepLens and SageMaker Neo.
A number of well-known model implementations were released first or primarily in MXNet:
From 2018 onward, PyTorch absorbed most academic users, and TensorFlow 2.x with Keras absorbed most production users who wanted Google Cloud integration. MXNet's user base, anchored at AWS, did not grow at the same rate.
Several factors contributed to the decline:
The last stable release, version 1.9.1, was tagged on May 26, 2022. After that, commit activity in apache/mxnet slowed to a trickle, and several attempts to start a 2.x release line stalled in the release-vote stage.
On September 26, 2023, the Apache Software Foundation announced that MXNet had been retired and moved to the Apache Attic, the foundation's repository for inactive projects. The board resolution cited "insufficient maintainer activity" and the inability to assemble the quorum needed to cut new releases. The codebase remains read-only at github.com/apache/mxnet, with the mxnet.apache.org documentation now hosted under the Apache Attic site.
| Feature | Apache MXNet | PyTorch | TensorFlow |
|---|---|---|---|
| Initial release | September 2015 | September 2016 | November 2015 |
| Primary sponsor | Amazon Web Services | Meta AI (formerly FAIR) | Google |
| Default execution | Imperative (Gluon) and symbolic (Symbol) | Eager, with torch.compile since 2.0 | Eager (2.x), graph (1.x) |
| Multi-language bindings | Python, C++, R, Scala, Julia, Perl, Java, Clojure, JavaScript, Go | Python, C++ (limited), Java (deprecated), Mobile | Python, C++, JavaScript (TF.js), Java, Go, Swift (retired) |
| Distributed training | KVStore parameter server | DDP, FSDP, RPC | tf.distribute, MultiWorkerMirrored |
| Mobile deployment | MXNet model server, ONNX export | PyTorch Mobile, ExecuTorch | TensorFlow Lite |
| Status (2026) | Retired (Apache Attic, Sept 2023) | Active (Linux Foundation since 2022) | Active (Google) |
| License | Apache 2.0 | Modified BSD | Apache 2.0 |
Although the framework itself was retired, several MXNet-related projects continued under different stacks:
Several DMLC alumni went on to found or join companies that shaped subsequent waves of AI infrastructure. Tianqi Chen co-founded OctoML (later acquired by NVIDIA in 2024) and joined the faculty at Carnegie Mellon. Mu Li worked at Amazon on AutoGluon and later moved to Boson AI. Bing Xu and others moved to OpenAI, Anthropic, and other model labs.
The MXNet codebase is preserved in archived form at github.com/apache/mxnet, and the project's documentation remains accessible through the Apache Attic. Researchers studying the early symbolic-versus-eager debate in deep learning frameworks still cite the original 2015 paper as a clear early articulation of the hybrid approach that PyTorch and TensorFlow eventually converged on.