Detectron2
Last reviewed
Jun 3, 2026
Sources
12 citations
Review status
Source-backed
Revision
v1 · 1,385 words
Detectron2 is an open-source software library for object detection and image segmentation, built on PyTorch and developed by Facebook AI Research (FAIR), the research group now part of Meta AI. It was released in 2019 as a ground-up rewrite of the original Detectron codebase, which had been written in Caffe2.[1][2] Detectron2 provides reference implementations of a wide range of detection and segmentation algorithms, ships a large collection of pretrained models (its "model zoo"), and is designed to be modular so that researchers and engineers can extend it for new work. It is released under the Apache 2.0 license.[3]
Detectron2 is a framework rather than a single model. It implements algorithms such as Mask R-CNN as components that can be configured, trained, and combined, but the library itself is the toolkit around those algorithms: data loading, training loops, evaluation, export, and a registry system for swapping parts in and out.
FAIR maintained open-source object detection software before Detectron2. The original Detectron was open-sourced in January 2018 as a Python library powered by the Caffe2 deep learning framework, released under the Apache 2.0 license.[4][5] Detectron grew out of earlier internal detection code and implemented algorithms including Mask R-CNN, RetinaNet, Faster R-CNN, Region Proposal Network (RPN), Fast R-CNN, and R-FCN, with backbone networks such as ResNet, ResNeXt, Feature Pyramid Network (FPN), and VGG-16.[6] FAIR also released a separate PyTorch project, maskrcnn-benchmark, which provided fast Mask R-CNN and Faster R-CNN training in PyTorch.
Detectron2 superseded both. The project's documentation states that it "is the successor of Detectron and maskrcnn-benchmark," and Meta describes it as "a ground-up rewrite of the codebase entirely in PyTorch to make it faster, more modular, more flexible, and easier to use in both research-first and production-oriented projects."[1][2] The move from Caffe2 to PyTorch gave the library PyTorch's eager-execution programming model, which makes models easier to inspect and modify. The original Detectron repository is now deprecated; its README directs users to Detectron2, and the repository was archived in November 2023.[6]
Detectron2 was announced by Meta AI in October 2019 in a post titled "Detectron2: A PyTorch-based modular object detection library."[1] The core authors listed in the project's citation are Yuxin Wu, Alexander Kirillov, Francisco Massa, Wan-Yen Lo, and Ross Girshick.[3] The broader Detectron effort, led by researchers including Ross Girshick, Kaiming He, Piotr Dollár, and Georgia Gkioxari, was later recognized with the PAMI Mark Everingham Prize for contributions to the computer vision community.[2]
Detectron2 is organized so that the pieces of a detection pipeline are decoupled and individually replaceable. Configurations are expressed in YAML files (and, in later versions, in Python config objects), so a new experiment can often be described by editing a config rather than rewriting code. Internally the library uses a registry pattern: backbones, region proposal generators, region-of-interest heads, and other components are registered by name and can be substituted, which lets users add a new module without forking the whole system.[1]
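The registry idea described above can be illustrated with a minimal, library-independent sketch. This is not Detectron2's own implementation: the `Registry` class, the `BACKBONES` registry, the two toy backbone classes, and the dictionary-based config are all hypothetical stand-ins chosen for this example. It only shows how a name stored in a config can select which registered component gets built.

```python
from typing import Callable, Dict


class Registry:
    """Minimal name -> factory mapping (illustrative, not Detectron2's class)."""

    def __init__(self, name: str) -> None:
        self._name = name
        self._factories: Dict[str, Callable] = {}

    def register(self, factory: Callable) -> Callable:
        """Decorator: store the object under its own __name__."""
        key = factory.__name__
        if key in self._factories:
            raise KeyError(f"{key!r} is already registered in {self._name}")
        self._factories[key] = factory
        return factory

    def get(self, key: str) -> Callable:
        """Return the factory registered under `key`."""
        if key not in self._factories:
            raise KeyError(f"No {key!r} in registry {self._name}")
        return self._factories[key]


# Hypothetical registry and components for this sketch only.
BACKBONES = Registry("BACKBONES")


@BACKBONES.register
class TinyBackbone:
    def __init__(self, out_channels: int) -> None:
        self.out_channels = out_channels


@BACKBONES.register
class WideBackbone:
    def __init__(self, out_channels: int) -> None:
        # Toy difference so the two components are distinguishable.
        self.out_channels = out_channels * 2


def build_backbone(cfg: dict):
    """Look up the class named in the config and construct it."""
    cls = BACKBONES.get(cfg["backbone"]["name"])
    return cls(cfg["backbone"]["out_channels"])


# Swapping the component is a config edit, not a code change.
cfg = {"backbone": {"name": "TinyBackbone", "out_channels": 64}}
model_a = build_backbone(cfg)

cfg["backbone"]["name"] = "WideBackbone"
model_b = build_backbone(cfg)
```

In this sketch a new backbone is added by defining a class and decorating it; the build function and the rest of the pipeline stay unchanged, which is the property the registry design is meant to provide.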
The framework targets both research and deployment. It supports training on single machines and on multiple GPUs across multiple nodes, and trained models can be exported to TorchScript or to Caffe2 format for serving.[3] This combination, a flexible research interface plus an export path to production, is one of the reasons it has been adopted beyond academic projects.
Detectron2 covers several distinct computer vision tasks and provides more than one algorithm for several of them. The implementations rely heavily on convolutional neural network backbones such as ResNet and ResNeXt, often combined with a Feature Pyramid Network.
| Task | Representative algorithms in Detectron2 |
|---|---|
| Object detection (bounding boxes) | Faster R-CNN, RetinaNet, Fast R-CNN, RPN, Cascade R-CNN |
| Instance segmentation | Mask R-CNN, PointRend, TensorMask |
| Semantic segmentation | DeepLab |
| Panoptic segmentation | Panoptic FPN, Panoptic-DeepLab |
| Keypoint / pose detection | Keypoint R-CNN |
| Dense human pose estimation | DensePose |
| Transformer-based detection | ViTDet, MViTv2 |
| Specialized detection | Rotated bounding boxes |
Meta's tool page summarizes the set as "implementations for the following object detection algorithms: Mask R-CNN, RetinaNet, Faster R-CNN, RPN, Fast R-CNN, TensorMask, PointRend, DensePose, and more."[7] Capabilities such as panoptic segmentation, DensePose, Cascade R-CNN, rotated bounding boxes, PointRend, DeepLab, ViTDet, and MViTv2 were not present in the original Detectron; they were either introduced with Detectron2 or added to it later.[3]
Several of these are maintained as separate projects inside the repository's projects/ directory rather than in the core package. As of recent versions, those projects include DensePose, PointRend, TensorMask, Panoptic-DeepLab, DeepLab, TridentNet, PointSup, Rethinking-BatchNorm, ViTDet, and MViTv2. The repository notes that these "are research projects, and therefore may not have the same level of support or stability as detectron2" itself.
Detectron2 ships a model zoo: a set of pretrained weights with matching configuration files and reported metrics, accessible from code through the detectron2.model_zoo API.[3] The baselines are organized by task and dataset. On the COCO dataset, the zoo includes object detection baselines (Faster R-CNN, RetinaNet, RPN, Fast R-CNN), instance segmentation baselines with Mask R-CNN, person keypoint detection baselines with Keypoint R-CNN, and panoptic segmentation baselines with Panoptic FPN. It also provides instance segmentation baselines on LVIS (LVIS v0.5) and additional baselines on Cityscapes and PASCAL VOC.[8]
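As a rough illustration of the model zoo API mentioned above, the following sketch loads one COCO Mask R-CNN baseline and wraps it in a predictor. It assumes Detectron2 and a compatible PyTorch build are installed; the helper name `build_predictor`, the chosen zoo entry, and the 0.5 score threshold are illustrative choices for this example rather than recommendations from the project. The Detectron2 imports sit inside the function so that the module can be imported even where the library is absent.

```python
# One model zoo entry, given relative to the library's config directory.
# Chosen only as an example; other entries follow the same pattern.
ZOO_ENTRY = "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"


def build_predictor(zoo_entry: str = ZOO_ENTRY, score_thresh: float = 0.5):
    """Return a DefaultPredictor for a pretrained model zoo entry.

    Assumes detectron2 is installed; the pretrained weights are fetched
    from the URL that the model zoo reports for the entry.
    """
    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.engine import DefaultPredictor

    cfg = get_cfg()
    # Load the configuration file that matches the pretrained weights.
    cfg.merge_from_file(model_zoo.get_config_file(zoo_entry))
    # Point the config at the pretrained checkpoint for the same entry.
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(zoo_entry)
    # Discard detections scoring below this threshold at inference time.
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = score_thresh
    return DefaultPredictor(cfg)


if __name__ == "__main__":
    import numpy as np

    predictor = build_predictor()
    # Placeholder input: a black H x W x 3 uint8 image in BGR channel order.
    image = np.zeros((480, 640, 3), dtype=np.uint8)
    outputs = predictor(image)
    # For this entry, predictions are returned under the "instances" key.
    print(outputs["instances"].pred_classes)
```

Because the config file and the checkpoint URL are both derived from the same zoo entry string, switching to another baseline in this sketch means changing only `ZOO_ENTRY`.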
The pretrained models span a range of backbones, including ResNet-50 and ResNet-101 (in C4, DC5, and FPN configurations), ResNeXt variants such as X-101-32x8d, and RegNet backbones, along with ablation settings that use deformable convolutions, group normalization, synchronized batch normalization, and longer training schedules.[8] Because each entry reports its accuracy and training cost, the zoo doubles as a benchmark reference. In 2021 Meta published an updated set of Mask R-CNN baselines with stronger training recipes.[9]
Meta has used Detectron2 in its own products. The company has described using it to "rapidly design and train the next-generation pose detection models that power Smart Camera, the AI camera system in Facebook's Portal video-calling devices."[2][10] The library is also widely used outside Meta: its GitHub repository reports tens of thousands of stars and is referenced as a dependency by thousands of downstream projects.[3]
A number of research systems are built directly on Detectron2. Detic, from FAIR, implements the ECCV 2022 paper "Detecting Twenty-thousand Classes using Image-level Supervision" and uses Detectron2's demo interface and components; it trains detectors with image-level labels and uses CLIP class embeddings to recognize a very large vocabulary of object classes.[11] DensePose, which estimates dense correspondences between image pixels and a 3D human body surface, is distributed as a Detectron2 project.[3] In 2021 the Mobile Vision team at Meta released D2Go, an extension for training and deploying efficient detection models on mobile and edge hardware.[2][12] Beyond Meta, community libraries such as AdelaiDet, CenterMask, and detrex are built on top of Detectron2.[3]
Detectron2 has been developed continuously since its 2019 release, with periodic tagged releases.
| Milestone | Date | Notes |
|---|---|---|
| Detectron (original) open-sourced | January 2018 | Caffe2, Apache 2.0; now deprecated[4][6] |
| Detectron2 announced and released | October 2019 | Ground-up PyTorch rewrite[1] |
| D2Go (mobile extension) released | 2021 | Efficient on-device models[2][12] |
| v0.6 tagged release | November 2021 | Latest numbered release tag[3] |
The repository continues to receive updates between numbered releases, and installation is most commonly done from source against a matching PyTorch and CUDA version.