Detectron2

Updated Jul 16, 2026

Detectron2 is an open-source software library for object detection and image segmentation, built on PyTorch and developed by Facebook AI Research (FAIR), the research group now part of Meta AI. It was released in 2019 as a ground-up rewrite of the original Detectron codebase, which had been written in Caffe2.^[1]^[2] Detectron2 provides reference implementations of a wide range of detection and segmentation algorithms, ships a large collection of pretrained models (its "model zoo"), and is designed to be modular so that researchers and engineers can extend it for new work. It is released under the Apache 2.0 license.^[3]

Detectron2 is a framework rather than a single model. It implements algorithms such as Mask R-CNN as components that can be configured, trained, and combined, but the library itself is the toolkit around those algorithms: data loading, training loops, evaluation, export, and a registry system for swapping parts in and out.

Background and predecessors

Detectron2 is not FAIR's first object detection codebase. The original Detectron was open-sourced in January 2018 as a Python library powered by the Caffe2 deep learning framework, released under the Apache 2.0 license.^[4]^[5] Detectron grew out of earlier internal detection code and implemented algorithms including Mask R-CNN, RetinaNet, Faster R-CNN, Region Proposal Network (RPN), Fast R-CNN, and R-FCN, with backbone networks such as ResNet, ResNeXt, Feature Pyramid Networks, and VGG-16.^[6] FAIR also released a separate PyTorch project, maskrcnn-benchmark, which provided fast Mask R-CNN and Faster R-CNN training in PyTorch.

Detectron2 superseded both. The project's documentation states that it "is the successor of Detectron and maskrcnn-benchmark," and Meta describes it as "a ground-up rewrite of the codebase entirely in PyTorch to make it faster, more modular, more flexible, and easier to use in both research-first and production-oriented projects."^[1]^[2] The move from Caffe2 to PyTorch gave the library PyTorch's eager-execution programming model, which makes models easier to inspect and modify. The original Detectron repository is now deprecated; its README directs users to Detectron2, and the repository was archived in November 2023.^[6]

Detectron2 was announced in October 2019 by Facebook AI (now Meta AI) in a post titled "Detectron2: A PyTorch-based modular object detection library."^[1] The core authors listed in the project's citation are Yuxin Wu, Alexander Kirillov, Francisco Massa, Wan-Yen Lo, and Ross Girshick.^[3] The broader Detectron effort, led by people including Ross Girshick, Kaiming He, Piotr Dollár, and Georgia Gkioxari, was later recognized with the PAMI Mark Everingham Prize for contributions to the computer vision community.^[2]

Design

Detectron2 is organized so that the pieces of a detection pipeline are decoupled and individually replaceable. Configurations are expressed in YAML files (and, in later versions, in Python config objects), so a new experiment can often be described by editing a config rather than rewriting code. Internally the library uses a registry pattern: backbones, region proposal generators, region-of-interest heads, and other components are registered by name and can be substituted, which lets users add a new module without forking the whole system.^[1]
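The registry-plus-config idea can be illustrated with a short, standard-library-only sketch. This is not Detectron2's own implementation: the Registry class, the BACKBONES registry, the toy builders, and the dictionary config layout are all hypothetical names chosen for illustration. The point is only that a config names a component and a registry resolves that name to code.

```python
from typing import Callable, Dict


class Registry:
    """Maps string names to builder callables (illustrative, not Detectron2's class)."""

    def __init__(self, name: str) -> None:
        self._name = name
        self._builders: Dict[str, Callable] = {}

    def register(self, builder: Callable) -> Callable:
        # Used as a decorator; the builder's own name becomes its lookup key.
        key = builder.__name__
        if key in self._builders:
            raise KeyError(f"{key!r} is already registered in {self._name}")
        self._builders[key] = builder
        return builder

    def get(self, key: str) -> Callable:
        if key not in self._builders:
            raise KeyError(f"no {key!r} in registry {self._name}")
        return self._builders[key]


BACKBONES = Registry("BACKBONES")  # hypothetical registry name


@BACKBONES.register
def toy_resnet(depth: int) -> str:
    # Stand-in for a real network constructor.
    return f"toy_resnet(depth={depth})"


@BACKBONES.register
def toy_vit(depth: int) -> str:
    # A second component, added without touching build_backbone below.
    return f"toy_vit(depth={depth})"


def build_backbone(cfg: dict) -> str:
    # The config names the component; the registry resolves the name to code.
    builder = BACKBONES.get(cfg["backbone"]["name"])
    return builder(cfg["backbone"]["depth"])


base_cfg = {"backbone": {"name": "toy_resnet", "depth": 50}}
# A new experiment only overrides config values; no model code changes.
experiment_cfg = {"backbone": {**base_cfg["backbone"], "name": "toy_vit", "depth": 12}}

print(build_backbone(base_cfg))
print(build_backbone(experiment_cfg))
```

Swapping the backbone here means changing one string in the config, which is the property the registry design is meant to provide.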

The framework targets both research and deployment. It supports training on single machines and on multiple GPUs across multiple nodes, and trained models can be exported to TorchScript or to Caffe2 format for serving.^[3] This combination, a flexible research interface plus an export path to production, is one of the reasons it has been adopted beyond academic projects.

Supported tasks and algorithms

Detectron2 covers several distinct computer vision tasks and often provides more than one algorithm per task. The implementations rely heavily on convolutional neural network backbones such as ResNet and ResNeXt, often combined with a Feature Pyramid Network.

Task	Representative algorithms in Detectron2
Object detection (bounding boxes)	Faster R-CNN, RetinaNet, Fast R-CNN, RPN, Cascade R-CNN
Instance segmentation	Mask R-CNN, PointRend, TensorMask
Semantic segmentation	DeepLab
Panoptic segmentation	Panoptic FPN, Panoptic-DeepLab
Keypoint / pose detection	Keypoint R-CNN
Dense human pose estimation	DensePose
Transformer-based detection	ViTDet, MViTv2
Specialized detection	Rotated bounding boxes

Meta's tool page summarizes the set as "implementations for the following object detection algorithms: Mask R-CNN, RetinaNet, Faster R-CNN, RPN, Fast R-CNN, TensorMask, PointRend, DensePose, and more."^[7] Capabilities such as panoptic segmentation, DensePose, Cascade R-CNN, rotated bounding boxes, PointRend, DeepLab, ViTDet, and MViTv2 were either new in Detectron2 or added over time and were not present in the original Detectron.^[3]

Several of these are maintained as separate projects inside the repository's projects/ directory rather than in the core package. As of recent versions, those projects include DensePose, PointRend, TensorMask, Panoptic-DeepLab, DeepLab, TridentNet, PointSup, Rethinking-BatchNorm, ViTDet, and MViTv2. The repository notes that these "are research projects, and therefore may not have the same level of support or stability as detectron2" itself.

Model zoo

Detectron2 ships a model zoo: a set of pretrained weights with matching configuration files and reported metrics, accessible from code through the detectron2.model_zoo API.^[3] The baselines are organized by task and dataset. On the COCO dataset, the zoo includes object detection baselines (Faster R-CNN, RetinaNet, RPN, Fast R-CNN), instance segmentation baselines with Mask R-CNN, person keypoint detection baselines with Keypoint R-CNN, and panoptic segmentation baselines with Panoptic FPN. It also provides instance segmentation baselines on LVIS (LVIS v0.5) and additional baselines on Cityscapes and PASCAL VOC.^[8]
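A typical way to load a zoo baseline through the detectron2.model_zoo API might look like the sketch below. It assumes Detectron2 and PyTorch are installed and that the chosen config path (a COCO Mask R-CNN baseline) is present in the installed version. The function name build_pretrained_predictor, its parameters, and their default values are illustrative choices, not part of the library.

```python
# Model zoo config path, relative to Detectron2's configs/ directory.
# Assumed to exist in the installed Detectron2 version.
CONFIG_PATH = "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"


def build_pretrained_predictor(
    config_path: str = CONFIG_PATH,
    score_threshold: float = 0.5,
    device: str = "cpu",
):
    """Return a DefaultPredictor loaded with model zoo weights.

    Imports are deferred so that defining this function does not
    require Detectron2; calling it does, and it downloads the weights.
    """
    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.engine import DefaultPredictor

    cfg = get_cfg()
    # Load the architecture and hyperparameters for this baseline.
    cfg.merge_from_file(model_zoo.get_config_file(config_path))
    # Point the config at the matching pretrained checkpoint.
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_path)
    # Discard detections scored below this threshold at inference time.
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = score_threshold
    cfg.MODEL.DEVICE = device
    return DefaultPredictor(cfg)
```

With the default configuration, calling the returned predictor on a BGR image array (for example one read with OpenCV) returns a dictionary whose "instances" entry holds the predicted boxes, classes, scores, and masks.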

The pretrained models span a range of backbones, including ResNet-50 and ResNet-101 (in C4, DC5, and FPN configurations), ResNeXt variants such as X-101-32x8d, and RegNet backbones, along with ablation settings that use deformable convolutions, group normalization, synchronized batch normalization, and longer training schedules.^[8] Because each entry reports its accuracy and training cost, the zoo doubles as a benchmark reference. In 2021 Meta published an updated set of Mask R-CNN baselines with stronger training recipes.^[9]

Adoption and derivatives

Meta has used Detectron2 in its own products. The company has described using it to "rapidly design and train the next-generation pose detection models that power Smart Camera, the AI camera system in Facebook's Portal video-calling devices."^[2]^[10] The library is also widely used outside Meta: its GitHub repository reports tens of thousands of stars and is referenced as a dependency by thousands of downstream projects.^[3]

A number of research systems are built directly on Detectron2. Detic, from FAIR, implements the ECCV 2022 paper "Detecting Twenty-thousand Classes using Image-level Supervision" and uses Detectron2's demo interface and components; it trains detectors with image-level labels and uses CLIP class embeddings to recognize a very large vocabulary of object classes.^[11] DensePose, which estimates dense correspondences between image pixels and a 3D human body surface, is distributed as a Detectron2 project.^[3] In 2021 the Mobile Vision team at Meta released D2Go, an extension for training and deploying efficient detection models on mobile and edge hardware.^[2]^[12] Beyond Meta, community libraries such as AdelaiDet, CenterMask, and detrex are built on top of Detectron2.^[3]

Version history

Detectron2 has been developed continuously since its 2019 release, with periodic tagged releases.

Milestone	Date	Notes
Detectron (original) open-sourced	January 2018	Caffe2, Apache 2.0; now deprecated^[4]^[6]
Detectron2 announced and released	October 2019	Ground-up PyTorch rewrite^[1]
D2Go (mobile extension) released	2021	Efficient on-device models^[2]^[12]
v0.6 tagged release	November 2021	Latest numbered release tag^[3]

The repository continues to receive updates between numbered releases, and installation is most commonly done from source against a matching PyTorch and CUDA version.
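Because a source build has to match the local PyTorch and CUDA versions, it can help to print what is actually installed before and after building. The following is a minimal sketch that only reports versions and does not verify compatibility. The helper names installed_version and environment_report are hypothetical.

```python
import importlib
from typing import Dict, Optional


def installed_version(module_name: str) -> Optional[str]:
    """Return module.__version__, or None if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def environment_report() -> Dict[str, Optional[str]]:
    """Collect the versions that a Detectron2 source build depends on."""
    report: Dict[str, Optional[str]] = {
        "torch": installed_version("torch"),
        "torchvision": installed_version("torchvision"),
        "detectron2": installed_version("detectron2"),
        "cuda_built_with": None,
    }
    if report["torch"] is not None:
        import torch

        # torch.version.cuda is None for CPU-only builds of PyTorch.
        report["cuda_built_with"] = torch.version.cuda
    return report


for name, value in environment_report().items():
    print(f"{name}: {value or 'not available'}")
```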

References

[1] Meta AI, "Detectron2: A PyTorch-based modular object detection library." https://ai.meta.com/blog/-detectron2-a-pytorch-based-modular-object-detection-library-/
[2] Meta AI, "Detectron Q&A: The origins, evolution, and future of our pioneering computer vision library." https://ai.meta.com/blog/detectron-everingham-prize/
[3] facebookresearch/detectron2, GitHub repository (README, license, model zoo API, citation, projects). https://github.com/facebookresearch/detectron2
[4] Facebook Research, "Facebook open sources Detectron" (January 2018). https://research.fb.com/blog/2018/01/facebook-open-sources-detectron/
[5] InfoQ, "Facebook Releases Open Source 'Detectron' Deep-Learning Library for Object Detection" (2018). https://www.infoq.com/news/2018/03/facebook-detectron-library/
[6] facebookresearch/Detectron, GitHub repository (deprecation notice, Caffe2, Apache 2.0, algorithm list). https://github.com/facebookresearch/Detectron
[7] Meta AI, "Detectron2" tool page. https://ai.meta.com/tools/detectron2/
[8] facebookresearch/detectron2, "Detectron2 Model Zoo and Baselines" (MODEL_ZOO.md). https://github.com/facebookresearch/detectron2/blob/main/MODEL_ZOO.md
[9] Meta AI, "Advancing computer vision research with new Detectron2 Mask R-CNN baselines." https://ai.meta.com/blog/advancing-computer-vision-research-with-new-detectron2-mask-r-cnn-baselines/
[10] InfoQ, "Facebook AI Releases New Computer Vision Library Detectron2" (October 2019). https://www.infoq.com/news/2019/10/facebook-ai-detectron2/
[11] facebookresearch/Detic, "Detecting Twenty-thousand Classes using Image-level Supervision" (ECCV 2022). https://github.com/facebookresearch/Detic
[12] Meta AI, "D2Go brings Detectron2 to mobile." https://ai.meta.com/blog/d2go-brings-detectron2-to-mobile/
