Kaiming He
Last reviewed
May 31, 2026
Sources
13 citations
Review status
Source-backed
Revision
v1 · 1,725 words
Kaiming He is a Chinese computer scientist known for foundational work in computer vision and deep learning. He is most closely associated with deep residual networks, or ResNet, an architecture that made it practical to train neural networks with hundreds of layers. The ResNet paper, published in 2016, became one of the most cited scientific publications of the twenty-first century [1][2]. He is an associate professor in the Department of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology, and he also holds a part-time appointment as a distinguished scientist at Google DeepMind [3][4].
Beyond ResNet, He co-authored several systems that became standard tools in vision research, including the Faster R-CNN and Mask R-CNN detectors, the RetinaNet detector with its focal loss, and the self-supervised methods MoCo and MAE. His earliest widely recognized result, a single-image dehazing method based on the dark channel prior, won a best paper award at the 2009 Conference on Computer Vision and Pattern Recognition (CVPR) [5]. In all, he has received best paper awards at CVPR in 2009 and 2016 and at the International Conference on Computer Vision (ICCV) in 2017, along with two test of time awards in 2025 [3][6].
| Field | Detail |
|---|---|
| Born | China |
| Nationality | Chinese |
| Fields | Computer vision, deep learning |
| B.S. | Tsinghua University, 2007 [3][7] |
| Ph.D. | Chinese University of Hong Kong, 2011 [3][7] |
| Doctoral advisor | Xiaoou Tang [7] |
| Microsoft Research Asia | Researcher, 2011 to 2016 [3] |
| Facebook AI Research | Research scientist, 2016 to 2024 [3] |
| MIT | Associate professor, 2024 to present [3][8] |
| Google DeepMind | Distinguished scientist, part-time, 2025 to present [4] |
| Known for | ResNet, Mask R-CNN, MoCo, MAE, dark channel prior |
| Notable awards | CVPR Best Paper (2009, 2016), ICCV Marr Prize (2017), Future Science Prize (2023) [5][6][7] |
He completed his undergraduate studies at Tsinghua University in Beijing, receiving a Bachelor of Science degree in 2007 [3][7]. He then moved to Hong Kong for doctoral study at the Chinese University of Hong Kong, where he worked in the Multimedia Laboratory under the supervision of Xiaoou Tang. He received his Ph.D. in information engineering in 2011 [7]. His doctoral research centered on image restoration, and his dissertation addressed single-image haze removal using the dark channel prior, work that had already won a best paper award at CVPR in 2009 [5][7].
He is best known as a lead author of "Deep Residual Learning for Image Recognition," first posted as a preprint in December 2015 and presented at CVPR in 2016 [1]. The paper was written with Xiangyu Zhang, Shaoqing Ren, and Jian Sun while He was a researcher at Microsoft Research Asia [1][7].
Before this work, adding more layers to a deep convolutional neural network often made training accuracy worse rather than better, a problem the authors called degradation. ResNet addressed this by reformulating each block of layers to learn a residual function relative to its input. In practice this means a residual connection, also called a skip connection, carries the input forward and is added to the block output, so the layers only need to model the difference between the two. This made very deep networks easier to optimize, and the authors trained models with up to 152 layers [1]. An ensemble of these networks reached a 3.57 percent top-five error on the ImageNet test set and won first place in the ImageNet classification task of the 2015 Large Scale Visual Recognition Challenge, along with several detection and segmentation tracks [1].
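The residual formulation can be illustrated with a minimal Python sketch. It is not the paper's architecture: small fully connected layers stand in for the convolutions, batch normalization is omitted, and the names `linear`, `relu`, and `residual_block` are hypothetical helpers chosen for this example.

```python
def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]

def linear(weights, v):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]

def residual_block(x, w1, w2):
    """Return relu(F(x) + x), where F is two linear layers with a ReLU between."""
    fx = linear(w2, relu(linear(w1, x)))           # residual branch F(x)
    return relu([f + xi for f, xi in zip(fx, x)])  # skip connection adds the input back

# With all-zero weights the residual branch F(x) is zero, so the block
# passes a non-negative input through unchanged (an identity mapping).
zeros = [[0.0, 0.0], [0.0, 0.0]]
x = [1.0, 2.0]
y = residual_block(x, zeros, zeros)
```

The zero-weight case shows why the design helps optimization under these assumptions: a block that has nothing useful to add can fall back to the identity by driving its weights toward zero, instead of having to learn an identity mapping through its layers.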
The residual connection became a standard building block across deep learning. It appears in later vision architectures and, in modified form, inside the transformer networks that underpin large language models. Analyses by the journal Nature in 2025 identified the ResNet paper as the most cited research article of the twenty-first century, with citation counts that varied by database but reached into the hundreds of thousands [2]. He and his three co-authors received the 2023 Future Science Prize in mathematics and computer science for introducing deep residual learning [7].
He contributed to a series of detectors that shaped applied vision. He was a co-author of Faster R-CNN, presented at the Neural Information Processing Systems conference in 2015 with Shaoqing Ren, Ross Girshick, and Jian Sun [9]. That work introduced a region proposal network, a small network that shares convolutional features with the detector and proposes candidate object regions at low cost, which brought object detection close to real-time speeds [9].
In 2017 He led Mask R-CNN, written with Georgia Gkioxari, Piotr Dollár, and Ross Girshick at Facebook AI Research [10]. Mask R-CNN extends Faster R-CNN by adding a branch that predicts a segmentation mask for each detected object in parallel with the existing box and class branches, which lets a single model perform instance segmentation. The method won the Marr Prize, the best paper award, at the International Conference on Computer Vision in 2017 [6][10]. The same year He co-authored "Focal Loss for Dense Object Detection," which introduced the focal loss and the RetinaNet detector to address the heavy imbalance between foreground and background examples in dense detection [11].
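The focal loss can be sketched for a single binary prediction as follows. The article does not state the formula; this sketch assumes the commonly cited form, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), with gamma = 2 and alpha = 0.25 as illustrative defaults. The function names are hypothetical.

```python
import math

def cross_entropy(p, y):
    """Standard binary cross-entropy for one prediction."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for one binary prediction.

    p: predicted probability of the foreground class (0 < p < 1)
    y: true label, 1 for foreground, 0 for background
    """
    p_t = p if y == 1 else 1.0 - p              # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # The (1 - p_t)^gamma factor shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

easy = focal_loss(0.01, 0)  # background example classified confidently and correctly
hard = focal_loss(0.9, 0)   # background example mistaken for foreground
```

In this sketch the easy background example contributes far less loss than the hard one, which is how the loss keeps the many easy background locations in a dense detector from dominating training. Setting `gamma=0` and `alpha=1` for a foreground example recovers ordinary cross-entropy.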
He later worked on self-supervised learning, where models learn from unlabeled images. He led Momentum Contrast, or MoCo, presented at CVPR in 2020 [12]. MoCo frames contrastive learning as looking up entries in a dictionary, and it maintains that dictionary as a queue updated by a slowly moving, or momentum, encoder. This let the method build a large and consistent set of negative examples, and its learned features transferred well to detection and segmentation tasks, in some cases matching or exceeding features from supervised pretraining [12].
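The two mechanisms described above, the momentum encoder and the queue, can be sketched in plain Python. This is a toy illustration, not the published implementation: encoder parameters are flat lists of floats, keys are placeholder strings, and the names `momentum_update` and `KeyQueue` and the default coefficient `m=0.999` are assumptions made for the example.

```python
from collections import deque

def momentum_update(key_params, query_params, m=0.999):
    """Move each key-encoder parameter a small step toward the query encoder."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]

class KeyQueue:
    """Fixed-size dictionary of encoded keys; the oldest keys are dropped first."""

    def __init__(self, max_keys):
        self.keys = deque(maxlen=max_keys)

    def enqueue(self, batch_of_keys):
        # Appending beyond maxlen silently evicts keys from the front.
        self.keys.extend(batch_of_keys)

# One momentum step with a deliberately large step size (m=0.9) for visibility.
query_params = [1.0, 1.0]
key_params = [0.0, 0.0]
key_params = momentum_update(key_params, query_params, m=0.9)

# The queue holds at most four keys, so adding a fifth evicts the oldest.
queue = KeyQueue(max_keys=4)
queue.enqueue(["k1", "k2", "k3"])
queue.enqueue(["k4", "k5"])
```

Because the key encoder changes slowly, keys produced several steps apart remain comparable, which is what allows the queue to be much larger than a single batch.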
In 2021 He led "Masked Autoencoders Are Scalable Vision Learners," which introduced the masked autoencoder, or MAE, and was presented at CVPR in 2022 [13]. The approach masks a large fraction of image patches, around 75 percent, and trains an asymmetric encoder and decoder to reconstruct the missing pixels. The encoder processes only the visible patches, which makes pretraining efficient, and the design adapted ideas from masked language modeling to images [13].
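The masking step can be sketched as a random split of patch indices. The sketch assumes a 224 by 224 image divided into 16 by 16 patches, which gives 196 patches; that patch grid, the function name `random_masking`, and the fixed seed are illustrative assumptions rather than details from the article.

```python
import random

def random_masking(num_patches, mask_ratio=0.75, seed=0):
    """Split patch indices into a visible set and a masked set."""
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)
    num_visible = int(num_patches * (1.0 - mask_ratio))
    visible = sorted(order[:num_visible])  # only these patches go to the encoder
    masked = sorted(order[num_visible:])   # these are reconstructed by the decoder
    return visible, masked

# A 224x224 image cut into 16x16 patches gives 14 * 14 = 196 patches.
visible, masked = random_masking(196)
```

With a 75 percent mask ratio the encoder in this sketch sees 49 of the 196 patches, which is the source of the efficiency gain described above.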
His earliest prominent result came from his doctoral period. "Single Image Haze Removal Using Dark Channel Prior," written with Jian Sun and Xiaoou Tang, won the best paper award at CVPR in 2009 [5]. The method rests on an empirical observation the authors named the dark channel prior: most local patches in haze-free outdoor images contain some pixels with very low intensity in at least one color channel. Using this prior with a haze imaging model, the method estimates how much haze is present and recovers a clearer image from a single photograph [5].
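The dark channel and a transmission estimate derived from it can be sketched in plain Python. The sketch assumes the standard haze model I = J * t + A * (1 - t), where A is the atmospheric light and t the transmission, and estimates t as 1 - omega * dark_channel(I / A). The small patch size, the value `omega=0.95`, the hand-supplied `airlight`, and the function names are assumptions for illustration. The sketch also omits the refinement and recovery steps of the full method.

```python
def dark_channel(image, patch=3):
    """Minimum intensity over all color channels within each local patch.

    image: H x W nested list of (r, g, b) tuples with values in [0, 1].
    """
    h, w = len(image), len(image[0])
    r = patch // 2
    # Per-pixel minimum across the three color channels.
    min_rgb = [[min(px) for px in row] for row in image]
    dark = []
    for i in range(h):
        row = []
        for j in range(w):
            # Minimum over the patch centered at (i, j), clipped at the borders.
            window = [min_rgb[a][b]
                      for a in range(max(0, i - r), min(h, i + r + 1))
                      for b in range(max(0, j - r), min(w, j + r + 1))]
            row.append(min(window))
        dark.append(row)
    return dark

def estimate_transmission(image, airlight, omega=0.95, patch=3):
    """Estimate t = 1 - omega * dark_channel(I / A) at every pixel."""
    normalized = [[tuple(c / a for c, a in zip(px, airlight)) for px in row]
                  for row in image]
    return [[1.0 - omega * d for d in row]
            for row in dark_channel(normalized, patch)]

# A haze-free patch has at least one very dark channel value somewhere.
clear = [[(0.9, 0.0, 0.5), (0.7, 0.6, 0.8)],
         [(0.4, 0.5, 0.6), (0.3, 0.9, 0.2)]]
# A uniformly hazy patch is bright in every channel.
hazy = [[(0.8, 0.8, 0.8)] * 2 for _ in range(2)]

clear_dark = dark_channel(clear)
hazy_t = estimate_transmission(hazy, airlight=(1.0, 1.0, 1.0))
```

In this toy input the clear patch has a dark channel of zero everywhere, while the hazy patch yields a low transmission estimate, meaning the sketch attributes most of its brightness to haze.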
After completing his Ph.D. in 2011, He joined Microsoft Research Asia in Beijing as a researcher, where he stayed until 2016 [3]. Much of his early detection and recognition work, including ResNet, dates from this period [1][7]. In 2016 he moved to Facebook AI Research, the research division of the company now called Meta, where he worked as a research scientist until 2024 [3]. His detection and self-supervised learning projects, among them Mask R-CNN, MoCo, and MAE, were carried out there [10][12][13].
He joined MIT in 2024 as an associate professor in the Department of Electrical Engineering and Computer Science [3][8]. The university announced his promotion to associate professor with tenure in 2025 [3]. At MIT his listed title is the Douglas Ross (1954) Career Development Professor of Software Technology, and his research areas include artificial intelligence, machine learning, graphics, and vision [8]. In 2025 he additionally took a part-time role as a distinguished scientist at Google DeepMind while retaining his tenured MIT faculty position [4].
He has been recognized repeatedly at the field's main conferences. He received CVPR best paper awards in 2009, for the dark channel prior dehazing work, and in 2016, for deep residual learning [3][5]. He won the Marr Prize, the ICCV best paper award, in 2017 for Mask R-CNN [6]. He received the PAMI Young Researcher Award in 2018 [3]. In 2023 he shared the Future Science Prize in mathematics and computer science with Xiangyu Zhang, Shaoqing Ren, and the late Jian Sun for deep residual learning [7].
In 2025 he received two test of time awards. The Faster R-CNN paper received the NeurIPS Test of Time Award, and the paper "Delving Deep into Rectifiers," which introduced a weight initialization scheme and the parametric rectified linear unit, received the ICCV Test of Time Award [3]. According to his MIT profile, his publications had accumulated more than 700,000 citations as of May 2025, placing him among the most cited researchers in computer science [3].
He holds tenure as an associate professor in EECS at MIT and works part-time as a distinguished scientist at Google DeepMind [3][4]. His research continues to focus on visual perception and self-supervised learning within deep learning [3].