Soumith Chintala
Last reviewed
Jun 5, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v2 · 2,031 words
Soumith Chintala is an Indian-American artificial intelligence researcher and engineer best known as the co-creator and long-time lead of PyTorch, the open-source deep learning framework that underpins a large share of modern AI research and production systems.[1][2] He spent roughly eleven years at Facebook AI Research (FAIR), later Meta AI, where he built and led PyTorch for nearly eight years, and he is also a co-author of several influential papers on generative adversarial networks, including the deep convolutional GAN (DCGAN).[2][3][4] He left Meta in November 2025 and subsequently joined Thinking Machines Lab, the AI startup founded by former OpenAI chief technology officer Mira Murati, where he was announced as chief technology officer in January 2026.[5][6][7]
Chintala's career sits at the intersection of deep learning research and the engineering of the tools that make it practical. His most cited and widely used work is PyTorch, a framework he helped start in 2016 and then guided from an experimental successor to the Torch library into one of the dominant systems for training and deploying neural networks.[2][8] Before and alongside PyTorch, he contributed to early research on generative models, co-authoring the DCGAN, Wasserstein GAN, and Laplacian pyramid GAN (LAPGAN) papers, and he maintained benchmarking and library infrastructure used across the field.[4][9] His Google Scholar profile lists more than 160,000 citations, reflecting both the reach of PyTorch and his role on heavily cited generative-model papers.[9]
Chintala grew up in Hyderabad, India, and as of the mid-2020s lives in New York City.[1][10] He attended Hyderabad Public School and then studied at the Vellore Institute of Technology (VIT) in Vellore, India, which press coverage has described as a tier-2 engineering college, completing his undergraduate degree there.[10][15] He has recounted that he applied to roughly a dozen United States universities for graduate study and was rejected by all of them, then moved to the United States on a J-1 visa before gaining admission to New York University (NYU).[10] His LinkedIn profile and reporting indicate he began a master's program in computer science at NYU around 2010, with research focused on deep learning and computer vision.[10][11][16]
At NYU he worked in the orbit of Yann LeCun, the deep learning pioneer who later became Meta's chief AI scientist.[10][11] Chintala has said his direct mentor at NYU was Pierre Sermanet, then a PhD student, and that LeCun gave him crucial opportunities to continue in AI; together with Sermanet and LeCun he helped maintain EBLearn, a C++ deep learning library, in the period before 2012.[11][1] This open-source and academic work, including his later maintenance of the Torch7 scientific computing framework, helped bring him to the attention of Facebook's new AI lab.[1][11]
After completing his master's degree, Chintala worked for a period outside research before returning to it. Press accounts state that he held a software role at Amazon and then joined MuseAmi, a startup where he built deep learning models for mobile devices that combined music and vision tasks, before his open-source contributions drew the interest of Yann LeCun and a recruitment offer from Facebook.[10][15]
Chintala joined Facebook AI Research around 2014 and remained at the company, which renamed its AI organization Meta AI, for about eleven years until late 2025.[2][5] He announced his departure in November 2025, writing on his personal blog that his last day at Meta was November 6, 2025, which ended an eleven-year tenure during which FAIR grew from a young lab into one of the most prominent industrial AI research groups.[1][5][6] He has described 2015 and 2016 as among the most productive and professionally enjoyable years of his career.[1]
Within FAIR he worked on generative models and other research before and during the PyTorch effort, and his name appears on Meta research output as varied as GAN papers and, later, the Llama 3 model report.[9] He maintained Torch7, which was used by organizations including Google DeepMind, Twitter, and Facebook, and he built infrastructure such as convnet-benchmarks, a suite that from 2015 to 2017 served as a reference benchmark used by hardware vendors including Nvidia, AMD, and Intel.[1] His defining contribution at the company, however, was PyTorch.[2][5]
PyTorch grew out of the older Torch library, which used a Lua front end on top of C and CUDA backends.[8] In 2016, Adam Paszke, then a student at the University of Warsaw, reached out to Chintala about an internship, and Chintala invited him to help build a next-generation framework with a Python-centric design; Sam Gross joined full time, and the small team, which also included Gregory Chanan, assembled the first version in a matter of months.[2][8] The project had an initial release in 2016 and a public beta in early 2017.[2][8]
Chintala led PyTorch for close to eight years.[5][6] Under his direction the framework became known for its imperative, "define-by-run" style and dynamic computation graphs, which made experimentation feel natural to researchers, and it steadily displaced rivals in academic work before spreading into production.[8] Coverage of his departure credits him with taking PyTorch "from nothing to 90%+ adoption" across AI research and industry, with the framework powering systems ranging from large language models to recommendation and self-driving stacks.[6][8] By the time of the project's move to a foundation, its maintainers reported that PyTorch had drawn more than 2,400 contributors and had been adopted by some 18,000 organizations since its 2016 release.[12]
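The "define-by-run" idea can be illustrated with a toy example in plain Python. The sketch below is not PyTorch code and does not reflect how PyTorch is implemented; the `Value` class and the `grow` function are hypothetical names invented for illustration. It only shows the general principle: the computation graph is recorded while ordinary Python code executes, so a data-dependent loop can produce a different graph for each input.

```python
class Value:
    """A scalar that records, as it runs, which operations produced it."""

    def __init__(self, data, parents=(), grad_fns=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents      # inputs of the operation that built this node
        self._grad_fns = grad_fns    # one local-derivative function per parent

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        # The graph edge is created here, at the moment the multiply executes.
        return Value(
            self.data * other.data,
            (self, other),
            (lambda g: g * other.data, lambda g: g * self.data),
        )

    def backward(self):
        # Order the recorded graph so each node comes after its inputs.
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent in node._parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        # Walk the graph from output to inputs, accumulating gradients.
        for node in reversed(order):
            for parent, grad_fn in zip(node._parents, node._grad_fns):
                parent.grad += grad_fn(node.grad)


def grow(x):
    """Multiply by x until the running value reaches 100 (hypothetical example)."""
    y = x
    while y.data < 100.0:  # plain Python control flow decides the graph's shape
        y = y * x
    return y


x = Value(3.0)
y = grow(x)        # four multiplications are recorded: y = x**5
y.backward()
print(y.data, x.grad)    # 243.0 405.0

x2 = Value(20.0)
y2 = grow(x2)      # only one multiplication is recorded: y2 = x2**2
y2.backward()
print(y2.data, x2.grad)  # 400.0 40.0
```

The two calls to `grow` build graphs of different depth from the same source code, which is the property that makes debugging and experimentation with an imperative framework feel like writing ordinary Python.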
The design and engineering of the framework were described in the 2019 NeurIPS paper "PyTorch: An Imperative Style, High-Performance Deep Learning Library," on which Chintala is the final author alongside Adam Paszke, Sam Gross, Francisco Massa, Gregory Chanan, Edward Yang, and roughly twenty other contributors.[17] The paper became one of the most cited works in machine learning systems.[9][17]
In September 2022, Meta transferred PyTorch's governance to the independent PyTorch Foundation under the Linux Foundation, a move Chintala announced and that placed the project under a neutral board including AMD, Amazon Web Services, Google Cloud, Meta, Microsoft, and Nvidia.[12][8] The stated aim was to give a project with foundational importance a neutral home so that decisions would be made transparently by a diverse group of stakeholders.[12] PyTorch, which had absorbed Caffe2 in 2018, continued to evolve as an open project under the foundation and shipped the compiler-focused PyTorch 2.0 release in 2023.[8]
The table below summarizes key milestones in PyTorch's history.
| Year | Milestone |
|---|---|
| 2016 | PyTorch started at FAIR; initial release[2][8] |
| 2017 | Public beta released[2][8] |
| 2018 | Caffe2 merged into PyTorch[8] |
| 2019 | NeurIPS systems paper published[17] |
| 2022 | Governance moved to the PyTorch Foundation under the Linux Foundation[12][8] |
| 2023 | PyTorch 2.0 released with the TorchDynamo compiler[8] |
Apart from PyTorch, Chintala is widely cited for early work on generative models. He co-authored "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks" (2015) with Alec Radford and Luke Metz, the paper that introduced the DCGAN architecture and a set of design rules that made GAN training more stable for image generation.[3][4] He is also a co-author of "Wasserstein Generative Adversarial Networks" (2017) with Martin Arjovsky and Leon Bottou, a paper that proposed the WGAN training objective to improve stability and reduce mode collapse, and of the Laplacian pyramid GAN paper (LAPGAN, 2015) with Emily Denton, Arthur Szlam, and Rob Fergus.[9][18] Chintala has noted that he eventually stepped back from GAN research, saying he "gave up on GANs after failing to make them stable training algorithms."[1]
The table below lists several of his most cited works.
| Year | Work | Co-authors (selected) | Topic |
|---|---|---|---|
| 2015 | LAPGAN | Emily Denton, Arthur Szlam, Rob Fergus | Laplacian pyramid image generation[9] |
| 2015 | DCGAN | Alec Radford, Luke Metz | Convolutional GAN architecture[3][4] |
| 2017 | Wasserstein GAN | Martin Arjovsky, Leon Bottou | Stable GAN training objective[18] |
| 2019 | PyTorch systems paper | Adam Paszke, Sam Gross and others | Deep learning framework design[17] |
Beyond generative models, his research interests have spanned object and human detection, video generative modeling, AI for video games, and machine learning systems.[1] In the 2020s he turned attention to home robotics, contributing to projects such as "On Bringing Robots Home," Robot Utility Models, Dexterity from Touch, CLIP-Fields, and Holo-Dex.[1] He has also contributed to open-source infrastructure beyond PyTorch, including the convnet-benchmarks project and earlier maintenance of EBLearn and Torch7.[1]
In a blog post dated November 6, 2025, Chintala wrote that he had begun planning his exit around November 2024, that walking away from PyTorch was among the hardest decisions he had made, and that curiosity ultimately drove him to try something outside Meta.[1][6] He framed his next step as wanting to do "something small," "something new," and "something I don't fully understand yet," while pledging to stay involved with PyTorch as a community member.[1] In the post he thanked early FAIR colleagues including Yann LeCun, Rob Fergus, Remi Denton, Leon Bottou, and Alec Radford, and core PyTorch contributors including Adam Paszke, Sam Gross, and Gregory Chanan.[1]
After leaving Meta, Chintala joined Thinking Machines Lab, the startup founded in 2025 by former OpenAI CTO Mira Murati.[5][7] On January 15, 2026, Murati announced on the social platform X that Chintala would serve as the company's chief technology officer, leading its AI infrastructure, research, and technical direction.[7][15] His own website lists his current affiliations as Thinking Machines Lab and NYU, focused on AI infrastructure, AI research, and robotics.[1] Thinking Machines Lab raised about 2 billion dollars in a 2025 seed round at a reported valuation of around 12 billion dollars and employed roughly 140 people. By early 2026 it had released its first product and secured a large cloud agreement with Google for access to Nvidia GB300 chips.[2][13]
Chintala's prominence rests primarily on PyTorch, which is frequently described as one of the most influential pieces of open-source AI software and the foundation for much of the field's research output.[2][8] By the early 2020s, surveys of major conferences such as NeurIPS reported that PyTorch was used in a large majority of submitted papers, and industry estimates placed its adoption among AI practitioners above 90 percent.[6][15] His Google Scholar profile, listing a Meta AI affiliation, reports more than 160,000 citations and an h-index in the low 40s, driven by the PyTorch system paper and his GAN papers, each cited tens of thousands of times.[9][17] He is a frequent speaker on machine learning and open-source AI tooling.[14] As an angel investor he has backed AI companies including Runway, 1X, Osmo, Anthropic, Together.ai, and Lepton.[1]