Barret Zoph
Last reviewed
Jun 5, 2026
Sources
26 citations
Review status
Source-backed
Revision
v2 · 2,091 words
Barret Zoph is an American artificial intelligence researcher and engineer best known for pioneering neural architecture search (NAS) at Google Brain and for helping build ChatGPT as a vice president of research at OpenAI [1][2]. After leaving OpenAI in 2024 he co-founded Thinking Machines Lab with former OpenAI chief technology officer Mira Murati and served as the startup's chief technology officer, before returning to OpenAI in January 2026 to lead the company's enterprise business [3][4][5].
Zoph's research spans automated machine learning, large-scale language modeling, and the post-training methods used to align and adapt foundation models. His early work with Quoc V. Le framed the design of neural network architectures as a learning problem in its own right, producing the NAS and NASNet methods that influenced a generation of AutoML research [6][7]. He also became one of the principal researchers behind sparse mixture-of-experts language models, co-authoring the Switch Transformer and several follow-up papers on training large sparse models stably [14][22]. At OpenAI he co-led the team responsible for turning raw language models into the conversational system released as ChatGPT [1][8]. According to his Google Scholar profile, his publications have been cited more than 130,000 times, with an h-index of 62 [2].
Zoph earned a Bachelor of Science in computer science from the University of Southern California (USC) in 2016 [9]. As an undergraduate he worked with USC faculty members Kevin Knight and David Kempe on computer science research [9]. He has said that getting involved in research as an undergraduate, and finding the right faculty mentor, made a large difference to his career, and that learning how to build the infrastructure for and train deep learning systems became a skill he relied on throughout his later work [9]. Before joining Google he carried out research at USC's Information Sciences Institute (ISI), collaborating with Kevin Knight and Daniel Marcu on statistical machine translation, a field for which ISI was well known [1][9].
That ISI work produced one of his first widely cited papers, "Transfer Learning for Low-Resource Neural Machine Translation," presented at EMNLP 2016 with Deniz Yuret, Jonathan May, and Kevin Knight. The method first trained a high-resource language pair to create a parent model, then transferred some of its learned parameters to initialize and constrain training on a low-resource pair, improving baseline neural translation systems by an average of 5.6 BLEU across four low-resource language pairs [23]. After graduating in 2016 he joined Google Brain, entering through the Google Brain Residency program [6][10].
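The parent-to-child transfer described above can be sketched in a few lines. This is a minimal illustration rather than the paper's implementation: the parameter names (`source_embed`, `target_embed`, `decoder`), the choice of which groups to copy or freeze, and the plain SGD update are all assumptions made for the example, and freezing is used here as one simple way to constrain training.

```python
import copy


def init_child_from_parent(parent, child_init, transfer, frozen):
    """Build child parameters: copy `transfer` entries from the parent,
    keep the child's own initialization for everything else."""
    child = copy.deepcopy(child_init)
    for name in transfer:
        child[name] = copy.deepcopy(parent[name])
    # A parameter group is trainable unless it is listed in `frozen`.
    trainable = {name: name not in frozen for name in child}
    return child, trainable


def sgd_step(params, grads, trainable, lr=0.1):
    """Apply one gradient step, leaving frozen parameter groups untouched."""
    updated = {}
    for name, values in params.items():
        if trainable[name]:
            updated[name] = [w - lr * g for w, g in zip(values, grads[name])]
        else:
            updated[name] = list(values)
    return updated


# Hypothetical parameter groups, each flattened to a short list of floats.
parent = {"source_embed": [0.5, -0.2], "target_embed": [0.9, 0.1], "decoder": [0.3, 0.7]}
child_init = {"source_embed": [0.0, 0.0], "target_embed": [0.0, 0.0], "decoder": [0.0, 0.0]}

child, trainable = init_child_from_parent(
    parent,
    child_init,
    transfer=["target_embed", "decoder"],  # reuse what the parent learned
    frozen=["target_embed"],               # assumption: hold this group fixed
)
grads = {name: [1.0, 1.0] for name in child}
child = sgd_step(child, grads, trainable)
```

After the step, the frozen group still holds the parent's values, while the copied-but-trainable group and the freshly initialized group have both moved.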
At Google Brain, Zoph worked as a research scientist, later a staff research scientist, on automated machine learning and, subsequently, on large sparse language models [1]. His best-known contribution from this period is neural architecture search. In the 2017 paper "Neural Architecture Search with Reinforcement Learning," written with Quoc V. Le, he used a recurrent neural network controller to generate descriptions of candidate networks and trained the controller with reinforcement learning to maximize the validation accuracy of the networks it proposed [7]. On the CIFAR-10 image dataset the method discovered architectures competitive with the best hand-designed models of the time, and on the Penn Treebank language-modeling benchmark it produced a recurrent cell that outperformed the standard LSTM [7].
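The controller loop can be illustrated with a deliberately small sketch. It is not the paper's setup: the recurrent controller is replaced by independent logits per decision, the search space and its option values are hypothetical, `toy_reward` stands in for the validation accuracy of a trained child network, and the update is a REINFORCE-style policy gradient with a moving-average baseline chosen here for variance reduction.

```python
import math
import random

# Hypothetical search space: one categorical choice per decision.
SEARCH_SPACE = {
    "filter_size": [1, 3, 5, 7],
    "num_filters": [24, 36, 48, 64],
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_architecture(logits, rng):
    """Sample one option index per decision from the current policy."""
    arch = {}
    for name, options in SEARCH_SPACE.items():
        probs = softmax(logits[name])
        arch[name] = rng.choices(range(len(options)), weights=probs)[0]
    return arch


def toy_reward(arch):
    """Stand-in for validation accuracy; a real run would train the child network."""
    size = SEARCH_SPACE["filter_size"][arch["filter_size"]]
    width = SEARCH_SPACE["num_filters"][arch["num_filters"]]
    return 1.0 if (size == 3 and width == 48) else 0.1


def reinforce_step(logits, arch, reward, baseline, lr):
    """Policy-gradient update; d log pi(a) / d logit_i = onehot(a)_i - prob_i."""
    advantage = reward - baseline
    for name, choice in arch.items():
        probs = softmax(logits[name])
        for i, p in enumerate(probs):
            grad = (1.0 if i == choice else 0.0) - p
            logits[name][i] += lr * advantage * grad


def search(steps=500, lr=0.5, seed=0):
    """Run the sample, evaluate, update loop and return the final logits."""
    rng = random.Random(seed)
    logits = {name: [0.0] * len(opts) for name, opts in SEARCH_SPACE.items()}
    baseline = 0.0
    for _ in range(steps):
        arch = sample_architecture(logits, rng)
        reward = toy_reward(arch)
        reinforce_step(logits, arch, reward, baseline, lr)
        baseline = 0.9 * baseline + 0.1 * reward  # moving-average baseline
    return logits


final_logits = search()
```

Each iteration samples an architecture, scores it, and shifts probability toward choices that scored above the running baseline, which is the same feedback loop the paragraph describes at a much smaller scale.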
Zoph and collaborators extended the idea in "Learning Transferable Architectures for Scalable Image Recognition" (2018), which introduced the NASNet design. Rather than search over a whole network, the method searched for a reusable convolutional cell on a small dataset and then stacked copies of that cell to build larger models, allowing an architecture found on CIFAR-10 to transfer to the much larger ImageNet benchmark [11]. He also contributed to data-augmentation research, co-authoring AutoAugment, which learns augmentation policies from data, and RandAugment, a simplified method with a much smaller search space [12][13], and to work on large mixture-of-experts language models. His architecture-search ideas fed into Google's AutoML products and influenced later efficient-network research such as EfficientNet [6][11].
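The cell-stacking idea can be made concrete with a short sketch. Everything named here is illustrative: the `Cell` type, the operation labels, the stage counts, and the rule of doubling the filter count at each stage are assumptions for the example, not the published NASNet configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """A searched cell, described only by the operations it applies."""
    ops: tuple


def stack_cells(cell, num_stages, cells_per_stage, base_filters):
    """Return a layer plan of (cell, filters) pairs built from one repeated cell."""
    plan = []
    filters = base_filters
    for _ in range(num_stages):
        plan.extend((cell, filters) for _ in range(cells_per_stage))
        filters *= 2  # assumption: widen the network at each new stage
    return plan


# Hypothetical cell found by a search on a small dataset.
searched = Cell(ops=("sep_conv_3x3", "avg_pool_3x3", "identity"))

# The same cell yields a small network for the search dataset
# and a deeper, wider one for a larger benchmark.
small_net = stack_cells(searched, num_stages=3, cells_per_stage=2, base_filters=16)
large_net = stack_cells(searched, num_stages=3, cells_per_stage=6, base_filters=64)
```

The point of the design is that the expensive search happens once, on the cell, while depth and width remain ordinary hyperparameters that can be scaled up afterwards.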
In his later years at Google, Zoph focused on training large sparse models. With William Fedus and Noam Shazeer he co-authored the Switch Transformer, which simplified mixture-of-experts routing so that each token was sent to a single expert. This allowed models to scale to a trillion parameters while keeping the computational cost per token roughly constant, and it reached up to seven times faster pre-training than dense baselines of comparable cost [14]. He followed this with "ST-MoE: Designing Stable and Transferable Sparse Expert Models" (2022), which addressed the training instability and fine-tuning difficulties of sparse models. That work scaled a sparse model to 269 billion parameters at a compute cost comparable to a 32-billion-parameter dense Transformer, the first time a sparse model reached state-of-the-art transfer-learning results across a broad set of language tasks [22].
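Single-expert routing is simple enough to show directly. The sketch below is a toy under stated assumptions, not the Switch Transformer implementation: the router is a plain matrix of weights, the experts are stand-in functions, and there is no capacity limit or load-balancing loss. It only demonstrates why per-token compute stays flat as experts are added, since each token runs through exactly one of them.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top1_route(token, router_weights):
    """Return (expert_index, gate_probability) for one token.

    `router_weights` holds one weight vector per expert; the logit for an
    expert is the dot product of its vector with the token.
    """
    logits = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


def switch_layer(tokens, router_weights, experts):
    """Send each token to exactly one expert and scale its output by the gate."""
    outputs, assignments = [], []
    for token in tokens:
        idx, gate = top1_route(token, router_weights)
        expert_out = experts[idx](token)  # only one expert runs per token
        outputs.append([gate * v for v in expert_out])
        assignments.append(idx)
    return outputs, assignments


# Hypothetical two-expert layer over two-dimensional tokens.
router = [[1.0, 0.0], [0.0, 1.0]]
experts = [lambda t: list(t), lambda t: [2.0 * v for v in t]]
tokens = [[2.0, 0.0], [0.0, 3.0]]
outputs, assignments = switch_layer(tokens, router, experts)
```

Adding more experts grows the parameter count but leaves the work done for any single token unchanged, which is the trade-off the paragraph describes.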
| Year | Title | Co-authors | Contribution |
|---|---|---|---|
| 2016 | Transfer Learning for Low-Resource Neural Machine Translation | D. Yuret, J. May, K. Knight | Parent-to-child parameter transfer for low-resource translation; +5.6 BLEU on average [23] |
| 2017 | Neural Architecture Search with Reinforcement Learning | Quoc V. Le | RNN controller trained with RL to design networks; founded NAS [7] |
| 2018 | Learning Transferable Architectures for Scalable Image Recognition | V. Vasudevan, J. Shlens, Q. V. Le | Introduced the NASNet transferable-cell design [11] |
| 2019 | AutoAugment: Learning Augmentation Strategies from Data | E. D. Cubuk, D. Mané, V. Vasudevan, Q. V. Le | Learned data-augmentation policies [12] |
| 2020 | RandAugment: Practical Automated Data Augmentation with a Reduced Search Space | E. D. Cubuk, J. Shlens, Q. V. Le | Simplified, reduced-search augmentation [13] |
| 2022 | Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity | W. Fedus, N. Shazeer | Trillion-parameter sparse mixture-of-experts model [14] |
| 2022 | ST-MoE: Designing Stable and Transferable Sparse Expert Models | I. Bello, S. Kumar, N. Du, Y. Huang, J. Dean, N. Shazeer, W. Fedus | Stable 269B-parameter sparse model with state-of-the-art transfer [22] |
| 2023 | GPT-4 Technical Report | OpenAI | Contributor to OpenAI's flagship model [2][8] |
Zoph joined OpenAI in September 2022, shortly before the public launch of ChatGPT [9][15]. He helped build the company's post-training team from scratch alongside John Schulman and others, and rose to vice president of research for post-training [1][15]. The post-training group is responsible for the work that turns a pretrained large language model into a usable assistant, including instruction following, reinforcement learning from human feedback, tool use, evaluations, and safety filtering [1][8]. Zoph's teams trained the models shipped into ChatGPT and the OpenAI API, and his work extended across alignment, search, and multimodality, contributing to releases such as GPT-4 and GPT-4o [1]. He is credited as a contributor on the GPT-4 Technical Report [2][8].
In February 2025, after leaving the company, Zoph and Schulman gave a talk at Stanford titled "ChatGPT and the Art of Post-Training," in which they described having joined OpenAI in September 2022 and pushed to build an aligned chatbot that could be deployed safely, an account that has circulated as one of the more detailed first-person descriptions of how ChatGPT's post-training was developed [24].
On September 25, 2024, Zoph left OpenAI, announcing his departure the same week that chief technology officer Mira Murati said she would step down and chief research officer Bob McGrew said he would leave [15][16]. In a public note, Zoph described leaving as a difficult, personal decision and called it "a natural point" to explore new opportunities, while emphasizing that the moves were made independently and amicably [15][16]. Chief executive Sam Altman characterized the three exits as separate decisions timed together for a smooth handover [16].
In February 2025, Zoph reunited with Murati as a co-founder of Thinking Machines Lab, an AI startup incorporated as a public benefit corporation with the stated goal of making AI systems more widely understood, customizable, and capable [17][18]. The founding group also included John Schulman as chief scientist, Lilian Weng, Andrew Tulloch, and Luke Metz, most of them former OpenAI researchers [17][18]. Zoph served as the company's chief technology officer [3][18]. In July 2025 the startup raised about $2 billion at a roughly $12 billion valuation in a round led by Andreessen Horowitz, with participation from investors including Nvidia, AMD, Cisco, and Jane Street; in October 2025 it shipped its first product, a model fine-tuning service called Tinker [18][19].
On January 14, 2026, OpenAI announced that Zoph would return to the company along with Luke Metz and Sam Schoenholz, all three departing Thinking Machines Lab [4][20]. Fidji Simo, OpenAI's chief executive of applications, said she was "excited to welcome Barret Zoph, Luke Metz, and Sam Schoenholz back to OpenAI," adding that the move had "been in the works for several weeks" [4][20]. Announcing the split roughly an hour before OpenAI confirmed the returns, Murati said the company had "parted ways with Barret Zoph" and named Soumith Chintala, the creator of PyTorch, as the new chief technology officer of Thinking Machines Lab [4][25]. Reporting by Wired indicated the separation was not amicable: a source close to Thinking Machines alleged that Zoph had shared confidential company information with competitors. Wired said it could not verify the claim, and OpenAI said it did not share that concern [4][25][26]. Coverage of the move noted that Zoph's return came as OpenAI reorganized around enterprise growth, with Zoph reporting to Simo and taking on leadership of the company's enterprise and commercial business rather than a purely research role [5][21][25].
Zoph is among the most cited researchers in automated machine learning. His Google Scholar profile lists more than 130,000 citations and an h-index of 62, with the 2017 neural architecture search paper alone accumulating thousands of citations [2][7]. His architecture-search and sparse-model work is frequently cited as foundational to AutoML and to the design of large mixture-of-experts language models [6][22]. USC has highlighted him as an alumnus whose work helped pave the way for ChatGPT [9].