Pieter Abbeel
Last reviewed
May 4, 2026
Sources
22 citations
Review status
Source-backed
Revision
v2 · 3,759 words
Pieter Abbeel (born 1977) is a Belgian-American computer scientist, professor of electrical engineering and computer sciences at the University of California, Berkeley, co-founder and former president of Covariant, and an Amazon Distinguished Scientist. He is a leading figure in robotics and deep reinforcement learning, best known for his work on apprenticeship learning, autonomous helicopter aerobatics, and a series of methods (TRPO, GAE, HER, and domain randomization, along with PPO, which built on his group's work) that have shaped modern robot learning.
Abbeel directs the Berkeley Robot Learning Lab and is a co-director of the Berkeley Artificial Intelligence Research (BAIR) Lab. He completed his Ph.D. at Stanford University in 2008 as the first doctoral student of Andrew Ng, where his dissertation on apprenticeship learning enabled an autonomous helicopter to perform aerobatic maneuvers that previously required expert human pilots. He joined the Berkeley faculty later that year and has supervised a generation of researchers who went on to lead some of the most influential AI organizations in the world, including OpenAI co-founder John Schulman, Stanford professor Chelsea Finn, and Berkeley professor Sergey Levine.
In 2014 Abbeel co-founded Gradescope, an AI-assisted grading platform acquired by Turnitin in 2018. He spent 2016 to 2017 as a research scientist at OpenAI before launching the robotics company Embodied Intelligence, which rebranded as Covariant in early 2020. In August 2024 Amazon hired Abbeel and most of Covariant's research team and licensed the company's robotics foundation models in a transaction widely described as a reverse acqui-hire. Abbeel was named the recipient of the 2021 ACM Prize in Computing for his contributions to robot learning.
| Field | Details |
|---|---|
| Born | 1977, Antwerp, Belgium |
| Nationality | Belgian-American |
| Education | KU Leuven (M.S. Electrical Engineering, 2000); Stanford University (Ph.D. Computer Science, 2008) |
| Doctoral advisor | Andrew Ng |
| Known for | Apprenticeship learning, autonomous helicopter aerobatics, Trust Region Policy Optimization, Proximal Policy Optimization, Generalized Advantage Estimation, Hindsight Experience Replay, domain randomization, Covariant |
| Institutions | UC Berkeley (2008 to present); OpenAI (2016 to 2017); Covariant (2017 to 2024); Amazon (2024 to present) |
| Notable students | John Schulman, Chelsea Finn, Aravind Srinivas; postdoc Sergey Levine |
| Awards | MIT Technology Review TR35 (2011), Sloan Research Fellowship (2011), NSF CAREER (2014), PECASE (2016), IEEE Fellow (2018), ACM Prize in Computing (2021), IEEE Kiyo Tomiyasu Award (2022) |
| Website | people.eecs.berkeley.edu/~pabbeel |
Pieter Abbeel was born in 1977 in Antwerp, Belgium, and grew up in the suburb of Brasschaat. He pursued his undergraduate and graduate studies in electrical engineering at the Katholieke Universiteit Leuven (KU Leuven), receiving his master's degree in 2000. His early interests centered on the mathematical underpinnings of control and signal processing, which would later inform his approach to learning-based robotics.
After KU Leuven, Abbeel moved to California for doctoral studies at Stanford University. He joined the lab of Andrew Ng, who had recently arrived at Stanford as a first-year assistant professor, and became Ng's first Ph.D. student. Abbeel completed his doctorate in computer science in 2008. His dissertation, titled "Apprenticeship Learning and Reinforcement Learning with Application to Robotic Control," proposed methods for learning controllers from expert demonstrations by inferring the reward function the expert appears to optimize, and applied them to one of the most demanding test beds in autonomous control: a radio-controlled helicopter.
Working with Ng and fellow Ph.D. student Adam Coates, Abbeel built a system that learned helicopter aerobatics from demonstrations by an expert R/C pilot. The Stanford autonomous helicopter performed in-place flips and rolls, loops, hurricanes, tic-tocs, chaos, and even auto-rotation landings. The team's results were captured in the journal article "Autonomous Helicopter Aerobatics through Apprenticeship Learning" published in the International Journal of Robotics Research in 2010, and remain a landmark demonstration of learning from demonstration applied to high-dimensional, dynamically unstable systems.
Abbeel joined the Department of Electrical Engineering and Computer Sciences at UC Berkeley in 2008 as an assistant professor. He founded the Berkeley Robot Learning Lab and quickly built a research program that combined classical control with then-emerging deep learning techniques. He was promoted to associate professor with tenure and became a full professor in 2017.
At Berkeley, Abbeel directs the Berkeley Robot Learning Lab and serves as a co-director of the Berkeley Artificial Intelligence Research (BAIR) Lab, the umbrella research center that brings together more than fifty faculty members across machine learning, computer vision, natural language processing, and robotics. He has also been affiliated with the Center for Human-Compatible Artificial Intelligence (CHAI), the CITRIS and Banatao Institute, and the Berkeley Institute for Data Science.
Abbeel's lab has trained many of the people who define modern AI research and the AI startup ecosystem. By his own reckoning, his former students and postdocs have founded or co-founded more than a dozen AI companies, including OpenAI, Covariant (originally Embodied Intelligence), Physical Intelligence, and Perplexity.
| Year | Role |
|---|---|
| 2000 | M.S. in Electrical Engineering, KU Leuven |
| 2008 | Ph.D. in Computer Science, Stanford University, advisor Andrew Ng |
| 2008 | Assistant Professor, EECS, UC Berkeley; founded Berkeley Robot Learning Lab |
| 2014 | Co-founded Gradescope |
| 2016 to 2017 | Research Scientist, OpenAI (on leave from UC Berkeley) |
| 2017 | Promoted to Full Professor at UC Berkeley; co-founded Embodied Intelligence (later Covariant) |
| 2018 | Gradescope acquired by Turnitin |
| 2020 | Covariant emerged from stealth and launched commercial product |
| 2021 | Launched The Robot Brains Podcast; investment partner at AIX Ventures |
| 2024 | Joined Amazon as Distinguished Scientist after Amazon's licensing and hiring deal with Covariant |
| 2025 | Appointed head of Amazon's large language model effort within its AGI organization (per published reporting) |
Abbeel's research program centers on giving robots the ability to learn skills from data, with a particular focus on combining reinforcement learning, imitation learning, and meta-learning to overcome the data inefficiency that has historically limited robot learning. His group has produced foundational algorithms used across both academia and industry.
Abbeel's earliest and best-known contributions are in apprenticeship learning, the problem of training an agent to behave well by observing an expert. With Ng he developed methods for inverse reinforcement learning that infer a reward function from demonstrations, then optimize a policy with respect to that inferred reward. The techniques scaled to tasks far beyond what was then possible with hand-engineered reward functions, and they underpinned the Stanford autonomous helicopter that performed expert-level aerobatics. The research established that demonstrations can substitute for carefully designed reward signals when the underlying task is too complex for a human to specify directly.
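The idea of inferring a reward from demonstrations can be illustrated with a minimal sketch. It assumes, as one common formulation does, that the reward is linear in a set of hand-chosen state features, so that an expert and a learner can be compared through their discounted feature expectations. The feature names, demonstration data, and function names below are hypothetical, and the weight update shown is only the first step of what would normally be an iterative procedure.

```python
def feature_expectations(trajectories, gamma=0.9):
    """Average discounted sum of feature vectors over a set of trajectories."""
    dim = len(trajectories[0][0])
    mu = [0.0] * dim
    for trajectory in trajectories:
        for t, features in enumerate(trajectory):
            for i in range(dim):
                mu[i] += (gamma ** t) * features[i]
    return [m / len(trajectories) for m in mu]


def reward_weight_direction(mu_expert, mu_learner):
    """Linear reward weights that score the expert above the current learner."""
    return [e - l for e, l in zip(mu_expert, mu_learner)]


# Hypothetical two-feature data: [on_target, off_target] at each time step.
expert_demos = [[[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]]]
learner_rollouts = [[[0.0, 1.0], [0.0, 1.0]]]

mu_expert = feature_expectations(expert_demos, gamma=0.5)
mu_learner = feature_expectations(learner_rollouts, gamma=0.5)
weights = reward_weight_direction(mu_expert, mu_learner)
```

In this toy case the inferred weights reward the feature the expert visits and penalize the one the learner visits; a policy would then be re-optimized against that reward and the comparison repeated.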
In 2015 Abbeel and his Ph.D. student John Schulman published Trust Region Policy Optimization at the International Conference on Machine Learning. The full author list was John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan, and Pieter Abbeel. TRPO formulated policy gradient updates as a constrained optimization problem with a trust region defined by the Kullback-Leibler divergence between successive policies, providing approximate monotonic improvement guarantees. The algorithm became a workhorse for continuous control tasks and inspired the next generation of policy optimization methods.
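The constrained update is commonly written in the following form. The notation is an assumption of this sketch rather than a quotation from the paper: \(\pi_\theta\) is the policy with parameters \(\theta\), \(\theta_{\text{old}}\) the parameters before the update, \(A\) the advantage function under the old policy, and \(\delta\) the trust region size.

```latex
\max_{\theta} \;
\mathbb{E}_{s,a \sim \pi_{\theta_{\text{old}}}}
\left[
  \frac{\pi_{\theta}(a \mid s)}{\pi_{\theta_{\text{old}}}(a \mid s)}
  \, A^{\pi_{\theta_{\text{old}}}}(s, a)
\right]
\quad \text{subject to} \quad
\mathbb{E}_{s \sim \pi_{\theta_{\text{old}}}}
\left[
  D_{\mathrm{KL}}\!\left(
    \pi_{\theta_{\text{old}}}(\cdot \mid s) \,\|\, \pi_{\theta}(\cdot \mid s)
  \right)
\right] \le \delta
```

The objective rewards actions with positive advantage, while the constraint keeps the new policy close to the old one so that the advantage estimates remain trustworthy.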
The same Berkeley group introduced Generalized Advantage Estimation in 2015 ("High-Dimensional Continuous Control Using Generalized Advantage Estimation," Schulman, Moritz, Levine, Jordan, Abbeel; ICLR 2016). GAE provides an exponentially weighted estimator of the advantage function that interpolates between high-bias one-step estimates and high-variance Monte Carlo returns, dramatically improving the stability of policy gradient learning. GAE remains a standard component of modern actor-critic and PPO implementations.
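The interpolation can be made concrete with a short sketch of the usual backward recursion over temporal-difference errors. The function name, argument layout, and default values of `gamma` and `lam` are choices made for this illustration, not taken from the paper's code.

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimates for one trajectory.

    `values` must have one more entry than `rewards`: the final entry is
    the value estimate used to bootstrap after the last step.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference error at time t.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of this and all later TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


rewards = [1.0, 1.0, 1.0]
values = [0.0, 0.0, 0.0, 0.0]
one_step = gae_advantages(rewards, values, gamma=1.0, lam=0.0)
monte_carlo = gae_advantages(rewards, values, gamma=1.0, lam=1.0)
```

Setting `lam=0.0` recovers the one-step estimate and `lam=1.0` recovers the Monte Carlo return minus the value baseline; intermediate values trade bias against variance.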
In July 2017, after Abbeel had begun his leave at OpenAI, Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov published Proximal Policy Optimization. PPO simplified TRPO by replacing the hard trust region constraint with a clipped surrogate objective that could be optimized with first-order methods. While Abbeel was not an author on the PPO paper itself, the algorithmic lineage came directly out of his Berkeley group's earlier TRPO and GAE work. PPO has become one of the most widely used reinforcement learning algorithms in the world, and it is widely used in the reinforcement learning from human feedback pipelines that train modern large language models.
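A minimal sketch of the clipped surrogate objective follows. It operates on plain lists of log-probabilities and advantages rather than on a neural network, and the function name and the `clip_eps` default of 0.2 are assumptions of this example.

```python
import math


def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized)."""
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the new and old policies.
        ratio = math.exp(lp_new - lp_old)
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Taking the minimum removes any incentive to move the ratio
        # outside the clipping range in the direction the advantage favors.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# The ratio is 1.5 in both cases; only the sign of the advantage differs.
gain_when_good = ppo_clip_objective([math.log(1.5)], [0.0], [1.0])
gain_when_bad = ppo_clip_objective([math.log(1.5)], [0.0], [-1.0])
```

With a positive advantage the objective is capped at the clipped ratio, while with a negative advantage the unclipped, more pessimistic term is kept.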
During his year at OpenAI, Abbeel co-authored "Hindsight Experience Replay" (Andrychowicz, Wolski, Ray, Schneider, Fong, Welinder, McGrew, Tobin, Abbeel, Zaremba; NeurIPS 2017). HER tackles the sparse-reward problem in goal-directed RL by relabeling failed trajectories with the goals that were actually achieved, treating every roll-out as a successful demonstration of some goal. The method made it possible to train robotic manipulation policies on tasks like pushing, sliding, and pick-and-place using only binary success rewards.
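The relabeling step can be sketched as follows. This example uses the simplest strategy, in which every transition in an episode is relabeled with the goal achieved at the end of that episode. The dictionary layout, the 1.0/0.0 reward convention, and the function names are assumptions made for illustration.

```python
def sparse_reward(achieved, goal):
    """Binary success reward: 1.0 only when the goal is reached."""
    return 1.0 if achieved == goal else 0.0


def her_relabel_final(episode):
    """Copy an episode, replacing its goal with the finally achieved one."""
    final_achieved = episode[-1]["achieved"]
    relabeled = []
    for step in episode:
        new_step = dict(step)
        new_step["goal"] = final_achieved
        new_step["reward"] = sparse_reward(step["achieved"], final_achieved)
        relabeled.append(new_step)
    return relabeled


# A failed episode: the agent wanted to reach position 5 but ended at 2.
episode = [
    {"state": 0, "action": 1, "achieved": 1, "goal": 5, "reward": 0.0},
    {"state": 1, "action": 1, "achieved": 2, "goal": 5, "reward": 0.0},
]
relabeled = her_relabel_final(episode)
```

Both the original and the relabeled transitions would typically be stored in the replay buffer, so the agent sees at least some rewarded experience even when it never reaches the intended goal.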
The same OpenAI period produced "Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World" (Tobin, Fong, Ray, Schneider, Zaremba, Abbeel; IROS 2017). Domain randomization addresses the sim-to-real gap by randomizing visual and physical parameters in simulation so that the real world appears to the learned model as just another randomization. The approach became a cornerstone technique for training vision-based controllers in simulation and deploying them on physical robots.
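In practice the technique amounts to drawing a fresh set of simulator settings for every training episode. The sketch below shows that sampling step only; the parameter names and ranges are hypothetical and are not taken from the paper.

```python
import random


def sample_sim_params(rng):
    """Draw one randomized simulator configuration (hypothetical ranges)."""
    return {
        "light_intensity": rng.uniform(0.2, 1.0),
        "camera_offset_cm": rng.uniform(-5.0, 5.0),
        "object_rgb": tuple(rng.random() for _ in range(3)),
        "friction": rng.uniform(0.5, 1.5),
    }


rng = random.Random(0)  # seeded so the sketch is reproducible
configs = [sample_sim_params(rng) for _ in range(1000)]
```

A model trained across many such configurations has to rely on features that are stable under all of them, which is what makes the real environment look like one more sample from the same distribution.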
In 2017 Abbeel and his student Chelsea Finn introduced Model-Agnostic Meta-Learning at ICML (Finn, Abbeel, Levine, 2017). MAML formulates few-shot learning as finding initial parameters that can be quickly adapted to a new task with just a few gradient steps. The framework is model-agnostic: it applies to any model trained by gradient descent, across supervised learning, reinforcement learning, and beyond. MAML became a defining algorithm of the modern meta-learning literature and continues to inspire follow-up work in personalization, federated learning, and rapid robot adaptation.
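The two nested gradient steps can be illustrated on a deliberately tiny problem. In this sketch each task is a one-dimensional quadratic loss `(theta - c) ** 2` with its own target `c`, so the gradients can be written out by hand; the task family, step sizes, and function name are assumptions made for the example.

```python
def maml_train(task_targets, alpha=0.1, beta=0.1, steps=200):
    """Meta-learn a scalar initialization for losses (theta - c) ** 2."""
    theta = 0.0
    for _ in range(steps):
        meta_grad = 0.0
        for c in task_targets:
            # Inner loop: one gradient step adapts theta to this task.
            adapted = theta - alpha * 2.0 * (theta - c)
            # Outer gradient: differentiate the post-adaptation loss with
            # respect to the initialization, through the inner step.
            d_adapted_d_theta = 1.0 - 2.0 * alpha
            meta_grad += d_adapted_d_theta * 2.0 * (adapted - c)
        theta -= beta * meta_grad / len(task_targets)
    return theta


theta_init = maml_train([-1.0, 3.0])
```

For this symmetric toy family the learned initialization settles at the mean of the task targets, the point from which a single gradient step does equally well on every task.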
In 2021 Abbeel co-authored "Decision Transformer: Reinforcement Learning via Sequence Modeling" (Chen, Lu, Rajeswaran, Lee, Grover, Laskin, Abbeel, Srinivas, Mordatch; NeurIPS 2021). The paper recast offline reinforcement learning as a conditional sequence modeling problem, training a transformer on trajectories conditioned on the desired return. The Decision Transformer reframing has been highly influential in connecting modern transformer architectures with sequential decision making.
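Conditioning on the desired return relies on a quantity usually called the return-to-go: the sum of rewards from a given time step to the end of the trajectory. The sketch below computes it and interleaves it with states and actions into a single sequence; the tuple-based token format and function names are assumptions of this illustration, not the paper's data pipeline.

```python
def returns_to_go(rewards):
    """Undiscounted sum of rewards from each time step to the end."""
    result = []
    running = 0.0
    for reward in reversed(rewards):
        running += reward
        result.append(running)
    result.reverse()
    return result


def build_sequence(states, actions, rewards):
    """Interleave (return-to-go, state, action) triples for a sequence model."""
    tokens = []
    for rtg, state, action in zip(returns_to_go(rewards), states, actions):
        tokens.extend([("rtg", rtg), ("state", state), ("action", action)])
    return tokens


rtg = returns_to_go([1.0, 0.0, 2.0])
sequence = build_sequence(["s0", "s1", "s2"], ["a0", "a1", "a2"], [1.0, 0.0, 2.0])
```

At test time the first return-to-go token is set to the return one wants the policy to achieve, and the model predicts actions that are consistent with it.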
Abbeel's group also contributed to deep Q-learning research, robot manipulation policies that learn end-to-end from pixels, large-scale video pretraining for robotics, diffusion models for vision and control, and self-supervised representation learning for RL. The Berkeley Robot Learning Lab has, over the years, demonstrated robots that fold towels, tie surgical sutures, sort socks, and perform delicate assembly tasks based on learned models rather than scripted controllers.
In April 2016 OpenAI announced that Abbeel was joining the new research lab, taking a leave from his Berkeley faculty position. He worked at OpenAI as a research scientist focused on robotics and deep reinforcement learning, contributing to the foundational papers on Hindsight Experience Replay and domain randomization mentioned above. Abbeel left OpenAI in 2017 to start a robotics company that would eventually become Covariant.
Many of his former students and collaborators stayed at OpenAI for years afterward, including John Schulman, one of OpenAI's co-founders, who led much of the company's reinforcement learning research before later joining other AI labs.
In 2017 Abbeel co-founded a Berkeley-based robotics startup originally named Embodied Intelligence and later rebranded as Covariant. His co-founders were three former Berkeley researchers: Peter Chen (CEO), Rocky Duan (CTO), and Tianhao Zhang. Abbeel served as president and chief scientist while remaining a professor at Berkeley.
Covariant builds AI software for industrial robots, with an early focus on the warehouse pick-and-place problem of identifying, grasping, and packing arbitrary items at high speed. The company's flagship product, Covariant Brain, ran on top of off-the-shelf industrial arms from manufacturers such as ABB, Knapp, and KUKA. The company emerged from stealth in January 2020 with deployments at customers including the German electrical supplies wholesaler Obeta. Covariant later positioned itself around "foundation models for robotics" with its RFM-1 large-scale model, framing the warehouse work as a stepping stone toward general-purpose robot intelligence.
| Round | Date | Amount | Lead investors |
|---|---|---|---|
| Seed | 2017 | approximately $7 million | Amplify Partners and others |
| Series A | 2018 | approximately $20 million | Index Ventures (with returning seed investors) |
| Series B | May 2020 | $40 million | Index Ventures |
| Series C | July 2021 | $80 million | Index Ventures and Radical Ventures (returning), with Temasek and Canada Pension Plan Investment Board (new) |
| Series C extension | April 2023 | $75 million | Radical Ventures and Index Ventures, with Gates Frontier, AIX Ventures, Northgate Capital |
Total disclosed venture funding through 2023 was approximately $222 million.
On August 30, 2024, Amazon announced that it had entered into a non-exclusive license for Covariant's robotics foundation models and had hired Pieter Abbeel, Peter Chen, Rocky Duan, and roughly a quarter of Covariant's employees. Abbeel joined Amazon as a Distinguished Scientist while keeping his professorship at Berkeley. The transaction was widely described in the press as a "reverse acqui-hire," a structure that allows a large company to take key personnel and license core technology without triggering the antitrust scrutiny that a full acquisition would attract. According to a 2025 whistleblower complaint subsequently reported in the press, the deal was valued at approximately $380 million up front with an additional $20 million licensing fee due one year after closing, well below Covariant's prior venture valuation.
Covariant continued operating after the deal, with former chief operating officer Ted Stinson becoming CEO and co-founder Tianhao Zhang remaining at the company. The deal followed a similar pattern of reverse acqui-hires by large AI companies in 2024, including Microsoft's hiring of much of Inflection AI's leadership and Google's hiring of much of Character.AI's leadership.
In 2014, while teaching large undergraduate AI classes at Berkeley, Abbeel co-founded Gradescope with three of his graduate students and former teaching assistants: Arjun Singh, Sergey Karayev, and Ibrahim Awwal. The company began as a tool called Pandagrader that used computer vision and machine learning to streamline the grading of handwritten exams and homework. Gradescope grew quickly inside higher education, eventually serving more than 600 institutions including over half of Ivy League universities and most top R1 research universities.
On October 5, 2018, Turnitin acquired Gradescope. The acquisition marked Turnitin's first formal expansion into STEM education tools. Gradescope continued to operate as a Turnitin brand after the acquisition.
In 2021 Abbeel joined the venture capital firm AIX Ventures as an investing partner. AIX is an early-stage venture firm focused on AI-first startups. Abbeel has also served as an advisor to a range of AI companies and has appeared in funding rounds for AI startups founded by his students and former colleagues. He is also affiliated with The House Fund, a Berkeley-focused early-stage venture firm.
Abbeel teaches graduate and undergraduate courses on robotics and AI at Berkeley. His signature course is CS287, "Advanced Robotics," which covers planning, control, estimation, and learning for robotic systems. He has also taught Berkeley's deep unsupervised learning course (CS294-158) and helped launch the Berkeley deep reinforcement learning course (CS294-112), the predecessor to the present-day CS285 course taught by Sergey Levine. Many of his lecture videos and course materials are publicly available, and he has contributed to online courses on edX and through his podcast.
Abbeel's lab has produced an unusually large number of researchers who went on to found or lead influential AI organizations.
| Name | Role with Abbeel | Subsequent affiliation |
|---|---|---|
| John Schulman | Ph.D. student (graduated 2016) | Co-founder of OpenAI; later joined Anthropic, then a new venture |
| Chelsea Finn | Ph.D. student (graduated 2018), co-advised with Sergey Levine | Assistant professor at Stanford; co-founder of Physical Intelligence |
| Sergey Levine | Postdoctoral researcher | Professor at UC Berkeley; co-founder of Physical Intelligence |
| Aravind Srinivas | Ph.D. student | Co-founder and CEO of Perplexity AI |
| Igor Mordatch | Postdoctoral researcher | Research scientist at OpenAI, then Google DeepMind |
| Aviv Tamar | Ph.D. student / postdoc | Faculty at Technion |
| Peter Chen | Ph.D. student | Co-founder and former CEO of Covariant |
| Rocky Duan | Ph.D. student | Co-founder and former CTO of Covariant |
| Tianhao Zhang | Researcher in lab | Co-founder of Covariant |
| Marvin Zhang, Lerrel Pinto, Coline Devin, Roberto Calandra, Carlos Florensa | Group members | Faculty positions and research roles across academia and industry |
Abbeel has often noted in interviews that the success of his former students reflects Berkeley's broader graduate culture as much as any individual mentorship style.
| Year | Award |
|---|---|
| 2008 | Stanford SAIL Outstanding Paper Award |
| 2011 | MIT Technology Review TR35 (Innovators Under 35) |
| 2011 | Sloan Research Fellowship |
| 2011 | Office of Naval Research Young Investigator Program (ONR YIP) |
| 2014 | NSF CAREER Award |
| 2014 | DARPA Young Faculty Award |
| 2016 | Presidential Early Career Award for Scientists and Engineers (PECASE) |
| 2016 | Air Force Office of Scientific Research Young Investigator Program (AFOSR YIP) |
| 2018 | IEEE Fellow, for contributions to apprenticeship and reinforcement learning for robotics and autonomous systems |
| 2021 | ACM Prize in Computing, for contributions to robot learning, including learning from demonstrations and deep reinforcement learning for robotic control |
| 2022 | IEEE Kiyo Tomiyasu Award |
Abbeel has also won multiple best paper awards at top venues, including ICML, NeurIPS, ICLR, ICRA, IROS, and CoRL.
In March 2021 Abbeel launched The Robot Brains Podcast, a weekly show in which he interviews leading researchers and entrepreneurs in AI and robotics. The first episode featured Andrej Karpathy, then director of AI and Autopilot Vision at Tesla. Subsequent guests have included Yann LeCun, Geoffrey Hinton, Daphne Koller, Demis Hassabis, Ilya Sutskever, Fei-Fei Li, and Jeff Dean among many others. The podcast is produced in partnership with Covariant and is distributed across major podcast platforms and on YouTube.
Abbeel is also an active public speaker. He has delivered keynotes at NeurIPS, ICML, ICLR, ICRA, IROS, and the World Economic Forum, and his work has been covered in The New York Times, The Wall Street Journal, BBC, Wired, Bloomberg, MIT Technology Review, and many other outlets.
Abbeel lives in the San Francisco Bay Area with his family. He holds dual Belgian and American citizenship and speaks fluent Dutch and English.
Abbeel is one of the central figures in the modern field of deep reinforcement learning for robotics. The lineage of policy optimization algorithms that runs from his Berkeley group's TRPO and GAE papers in 2015, through the OpenAI PPO paper in 2017, into the modern reinforcement learning from human feedback pipelines that align large language models, traces directly back to research conducted in his lab. His work on apprenticeship learning and inverse reinforcement learning helped to define the modern approach to learning controllers from demonstrations rather than from hand-engineered reward signals.
Beyond his own publications, Abbeel's most lasting contribution may be the people he has trained. The list of his former students and postdocs reads like a who's who of contemporary AI research and entrepreneurship: John Schulman led RL at OpenAI and was one of its co-founders; Sergey Levine became one of the most prolific reinforcement learning researchers in the world as a Berkeley professor; Chelsea Finn carried meta-learning research to Stanford and on to Physical Intelligence, a startup developing a general-purpose robot foundation model; Aravind Srinivas built the AI search engine Perplexity. Many of these researchers have themselves trained subsequent generations of students, multiplying Abbeel's intellectual influence on the field.
Abbeel's commercial work has also reshaped the robotics industry. Covariant pioneered the idea of foundation models for robotic manipulation, and Amazon's 2024 licensing and hiring deal placed his team at the center of one of the largest commercial robotics deployments in the world: Amazon's vast warehouse operations. Together with the wave of AI-first robotics companies founded by his former students, Abbeel's work has helped move robot learning from a research curiosity to a multi-billion-dollar industry.