Pieter Abbeel

Pieter Abbeel (born 1977) is a Belgian-American computer scientist and a professor of electrical engineering and computer sciences at the University of California, Berkeley, where he directs the Berkeley Robot Learning Lab and co-directs the Berkeley Artificial Intelligence Research (BAIR) Lab.^[1] He co-founded the robotics company Covariant in 2017, co-authored the foundational 2020 paper that launched modern diffusion models ("Denoising Diffusion Probabilistic Models"), and is an Amazon Distinguished Scientist who in December 2025 was named to lead Amazon's frontier large language model research.^[3]^[23]^[24] His Google Scholar profile lists more than 267,000 citations, making him one of the most cited researchers in robotics and deep reinforcement learning.^[21]

Abbeel is best known for his work on apprenticeship learning, autonomous helicopter aerobatics, and a series of deep reinforcement learning methods from his group and collaborators (TRPO, GAE, HER, and domain randomization, along with PPO, which grew out of the same line of work) that have shaped modern robot learning.^[3] He completed his Ph.D. at Stanford University in 2008 as the first doctoral student of Andrew Ng, where his dissertation on apprenticeship learning enabled an autonomous helicopter to perform aerobatic maneuvers that previously required expert human pilots.^[3] He joined the Berkeley faculty later that year and has mentored a generation of researchers who went on to prominent roles in AI, including OpenAI co-founder John Schulman, Stanford professor Chelsea Finn, and Berkeley professor Sergey Levine.^[3]

In 2014 Abbeel co-founded Gradescope, an AI-assisted grading platform acquired by Turnitin in 2018.^[17] He spent 2016 to 2017 as a research scientist at OpenAI before launching the robotics company Embodied Intelligence, which rebranded as Covariant in early 2020.^[13]^[4] In August 2024 Amazon hired Abbeel, his co-founders Peter Chen and Rocky Duan, and roughly a quarter of Covariant's employees, and licensed the company's robotics foundation models in a transaction widely described as a reverse acqui-hire.^[18]^[19] Abbeel received the 2021 ACM Prize in Computing for his contributions to robot learning.^[14]

Infobox


Born	1977, Antwerp, Belgium
Nationality	Belgian-American
Education	KU Leuven (M.S. Electrical Engineering, 2000); Stanford University (Ph.D. Computer Science, 2008)
Doctoral advisor	Andrew Ng
Known for	Apprenticeship learning, autonomous helicopter aerobatics, Trust Region Policy Optimization, Proximal Policy Optimization, Generalized Advantage Estimation, Hindsight Experience Replay, domain randomization, Denoising Diffusion Probabilistic Models, Covariant
Institutions	UC Berkeley (2008 to present); OpenAI (2016 to 2017); Covariant (2017 to 2024); Amazon (2024 to present)
Notable students	John Schulman, Chelsea Finn, Aravind Srinivas; postdoc Sergey Levine
Awards	MIT Technology Review TR35 (2011), Sloan Research Fellowship (2011), NSF CAREER (2014), PECASE (2016), IEEE Fellow (2018), ACM Prize in Computing (2021), IEEE Kiyo Tomiyasu Award (2022)
Website	people.eecs.berkeley.edu/~pabbeel

Early life and education

Pieter Abbeel was born in 1977 in Antwerp, Belgium, and grew up in the suburb of Brasschaat. He pursued his undergraduate and graduate studies in electrical engineering at the Katholieke Universiteit Leuven (KU Leuven), receiving his master's degree in 2000.^[3] His early interests centered on the mathematical underpinnings of control and signal processing, which would later inform his approach to learning-based robotics.

After KU Leuven, Abbeel moved to California for doctoral studies at Stanford University. He joined the lab of Andrew Ng, who had recently arrived at Stanford as a first-year assistant professor, and became Ng's first Ph.D. student.^[3] Abbeel completed his doctorate in computer science in 2008. His dissertation, titled "Apprenticeship Learning and Reinforcement Learning with Application to Robotic Control," proposed methods for learning controllers from expert demonstrations through inverse reinforcement learning, and applied them to one of the most demanding test beds in autonomous control: a radio-controlled helicopter flying autonomously.^[3]

Working with Ng and fellow Ph.D. student Adam Coates, Abbeel built a system that learned helicopter aerobatics from demonstrations by an expert R/C pilot.^[5] The Stanford autonomous helicopter performed in-place flips and rolls, loops, hurricanes, tic-tocs, chaos, and even auto-rotation landings.^[5] The team's results were captured in the journal article "Autonomous Helicopter Aerobatics through Apprenticeship Learning" published in the International Journal of Robotics Research in 2010, and remain a landmark demonstration of learning from demonstration applied to high-dimensional, dynamically unstable systems.^[5]

Academic career at Berkeley

Abbeel joined the Department of Electrical Engineering and Computer Sciences at UC Berkeley in 2008 as an assistant professor.^[1] He founded the Berkeley Robot Learning Lab and quickly built a research program that combined classical control with then-emerging deep learning techniques. He was promoted to associate professor with tenure and became a full professor in 2017.^[1]

At Berkeley, Abbeel directs the Berkeley Robot Learning Lab and serves as a co-director of the Berkeley Artificial Intelligence Research (BAIR) Lab, the umbrella research center that brings together more than fifty faculty members across machine learning, computer vision, natural language processing, and robotics.^[1] He has also been affiliated with the Center for Human-Compatible Artificial Intelligence (CHAI), the CITRIS and Banatao Institute, and the Berkeley Institute for Data Science.^[1]

Abbeel's lab has trained many of the people who define modern AI research and the AI startup ecosystem. By his own reckoning, his former students and postdocs have founded or co-founded more than a dozen AI companies, including OpenAI, Covariant (originally Embodied Intelligence), Physical Intelligence, and Perplexity.^[2]

Career timeline

Year	Role
2000	M.S. in Electrical Engineering, KU Leuven
2008	Ph.D. in Computer Science, Stanford University, advisor Andrew Ng
2008	Assistant Professor, EECS, UC Berkeley; founded Berkeley Robot Learning Lab
2014	Co-founded Gradescope
2016 to 2017	Research Scientist, OpenAI (on leave from UC Berkeley)
2017	Promoted to Full Professor at UC Berkeley; co-founded Embodied Intelligence (later Covariant)
2018	Gradescope acquired by Turnitin
2020	Covariant emerged from stealth and launched commercial product; co-authored Denoising Diffusion Probabilistic Models
2021	Launched The Robot Brains Podcast; investment partner at AIX Ventures
2024	Joined Amazon as Distinguished Scientist after Amazon's licensing and hiring deal with Covariant
2025	Named to lead Amazon's frontier large language model research within its AGI organization (December 2025)

Research contributions

Abbeel's research program centers on giving robots the ability to learn skills from data, with a particular focus on combining reinforcement learning, imitation learning, and meta-learning to overcome the data inefficiency that has historically limited robot learning.^[2] His group has produced foundational algorithms used across both academia and industry.

Apprenticeship learning and inverse reinforcement learning

Abbeel's earliest and best-known contributions are in apprenticeship learning, the problem of training an agent to behave well by observing an expert. With Ng he developed methods for inverse reinforcement learning that infer a reward function from demonstrations, then optimize a policy with respect to that inferred reward.^[5] The techniques scaled to tasks far beyond what was then possible with hand-engineered reward functions, and they underpinned the Stanford autonomous helicopter that performed expert-level aerobatics.^[5] The research established that demonstrations can substitute for carefully designed reward signals when the underlying task is too complex for a human to specify directly.
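The reward-inference step can be illustrated with a small sketch. It assumes one common formulation, in which the reward is linear in hand-chosen state features and the learner compares discounted feature expectations of expert and learner trajectories; the article does not specify this formulation, and the function names, the one-hot features, and the discount of 0.9 are hypothetical choices for illustration only.

```python
def feature_expectations(trajectories, featurize, gamma=0.9):
    """Average discounted feature counts over a list of state trajectories."""
    dim = len(featurize(trajectories[0][0]))
    mu = [0.0] * dim
    for traj in trajectories:
        for t, state in enumerate(traj):
            phi = featurize(state)
            for i in range(dim):
                mu[i] += (gamma ** t) * phi[i]
    return [m / len(trajectories) for m in mu]


def reward_direction(mu_expert, mu_learner):
    """Unit-norm weights w so that the linear reward w . phi(s) favours the expert."""
    diff = [e - l for e, l in zip(mu_expert, mu_learner)]
    norm = sum(d * d for d in diff) ** 0.5
    return [d / norm for d in diff] if norm > 0 else diff


# Toy usage (hypothetical): two states with one-hot features.
one_hot = lambda s: [1.0 if s == i else 0.0 for i in range(2)]
mu_e = feature_expectations([[1, 1, 1]], one_hot)   # expert stays in state 1
mu_l = feature_expectations([[0, 0, 0]], one_hot)   # learner stays in state 0
w = reward_direction(mu_e, mu_l)                    # positive weight on state 1
```

In a full apprenticeship-learning loop, a policy would then be optimized against the inferred reward and the comparison repeated; that outer loop is omitted here.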

Trust Region Policy Optimization (TRPO)

In 2015 Abbeel and his Ph.D. student John Schulman published Trust Region Policy Optimization at the International Conference on Machine Learning.^[6] The full author list was John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan, and Pieter Abbeel.^[6] TRPO formulated policy gradient updates as a constrained optimization problem with a trust region defined by the Kullback-Leibler divergence between successive policies, providing approximate monotonic improvement guarantees.^[6] The algorithm became a workhorse for continuous control tasks and inspired the next generation of policy optimization methods.
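The constrained problem is usually written in the following form, given here as a reference sketch. The symbols are notation assumed for this illustration: \pi_\theta is the policy, \theta_{\text{old}} the parameters before the update, A the advantage function, and \delta the trust-region radius.

```latex
\begin{aligned}
\max_{\theta}\quad
  & \mathbb{E}_{s,a \sim \pi_{\theta_{\text{old}}}}
    \left[ \frac{\pi_\theta(a \mid s)}{\pi_{\theta_{\text{old}}}(a \mid s)}\,
           A^{\pi_{\theta_{\text{old}}}}(s,a) \right] \\
\text{subject to}\quad
  & \mathbb{E}_{s \sim \pi_{\theta_{\text{old}}}}
    \left[ D_{\mathrm{KL}}\!\left( \pi_{\theta_{\text{old}}}(\cdot \mid s)
           \,\middle\|\, \pi_\theta(\cdot \mid s) \right) \right] \le \delta
\end{aligned}
```

The constraint keeps each new policy close to the previous one, which is what supports the approximate monotonic improvement argument.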

Generalized Advantage Estimation (GAE)

The same Berkeley group introduced Generalized Advantage Estimation in 2015 ("High-Dimensional Continuous Control Using Generalized Advantage Estimation," Schulman, Moritz, Levine, Jordan, Abbeel; ICLR 2016).^[7] GAE provides an exponentially weighted estimator of the advantage function that interpolates between high-bias one-step estimates and high-variance Monte Carlo returns, dramatically improving the stability of policy gradient learning.^[7] GAE remains a standard component of modern actor-critic and PPO implementations.
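The estimator can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: it assumes a single finite trajectory, a `values` list with one extra bootstrap entry at the end, and illustrative defaults for the discount `gamma` and the interpolation parameter `lam`.

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation for one trajectory.

    rewards: r_0 .. r_{T-1}
    values:  V(s_0) .. V(s_T), i.e. one more entry than rewards
             (the last entry is the bootstrap value, 0.0 for a terminal state).
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Work backwards so each step reuses the discounted sum of later TD errors.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]  # one-step TD error
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


# lam=0 recovers the one-step TD error; lam=1 recovers Monte Carlo return minus V.
td_only = gae_advantages([1.0, 1.0, 1.0], [0.5, 0.5, 0.5, 0.0], gamma=1.0, lam=0.0)
monte_carlo = gae_advantages([1.0, 1.0, 1.0], [0.5, 0.5, 0.5, 0.0], gamma=1.0, lam=1.0)
```

Intermediate values of `lam` trade bias against variance between those two extremes.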

Proximal Policy Optimization (PPO)

In July 2017 Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov published Proximal Policy Optimization at OpenAI.^[8] PPO simplified TRPO by replacing the hard trust region constraint with a clipped surrogate objective that could be optimized with first-order methods.^[8] While Abbeel was not an author on the PPO paper itself, the algorithmic lineage came directly out of his Berkeley group's earlier TRPO and GAE work. PPO has become one of the most widely used reinforcement learning algorithms, and it has been widely used in the reinforcement learning from human feedback pipelines that train modern large language models.
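The clipped surrogate objective can be written down directly. The sketch below is illustrative only: it takes precomputed probability ratios (new policy over old policy) and advantage estimates as plain lists, and the clipping range `epsilon=0.2` is a default assumed for the example.

```python
def ppo_clip_objective(ratios, advantages, epsilon=0.2):
    """Mean clipped surrogate objective (to be maximized).

    ratios:     pi_new(a|s) / pi_old(a|s) for each sampled (s, a)
    advantages: advantage estimate for each sampled (s, a)
    """
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped = max(1.0 - epsilon, min(r, 1.0 + epsilon))
        # Taking the minimum removes any incentive to push the ratio
        # outside [1 - epsilon, 1 + epsilon] in the direction that helps.
        total += min(r * a, clipped * a)
    return total / len(ratios)


gain_capped = ppo_clip_objective([1.5], [1.0])    # ratio capped at 1.2
loss_kept = ppo_clip_objective([0.5], [-1.0])     # pessimistic branch: 0.8 * -1.0
```

Because the objective is an ordinary differentiable expression, it can be optimized with standard first-order methods rather than the constrained solver TRPO requires.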

Hindsight Experience Replay (HER)

During his year at OpenAI, Abbeel co-authored "Hindsight Experience Replay" (Andrychowicz, Wolski, Ray, Schneider, Fong, Welinder, McGrew, Tobin, Abbeel, Zaremba; NeurIPS 2017).^[9] HER tackles the sparse-reward problem in goal-directed RL by relabeling failed trajectories with the goals that were actually achieved, treating every roll-out as a successful demonstration of some goal.^[9] The method made it possible to train robotic manipulation policies on tasks like pushing, sliding, and pick-and-place using only binary success rewards.
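The relabeling idea can be sketched as follows. This is a simplified illustration that assumes a "final" strategy, in which every transition of an episode is relabeled with the goal achieved at the episode's last step; the tuple layout, the function names, and the 0 / -1 sparse reward convention are assumptions made for the example.

```python
def sparse_reward(achieved_goal, goal):
    """Binary success signal: 0.0 when the goal is met, -1.0 otherwise."""
    return 0.0 if achieved_goal == goal else -1.0


def relabel_with_final_goal(episode, reward_fn=sparse_reward):
    """Relabel a failed episode with the goal it actually reached.

    episode: list of (state, action, achieved_goal, original_goal) tuples.
    Returns a list of (state, action, new_goal, reward) tuples for the replay buffer.
    """
    final_goal = episode[-1][2]
    relabeled = []
    for state, action, achieved_goal, _original_goal in episode:
        reward = reward_fn(achieved_goal, final_goal)
        relabeled.append((state, action, final_goal, reward))
    return relabeled


# The agent aimed for position 5 but only reached position 2.
episode = [(0, "right", 1, 5), (1, "right", 2, 5)]
replay = relabel_with_final_goal(episode)
```

Under the original goal every reward in this episode would be -1.0; after relabeling, the last transition carries a success signal the learner can use.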

Domain randomization

The same OpenAI period produced "Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World" (Tobin, Fong, Ray, Schneider, Zaremba, Abbeel; IROS 2017).^[10] Domain randomization addresses the sim-to-real gap by randomizing visual and physical parameters in simulation so that the real world appears to the learned model as just another randomization.^[10] The approach became a cornerstone technique for training vision-based controllers in simulation and deploying them on physical robots.
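In code, the technique amounts to drawing a fresh set of simulator parameters for every training episode. The sketch below is a hypothetical illustration: the parameter names and ranges are invented for the example and are not taken from the paper, and a real pipeline would pass the sampled values to a renderer or physics engine.

```python
import random

# Hypothetical parameter ranges; real systems choose these per simulator.
PARAM_RANGES = {
    "light_intensity": (0.2, 2.0),
    "camera_yaw_deg": (-10.0, 10.0),
    "object_hue": (0.0, 1.0),
    "friction": (0.5, 1.5),
}


def sample_sim_params(rng, ranges=PARAM_RANGES):
    """Draw one independent uniform sample for every simulator parameter."""
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}


rng = random.Random(0)  # seeded so the sketch is reproducible
episodes = [sample_sim_params(rng) for _ in range(3)]  # one configuration per episode
```

If the ranges are wide enough to cover real-world conditions, a model trained across all of these variations has a better chance of treating the real world as one more sample.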

Model-Agnostic Meta-Learning (MAML)

In 2017 Abbeel and his student Chelsea Finn introduced Model-Agnostic Meta-Learning at ICML (Finn, Abbeel, Levine, 2017).^[11] MAML formulates few-shot learning as finding initial parameters that can be quickly adapted to a new task with just a few gradient steps.^[11] The framework is model-agnostic, in the sense that it is compatible with any model trained by gradient descent, and it applies to supervised learning, reinforcement learning, and beyond.^[11] MAML became a defining algorithm of the modern meta-learning literature and continues to inspire follow-up work in personalization, federated learning, and rapid robot adaptation.
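A toy version makes the two nested gradient steps concrete. The sketch assumes a deliberately simple setting that is not from the paper: a single scalar parameter and tasks of the form loss(theta) = (theta - c)^2, so that both the inner adaptation step and the outer meta-gradient can be written analytically without an autodiff library. The function name and learning rates are illustrative.

```python
def maml_scalar(task_targets, theta=0.0, inner_lr=0.1, outer_lr=0.1, steps=200):
    """MAML on scalar tasks with loss_c(theta) = (theta - c) ** 2."""
    for _ in range(steps):
        meta_grad = 0.0
        for c in task_targets:
            # Inner loop: one gradient step adapts theta to task c.
            adapted = theta - inner_lr * 2.0 * (theta - c)
            # Outer loop: differentiate the post-adaptation loss with respect
            # to the initial theta, through the inner step (chain rule).
            d_adapted_d_theta = 1.0 - 2.0 * inner_lr
            meta_grad += 2.0 * (adapted - c) * d_adapted_d_theta
        theta -= outer_lr * meta_grad / len(task_targets)
    return theta


# For two symmetric quadratic tasks the best initialization is their midpoint.
theta_star = maml_scalar([-1.0, 3.0])
```

The factor `d_adapted_d_theta` is what distinguishes the meta-gradient from ordinary multi-task training: the initialization is scored by how well it performs after adaptation, not before.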

Denoising Diffusion Probabilistic Models (DDPM)

In 2020 Abbeel co-authored "Denoising Diffusion Probabilistic Models" with his Berkeley Ph.D. students Jonathan Ho and Ajay Jain, published at NeurIPS 2020.^[23] The paper showed that diffusion models, a class of latent variable models inspired by nonequilibrium thermodynamics, could match or exceed the image quality of generative adversarial networks by training on a simplified weighted variational bound connected to denoising score matching.^[23] On unconditional CIFAR-10 the model achieved an Inception score of 9.46 and a state-of-the-art Frechet Inception Distance of 3.17, and it produced sample quality on 256x256 LSUN comparable to ProgressiveGAN.^[23] The authors reported that they "present high quality image synthesis results using diffusion probabilistic models," a result that launched the modern diffusion-model wave underpinning later systems such as Stable Diffusion, DALL-E 2, and Imagen.^[23] DDPM has become one of the most cited papers in generative modeling and is the most direct demonstration that Abbeel's research footprint extends well beyond reinforcement learning into generative AI.
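The forward (noising) half of a diffusion model is simple enough to sketch with the standard library. The example assumes a linear noise schedule over 1,000 steps and the closed-form expression x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps; the function names are hypothetical, and the learned denoising network that predicts eps is omitted entirely.

```python
import math
import random


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, linearly spaced from beta_start to beta_end."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """abar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, product = [], 1.0
    for beta in betas:
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars


def noisy_sample(x0, t, alpha_bars, rng):
    """Draw x_t given x_0 in one shot; returns (x_t, eps)."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return xt, eps


alpha_bars = cumulative_alphas(linear_beta_schedule())
xt, eps = noisy_sample([1.0, -2.0], 500, alpha_bars, random.Random(0))
```

Training then reduces to regressing a network's output onto `eps` given `xt` and `t`, which is the simplified objective the paragraph above refers to.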

Decision Transformer

In 2021 Abbeel co-authored "Decision Transformer: Reinforcement Learning via Sequence Modeling" (Chen, Lu, Rajeswaran, Lee, Grover, Laskin, Abbeel, Srinivas, Mordatch; NeurIPS 2021).^[12] The paper recast offline reinforcement learning as a conditional sequence modeling problem, training a transformer on trajectories conditioned on the desired return.^[12] The Decision Transformer reframing has been highly influential in connecting modern transformer architectures with sequential decision making.
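The data preparation behind this framing can be sketched without any neural network. The example below assumes undiscounted returns-to-go and a (return, state, action) token order per timestep; the function names and the string tags are hypothetical, and the transformer that consumes the sequence is not shown.

```python
def returns_to_go(rewards):
    """Sum of rewards from each timestep to the end of the trajectory."""
    result, running = [], 0.0
    for reward in reversed(rewards):
        running += reward
        result.append(running)
    return result[::-1]


def interleave_tokens(rewards, states, actions):
    """Flatten a trajectory into (return-to-go, state, action) triples."""
    sequence = []
    for rtg, state, action in zip(returns_to_go(rewards), states, actions):
        sequence.extend([("return", rtg), ("state", state), ("action", action)])
    return sequence


tokens = interleave_tokens([1.0, 0.0, 2.0], ["s0", "s1", "s2"], ["a0", "a1", "a2"])
```

At test time the first return token is set to the desired return, and the model is asked to predict the action tokens that follow.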

Other notable work

Abbeel's group also contributed to deep Q-learning research, robot manipulation policies that learn end-to-end from pixels, large-scale video pretraining for robotics, diffusion models for vision and control, and self-supervised representation learning for RL.^[21] The Berkeley Robot Learning Lab has, over the years, demonstrated robots that fold towels, tie surgical sutures, sort socks, and perform delicate assembly tasks based on learned models rather than scripted controllers.

What did Pieter Abbeel do at OpenAI?

In April 2016 OpenAI announced that Abbeel was joining the new research lab, taking a leave from his Berkeley faculty position.^[13] The announcement noted that "Pieter and his lab members have been responsible for some of the most striking advances in robot learning and deep reinforcement learning in recent years," and described a plan to combine unsupervised learning with reinforcement learning to address fundamental limitations in then-current RL algorithms.^[13] He worked at OpenAI as a research scientist focused on robotics and deep reinforcement learning, contributing to the foundational papers on Hindsight Experience Replay and domain randomization mentioned above.^[9]^[10] Abbeel left OpenAI in 2017 to start a robotics company that would eventually become Covariant.^[4]

Many of his former students and collaborators stayed at OpenAI for years afterward, including John Schulman, one of OpenAI's co-founders, who led much of the company's reinforcement learning research before later joining other AI labs.

What is Covariant?

In 2017 Abbeel co-founded a Berkeley-based robotics startup originally named Embodied Intelligence and later rebranded as Covariant.^[4] His co-founders were three former Berkeley researchers: Peter Chen (CEO), Rocky Duan (CTO), and Tianhao Zhang.^[4] Abbeel served as president and chief scientist while remaining a professor at Berkeley.^[4]

Covariant builds AI software for industrial robots, with an early focus on the warehouse pick-and-place problem of identifying, grasping, and packing arbitrary items at high speed.^[4] The company's flagship product, Covariant Brain, runs on top of off-the-shelf industrial arms from manufacturers such as ABB, Knapp, and Kuka.^[4] The company emerged from stealth in January 2020 with deployments at customers including the German electrical supplies wholesaler Obeta.^[4] On March 11, 2024, Covariant introduced RFM-1, a robotics foundation model with 8 billion parameters trained on text, images, video, robot actions, and physical measurements as a multimodal any-to-any sequence model, framing the warehouse work as a stepping stone toward general-purpose robot intelligence.^[4]^[25]

Funding history

Round	Date	Amount	Lead investors
Seed	2017	approximately $7 million	Amplify Partners and others
Series A	2018	approximately $20 million	Index Ventures (with returning seed investors)
Series B	May 2020	$40 million	Index Ventures
Series C	July 2021	$80 million	Index Ventures and Radical Ventures (returning), with Temasek and Canada Pension Plan Investment Board (new)
Series C extension	April 2023	$75 million	Radical Ventures and Index Ventures, with Gates Frontier, AIX Ventures, Northgate Capital

Total disclosed venture funding through 2023 was approximately $222 million.^[4]

What was the Amazon licensing and hiring deal (August 2024)?

On August 30, 2024, Amazon announced that it had entered into a non-exclusive license for Covariant's robotics foundation models and had hired Pieter Abbeel, Peter Chen, Rocky Duan, and roughly a quarter of Covariant's employees.^[18]^[19] Abbeel joined Amazon as a Distinguished Scientist while keeping his professorship at Berkeley.^[18] The transaction was widely described in the press as a "reverse acqui-hire," a structure that allows a large company to take key personnel and license core technology without triggering the antitrust scrutiny that a full acquisition would attract.^[19] According to a 2025 whistleblower complaint subsequently reported in the press, the deal was valued at approximately $380 million up front with an additional $20 million licensing fee due one year after closing, well below Covariant's prior venture valuation.

In December 2025, as part of a broad reorganization of Amazon's AI leadership, Abbeel was named to lead Amazon's frontier large language model research within its AGI organization while continuing his robotics work.^[24] Covariant continued operating after the 2024 deal, with former chief operating officer Ted Stinson becoming CEO and co-founder Tianhao Zhang remaining at the company.^[18] The deal followed a similar pattern of reverse acqui-hires by large AI companies in 2024, including Microsoft's hiring of much of Inflection AI's leadership and Google's hiring of much of Character.AI's leadership.^[19]

What is Gradescope?

In 2014, while teaching large undergraduate AI classes at Berkeley, Abbeel co-founded Gradescope with three of his graduate students and former teaching assistants: Arjun Singh, Sergey Karayev, and Ibrahim Awwal.^[17] The company began as a tool called Pandagrader that used computer vision and machine learning to streamline the grading of handwritten exams and homework.^[17] Gradescope grew quickly inside higher education, eventually serving more than 600 institutions including over half of Ivy League universities and most top R1 research universities.^[17]

On October 5, 2018, Turnitin acquired Gradescope.^[17] The acquisition marked Turnitin's first formal expansion into STEM education tools.^[17] Gradescope continued to operate as a Turnitin brand after the acquisition.

Investments and advisory roles

In 2021 Abbeel joined the venture capital firm AIX Ventures as an investing partner. AIX is an early-stage venture firm focused on AI-first startups. Abbeel has also served as an advisor to a range of AI companies and has appeared in funding rounds for AI startups founded by his students and former colleagues. He is also affiliated with The House Fund, a Berkeley-focused early-stage venture firm.

Teaching

Abbeel teaches graduate and undergraduate courses on robotics and AI at Berkeley.^[1] His signature course is CS287, "Advanced Robotics," which covers planning, control, estimation, and learning for robotic systems.^[1] He has also taught Berkeley's deep unsupervised learning course (CS294-158) and helped launch the Berkeley deep reinforcement learning course (CS294-112), the predecessor to the present-day CS285 course taught by Sergey Levine.^[1] Many of his lecture videos and course materials are publicly available, and he has contributed to online courses on edX and through his podcast.

Notable students and postdocs

Abbeel's lab has produced an unusually large number of researchers who went on to found or lead influential AI organizations.^[3]

Name	Role with Abbeel	Subsequent affiliation
John Schulman	Ph.D. student (graduated 2016)	Co-founder of OpenAI; later joined Anthropic, then a new venture
Chelsea Finn	Ph.D. student (graduated 2018), co-advised with Sergey Levine	Assistant professor at Stanford; co-founder of Physical Intelligence
Sergey Levine	Postdoctoral researcher	Professor at UC Berkeley; co-founder of Physical Intelligence
Aravind Srinivas	Ph.D. student	Co-founder and CEO of Perplexity AI
Jonathan Ho	Ph.D. student	Co-author of Denoising Diffusion Probabilistic Models; researcher at Google
Igor Mordatch	Postdoctoral researcher	Research scientist at OpenAI, then Google DeepMind
Aviv Tamar	Ph.D. student / postdoc	Faculty at Technion
Peter Chen	Ph.D. student	Co-founder and former CEO of Covariant
Rocky Duan	Ph.D. student	Co-founder and former CTO of Covariant
Tianhao Zhang	Researcher in lab	Co-founder of Covariant
Marvin Zhang, Lerrel Pinto, Coline Devin, Roberto Calandra, Carlos Florensa	Group members	Faculty positions and research roles across academia and industry

Abbeel has often noted in interviews that the success of his former students reflects Berkeley's broader graduate culture as much as any individual mentorship style.

Awards and honors

Year	Award
2008	Stanford SAIL Outstanding Paper Award
2011	MIT Technology Review TR35 (Innovators Under 35)
2011	Sloan Research Fellowship
2011	Office of Naval Research Young Investigator Program (ONR YIP)
2014	NSF CAREER Award
2014	DARPA Young Faculty Award
2016	Presidential Early Career Award for Scientists and Engineers (PECASE)
2016	Air Force Office of Scientific Research Young Investigator Program (AFOSR YIP)
2018	IEEE Fellow, for contributions to apprenticeship and reinforcement learning for robotics and autonomous systems
2021	ACM Prize in Computing, for contributions to robot learning, including learning from demonstrations and deep reinforcement learning for robotic control
2022	IEEE Kiyo Tomiyasu Award

Abbeel has also won multiple best paper awards at top venues, including ICML, NeurIPS, ICLR, ICRA, IROS, and CoRL.

Public engagement

In March 2021 Abbeel launched The Robot Brains Podcast, a weekly show in which he interviews leading researchers and entrepreneurs in AI and robotics.^[20] The first episode featured Andrej Karpathy, then director of AI and Autopilot Vision at Tesla.^[20] Subsequent guests have included Yann LeCun, Geoffrey Hinton, Daphne Koller, Demis Hassabis, Ilya Sutskever, Fei-Fei Li, and Jeff Dean among many others. The podcast is produced in partnership with Covariant and is distributed across major podcast platforms and on YouTube.

Abbeel is also an active public speaker. He has delivered keynotes at NeurIPS, ICML, ICLR, ICRA, IROS, and the World Economic Forum, and his work has been covered in The New York Times, The Wall Street Journal, BBC, Wired, Bloomberg, MIT Technology Review, and many other outlets.

Personal life

Abbeel lives in the San Francisco Bay Area with his family. He holds dual Belgian and American citizenship and speaks fluent Dutch and English.

Influence and legacy

Abbeel is one of the central figures in the modern field of deep reinforcement learning for robotics. The lineage of policy optimization algorithms that runs from his Berkeley group's TRPO and GAE papers in 2015, through the OpenAI PPO paper in 2017, into the modern reinforcement learning from human feedback pipelines that align large language models, traces directly back to research conducted in his lab.^[6]^[8] His work on apprenticeship learning and inverse reinforcement learning helped to define the modern approach to learning controllers from demonstrations rather than from hand-engineered reward signals. His 2020 DDPM paper similarly seeded the diffusion-model revolution that now drives most state-of-the-art image, video, and audio generation.^[23]

Beyond his own publications, Abbeel's most lasting contribution may be the people he has trained. The list of his former students and postdocs reads like a who's who of contemporary AI research and entrepreneurship: John Schulman co-founded OpenAI and led its reinforcement learning research; Sergey Levine became one of the most prolific reinforcement learning researchers in the world as a Berkeley professor; Chelsea Finn carried meta-learning research to Stanford and on to Physical Intelligence, a robotics startup developing a general-purpose robot foundation model; Aravind Srinivas built the AI search engine Perplexity.^[3] Many of these researchers have themselves trained subsequent generations of students, multiplying Abbeel's intellectual influence on the field.

Abbeel's commercial work has also reshaped the robotics industry. Covariant was an early proponent of foundation models for robotic manipulation, and its 2024 licensing and hiring deal with Amazon placed his team inside one of the largest commercial robotics deployments in the world, Amazon's warehouse operations.^[18] Together with the wave of AI-first robotics companies founded by his former students, Abbeel's work has helped move robot learning from a research curiosity to a multi-billion-dollar industry.

References

1. Pieter Abbeel, faculty page, EECS, UC Berkeley. https://www2.eecs.berkeley.edu/Faculty/Homepages/abbeel.html
2. Pieter Abbeel, personal homepage. https://people.eecs.berkeley.edu/~pabbeel/
3. "Pieter Abbeel," Wikipedia. https://en.wikipedia.org/wiki/Pieter_Abbeel
4. "Covariant (company)," Wikipedia. https://en.wikipedia.org/wiki/Covariant_(company)
5. Pieter Abbeel, Adam Coates, and Andrew Y. Ng, "Autonomous Helicopter Aerobatics through Apprenticeship Learning," International Journal of Robotics Research, 2010. https://journals.sagepub.com/doi/abs/10.1177/0278364910371999
6. John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan, and Pieter Abbeel, "Trust Region Policy Optimization," ICML 2015. https://arxiv.org/abs/1502.05477
7. John Schulman, Philipp Moritz, Sergey Levine, Michael I. Jordan, and Pieter Abbeel, "High-Dimensional Continuous Control Using Generalized Advantage Estimation," ICLR 2016. https://arxiv.org/abs/1506.02438
8. John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov, "Proximal Policy Optimization Algorithms," 2017. https://arxiv.org/abs/1707.06347
9. Marcin Andrychowicz et al., "Hindsight Experience Replay," NeurIPS 2017. https://arxiv.org/abs/1707.01495
10. Josh Tobin et al., "Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World," IROS 2017. https://arxiv.org/abs/1703.06907
11. Chelsea Finn, Pieter Abbeel, and Sergey Levine, "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks," ICML 2017. https://arxiv.org/abs/1703.03400
12. Lili Chen et al., "Decision Transformer: Reinforcement Learning via Sequence Modeling," NeurIPS 2021. https://arxiv.org/abs/2106.01345
13. "Welcome, Pieter and Shivon!," OpenAI, April 2016. https://openai.com/index/welcome-pieter-and-shivon/
14. "UC Berkeley's Pieter Abbeel receives 2021 ACM Prize in Computing," ACM Awards. https://awards.acm.org/about/2021-acm-prize
15. "Pieter Abbeel and Sanjit Seshia elected 2018 IEEE Fellows," Berkeley EECS, December 2017. https://eecs.berkeley.edu/news/pieter-abbeel-and-sanjit-seshia-elected-2018-ieee-fellows/
16. "Berkeley robot learning pioneer Pieter Abbeel wins ACM Prize in Computing," Berkeley Engineering, April 2022. https://engineering.berkeley.edu/news/2022/04/berkeley-robot-learning-pioneer-pieter-abbeel-wins-acm-prize-in-computing/
17. "Turnitin Acquires Gradescope," Berkeley EECS, October 2018. https://eecs.berkeley.edu/news/turnitin-acquires-gradescope/
18. "Amazon hires the founders of AI robotics startup Covariant," TechCrunch, August 31, 2024. https://techcrunch.com/2024/08/31/amazon-hires-the-founders-of-robotics-ai-startup-covariant/
19. "Amazon hires Covariant founders, inks AI licensing deal," GeekWire, August 30, 2024. https://www.geekwire.com/2024/amazon-hires-covariant-founders-inks-licensing-deal-with-robotics-ai-startup-in-latest-reverse-acquihire-deal/
20. "Listen to The Robot Brains, Pieter Abbeel's New Podcast," CHAI, March 2021. https://humancompatible.ai/news/2021/03/31/listen-to-the-robot-brains-pieter-abbeels-new-podcast/
21. Pieter Abbeel, Google Scholar profile. https://scholar.google.com/citations?user=vtwH6GkAAAAJ
22. Pieter Abbeel, Amazon Science author page. https://www.amazon.science/author/pieter-abbeel
23. Jonathan Ho, Ajay Jain, and Pieter Abbeel, "Denoising Diffusion Probabilistic Models," NeurIPS 2020. https://arxiv.org/abs/2006.11239
24. "Amazon AI chief Rohit Prasad leaving; Infrastructure exec Peter DeSantis to lead unified AI group," GeekWire, December 2025. https://www.geekwire.com/2025/amazon-ai-chief-rohit-prasad-leaving-infrastructure-exec-peter-desantis-to-lead-unified-ai-group/
25. "Covariant Introduces RFM-1 to Give Robots the Human-like Ability to Reason," Covariant, March 11, 2024. https://covariant.ai/covariant-introduces-rfm-1-to-give-robots-the-human-like-ability-to-reason/
