Sergey Levine

Updated Jun 23, 2026

Sergey Levine is an American computer scientist, associate professor of electrical engineering and computer sciences at the University of California, Berkeley, and a co-founder of Physical Intelligence, a San Francisco startup building general-purpose foundation models for robotics. He is one of the most cited researchers in deep reinforcement learning and robot learning: his Google Scholar profile lists more than 255,000 total citations, an h-index of 204, and an i10-index of 610 as of mid-2026 ^[4]. Levine directs Berkeley's Robotic AI and Learning (RAIL) Lab and is widely associated with the modern push to apply large neural networks to real-world robot control, from reinforcement learning algorithms such as soft actor-critic to the pi-zero family of vision-language-action models.

Levine joined Berkeley's Department of Electrical Engineering and Computer Sciences in fall 2016 after a PhD at Stanford University and a postdoc at Berkeley ^[1]^[2]. He is a member of Berkeley AI Research (BAIR), and his graduate course CS285 (Deep Reinforcement Learning) is a standard reference for the field through its publicly available video lectures ^[18]. His group has produced influential algorithms in policy search, off-policy reinforcement learning, offline reinforcement learning, meta-learning, and imitation learning.

In March 2024 Levine helped found Physical Intelligence with Karol Hausman, Chelsea Finn, Brian Ichter, Lachy Groom, Adnan Esmail, and Quan Vuong ^[19]. The company released the pi-0 model in October 2024, an open-weights vision-language-action model for general robot control ^[8]^[10], and raised a $400 million Series A in November 2024 at a $2.4 billion post-money valuation in a round led by Jeff Bezos and Thrive Capital ^[5]^[6]. By November 2025 the company had raised a further $600 million at a $5.6 billion valuation in a round led by Alphabet's growth fund CapitalG ^[7].

Infobox


Full name	Sergey Vladimir Levine
Citizenship	United States
Education	Stanford University (BS, MS, 2009; PhD, 2014)
Doctoral advisor	Vladlen Koltun
Doctoral thesis	Motor Skill Learning with Local Trajectory Methods (2014)
Postdoctoral advisor	Pieter Abbeel (UC Berkeley)
Known for	Guided policy search; soft actor-critic; conservative Q-learning; QT-Opt; pi-0 / pi-0.5; CS285 Deep RL course
Fields	Machine learning, reinforcement learning, robotics, computer vision
Citations	255,000+ (h-index 204, i10-index 610), Google Scholar, mid-2026
Institutions	UC Berkeley (2016 to present); Physical Intelligence (2024 to present); Google Brain (research scientist, 2015 to 2016)
Lab	Robotic AI and Learning (RAIL) Lab; Berkeley AI Research (BAIR)
Notable students	Chelsea Finn, Aviral Kumar, Tuomas Haarnoja, Karol Hausman, Abhishek Gupta, Karl Pertsch
Notable awards	Sloan Research Fellowship (2019); Presidential Early Career Award for Scientists and Engineers (2024)
Website	people.eecs.berkeley.edu/~svlevine

Early life and education

Levine was raised in the United States and developed an early interest in computer graphics and animation, which led him to combine work on character animation with machine learning during his undergraduate years at Stanford University. He received both a Bachelor of Science and a Master of Science in computer science from Stanford in 2009.

He stayed at Stanford for doctoral work in the same department, joining the research group of Vladlen Koltun, then a faculty member in the Stanford computer science and graphics communities. His dissertation, titled "Motor Skill Learning with Local Trajectory Methods," was completed in 2014 ^[3]. The thesis introduced and developed the family of guided policy search algorithms, which decompose policy learning by alternating between trajectory optimization (used to handle exploration and to provide good local solutions) and supervised policy learning that distills the trajectories into a parametric policy. The framework allowed deep neural network policies to be trained with relatively low sample complexity by exploiting model knowledge or trajectory optimizers as teachers, an approach that proved influential in subsequent robot learning research.

Following his PhD, Levine moved to Berkeley as a postdoctoral researcher in the lab of Pieter Abbeel, who at that time was a leading figure in apprenticeship learning and deep reinforcement learning for robotics. During the postdoc he extended guided policy search to vision-based control of real robotic arms, producing the well-known 2016 paper "End-to-End Training of Deep Visuomotor Policies," which showed that convolutional neural network policies could be trained directly from camera pixels and joint encoders to perform contact-rich manipulation tasks such as screwing on a bottle cap or hanging a clothes hanger ^[3].

For approximately one year before joining the Berkeley faculty, Levine was a research scientist on the Google Brain team in Mountain View, where he continued work on large-scale robot learning and contributed to the early Brain robotics research that eventually fed into Google's broader robotics learning agenda.

Academic career

Levine joined the Department of Electrical Engineering and Computer Sciences at UC Berkeley as an assistant professor in fall 2016 and was later promoted to associate professor with tenure ^[1]. He is a member of Berkeley AI Research (BAIR) and the director of the Robotic AI and Learning (RAIL) Lab, which has been one of the most prolific groups in deep reinforcement learning and robot learning of the past decade.

The RAIL group's research agenda spans algorithms for online and offline reinforcement learning, learning from demonstrations, large-scale robot data collection, foundation models for control, and the use of large pretrained models from vision and language as priors for embodied agents. Many of the methods that became standard tools in the deep RL community, including soft actor-critic, conservative Q-learning, model-agnostic meta-learning, and the modern formulation of behavior cloning baselines for offline RL, have roots in the RAIL Lab and its collaborators.

Levine is widely known in the field for the publicly available course materials of CS285 (formerly CS294-112), Berkeley's graduate course on deep reinforcement learning. Lecture videos for the course are released each year on YouTube, and the slides and homework assignments are posted on the course website at rail.eecs.berkeley.edu/deeprlcourse ^[18]. The course covers policy gradients, value-based methods, actor-critic algorithms, model-based RL, exploration, inverse RL, meta-learning, and offline reinforcement learning, and is widely used by graduate students and self-learners outside Berkeley as a primary introduction to the field.

His administrative and service roles include program committee work and area chair positions for NeurIPS, ICML, ICLR, and the Conference on Robot Learning (CoRL), which he helped shape during its early years as one of the venues most closely associated with the deep robot learning community.

How cited is Sergey Levine?

Levine is among the most cited researchers in artificial intelligence. As of mid-2026 his Google Scholar profile reports 255,563 total citations, an h-index of 204, and an i10-index of 610 ^[4]. His three most-cited works are "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks" (MAML, 2017) with roughly 19,800 citations, "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor" (2018) with roughly 16,200 citations, and "Trust Region Policy Optimization" (2015) with roughly 12,000 citations ^[4]. An h-index of 204 means that at least 204 of his papers have each been cited 204 or more times.
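The two indices quoted above are simple functions of a per-paper citation list. The sketch below shows how they are computed; the function names and the toy citation counts are illustrative assumptions and are not taken from Levine's actual profile.

```python
def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations):
    """Number of papers with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


# Hypothetical toy profile, not real data.
toy_counts = [50, 18, 12, 10, 9, 3, 0]
print(h_index(toy_counts), i10_index(toy_counts))  # prints "5 4"
```

In the toy profile, five papers have at least five citations each but there are not six papers with six or more, so the h-index is 5; four papers reach ten citations, so the i10-index is 4.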

Industry roles

Google Brain

Before joining Berkeley, Levine spent approximately a year as a full-time research scientist on the Google Brain team. After moving to Berkeley, he continued as a part-time visiting researcher and external collaborator with Google Brain (and later Google DeepMind) for several years, contributing to large-scale robot learning efforts including arm farms (the Google robot manipulation collective for self-supervised grasping data collection) and the RT (Robotics Transformer) line of work that built up to RT-1 and RT-2 ^[14]^[15].

Physical Intelligence

In March 2024, Levine co-founded Physical Intelligence (often abbreviated PI or written as the Greek letter pi) in San Francisco ^[19]. The founding team included Karol Hausman (CEO, formerly staff research scientist at Google DeepMind and adjunct faculty at Stanford), Chelsea Finn (Stanford associate professor and former Levine student), Brian Ichter (formerly Google Brain robotics), Adnan Esmail, Quan Vuong (formerly Google DeepMind), and Lachy Groom (former Stripe executive turned investor, who serves as president). Levine's role at the company is research-focused, often described as co-founder and a leader of model design.

The company states: "Our mission at Physical Intelligence is to develop foundation models that can control any robot to perform any task" ^[10]. Its first publicly released model, pi-0 ("pi-zero"), was announced in October 2024 ^[8]^[10].

Physical Intelligence raised an early seed round of approximately $70 million led by Thrive Capital in early 2024, then closed a $400 million Series A in November 2024 at a $2.4 billion post-money valuation ^[5]^[6]. The Series A was led by Jeff Bezos (through his personal investment vehicle) and Thrive Capital, with participation from Lux Capital, Bond, Redpoint Ventures, Sequoia Capital, and OpenAI ^[6]. CNBC and PitchBook reported the round as one of the largest early-stage AI rounds of 2024 outside of pure language model labs ^[5]^[6].

In November 2025, Physical Intelligence raised an additional $600 million at a $5.6 billion valuation in a round led by Alphabet's independent growth fund CapitalG, with participation from existing backers including Thrive Capital, Lux Capital, and Jeff Bezos, plus new investors Index Ventures and T. Rowe Price ^[7]. The company's total funding reached roughly $1.1 billion. Levine has continued to publish under the Berkeley affiliation while leading model design work at PI.

Research contributions

Levine's research spans algorithms, systems, and applications of machine learning to decision making and control. The list below covers his most influential lines of work.

Guided policy search and end-to-end visuomotor learning

The earliest body of work for which Levine is well known is guided policy search (GPS), introduced in his 2013 ICML paper with Vladlen Koltun and developed extensively in his Stanford thesis ^[3]. GPS reframes policy search as a constrained optimization that alternates between (a) using trajectory optimization or model-based RL to produce locally optimal trajectory distributions and (b) supervised learning to fit a global parametric policy to those trajectories. The method enabled training deep neural network policies with much lower sample complexity than naive policy gradient methods.
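The two ingredients of GPS can be illustrated on a toy problem. The sketch below is a heavily simplified assumption, not the algorithm from the paper: it uses a scalar linear system with quadratic cost (all constants hypothetical), a finite-horizon LQR solver as the trajectory optimizer, and a least-squares fit as the supervised step. It performs a single pass and omits the constraint and alternation that keep the trajectories and the policy in agreement.

```python
def lqr_gains(a, b, q, r, horizon):
    """Backward Riccati recursion for the scalar system x' = a*x + b*u
    with per-step cost q*x^2 + r*u^2. Returns time-varying gains K_t."""
    p = q  # terminal cost-to-go coefficient
    gains = []
    for _ in range(horizon):
        k = (a * b * p) / (r + b * b * p)
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        gains.append(k)
    gains.reverse()  # gains[t] is the gain applied at time step t
    return gains


def teacher_rollout(x0, gains, a, b):
    """Roll out the locally optimal controller u_t = -K_t * x_t from x0."""
    pairs, x = [], x0
    for k in gains:
        u = -k * x
        pairs.append((x, u))
        x = a * x + b * u
    return pairs


def fit_linear_policy(pairs):
    """Least-squares fit of a single global policy u = -k * x."""
    num = sum(x * u for x, u in pairs)
    den = sum(x * x for x, _ in pairs)
    return -num / den


# Hypothetical toy dynamics and cost.
a, b, q, r = 1.0, 1.0, 1.0, 1.0
gains = lqr_gains(a, b, q, r, horizon=20)

# Step (a): the trajectory optimizer acts as teacher from several start states.
data = []
for x0 in (-2.0, -1.0, 0.5, 1.5):
    data.extend(teacher_rollout(x0, gains, a, b))

# Step (b): supervised learning distills the trajectories into one policy.
k_student = fit_linear_policy(data)
print(round(k_student, 3))
```

In this toy setting the student recovers a gain close to the steady-state LQR gain, because the teacher's gains are nearly constant wherever the state is far from zero.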

The most cited extension of GPS is the 2016 paper "End-to-End Training of Deep Visuomotor Policies," which trained convolutional neural network policies on physical PR2 robots to perform manipulation tasks directly from camera pixels and joint state ^[3]. This work is widely credited as one of the first demonstrations that deep convolutional policies could be trained end-to-end for real-world robot control.

Soft actor-critic

With his PhD student Tuomas Haarnoja and others, Levine introduced soft actor-critic (SAC) in 2018 ^[11]. SAC is an off-policy deep reinforcement learning algorithm based on the maximum entropy RL framework, in which, in the authors' words, "the actor aims to maximize expected reward while also maximizing entropy. That is, to succeed at the task while acting as randomly as possible" ^[11]. The method combined sample efficiency, stable convergence, and strong performance on continuous control benchmarks, and rapidly became the default off-policy continuous-action algorithm in academic deep RL. With roughly 16,200 citations ^[4], SAC remains, alongside proximal policy optimization, one of the two most widely used baseline RL algorithms for continuous control problems.
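The maximum entropy objective quoted above can be made concrete for a single state with a handful of discrete actions. This is only an illustration of the objective, under the assumption of a fixed table of hypothetical Q-values and a temperature `alpha`; SAC itself works with continuous actions and neural network function approximators.

```python
import math


def soft_value(q_values, policy, alpha):
    """Entropy-regularized value: E_pi[Q(a)] + alpha * H(pi)."""
    return sum(p * (q - alpha * math.log(p))
               for q, p in zip(q_values, policy) if p > 0.0)


def boltzmann_policy(q_values, alpha):
    """Policy with pi(a) proportional to exp(Q(a) / alpha)."""
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - m) / alpha) for q in q_values]
    z = sum(weights)
    return [w / z for w in weights]


q = [1.0, 0.5, -0.2]   # hypothetical action values at one state
alpha = 0.5            # hypothetical temperature

pi_soft = boltzmann_policy(q, alpha)
pi_greedy = [1.0, 0.0, 0.0]

# The stochastic policy scores higher under the entropy-regularized objective.
print(soft_value(q, pi_soft, alpha) > soft_value(q, pi_greedy, alpha))  # True
```

The greedy policy collects the highest reward but no entropy bonus, while the Boltzmann policy trades a little reward for entropy and maximizes the combined objective; its value equals `alpha * log(sum(exp(Q / alpha)))`.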

Conservative Q-learning and offline RL

With Aviral Kumar, Aurick Zhou, and George Tucker, Levine introduced Conservative Q-Learning (CQL) in 2020 ^[12]. CQL addresses the central pathology of offline reinforcement learning (also called batch RL): when an RL agent learns from a fixed dataset, the standard Bellman update systematically overestimates the values of actions that the dataset does not cover, leading to disastrous policies at deployment. CQL adds a regularizer to the standard Q-function update that pushes Q-values down on out-of-distribution actions, producing a conservative lower bound on the true policy value. The method became one of the most widely used baselines and reference algorithms in the offline RL literature, alongside the closely related implicit Q-learning (IQL) and behavior-regularized methods that the Levine group also helped develop.
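A minimal sketch of the regularizer, assuming a tabular Q-function with discrete actions and one common log-sum-exp form of the penalty. The Q-values, step size, and function names are hypothetical, and the full method adds this term to the usual Bellman error rather than using it alone.

```python
import math


def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def cql_penalty(q_row, data_action):
    """logsumexp_a Q(s, a) - Q(s, a_data); nonnegative for any q_row."""
    return logsumexp(q_row) - q_row[data_action]


def cql_penalty_grad(q_row, data_action):
    """Gradient of the penalty w.r.t. each Q(s, a): softmax minus one-hot."""
    lse = logsumexp(q_row)
    grad = [math.exp(v - lse) for v in q_row]
    grad[data_action] -= 1.0
    return grad


# Hypothetical Q-values at one state; only action 0 appears in the dataset,
# and the uncovered action 2 currently has the highest (overestimated) value.
q_row = [2.0, 0.5, 3.0]
data_action = 0
step = 0.5  # hypothetical step size

grad = cql_penalty_grad(q_row, data_action)
updated = [v - step * g for v, g in zip(q_row, grad)]
print([round(v, 3) for v in updated])
```

One gradient step on the penalty lowers the values of the actions the dataset does not cover, most strongly the one with the largest value, and raises the value of the action that was actually observed.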

Levine and his students also co-authored the widely read survey "Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems" (2020), which became a standard entry point to the subfield. They have been among the most prominent advocates of offline RL as a practical paradigm for robot learning, arguing that pretraining on previously collected experience and then fine-tuning online is far more sample-efficient than learning everything from scratch in the real world.

Meta-learning and imitation learning

With Chelsea Finn (then his PhD student) and Pieter Abbeel, Levine co-authored the 2017 paper "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks" (MAML) ^[13], one of the most cited papers in modern meta-learning with roughly 19,800 citations and his single most-cited work ^[4]. MAML proposed an optimization-based meta-learning algorithm that learns an initialization for a neural network such that a few gradient steps on a new task produce strong performance, with no architectural assumptions about the model. MAML has been widely adopted in few-shot classification, reinforcement learning, and robot learning.
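The idea of learning an initialization that adapts well after a gradient step can be sketched with a one-parameter model. The example below is an illustrative assumption: each hypothetical task has the quadratic loss 0.5 * (theta - target)^2, a single inner gradient step is used, and the meta-gradient is written out by hand with the chain rule instead of using automatic differentiation.

```python
def loss_grad(theta, target):
    """Gradient of the task loss 0.5 * (theta - target)^2."""
    return theta - target


def maml_meta_grad(theta, targets, inner_lr):
    """Average gradient of the post-adaptation loss w.r.t. the initialization."""
    total = 0.0
    for c in targets:
        adapted = theta - inner_lr * loss_grad(theta, c)  # one inner step
        # Chain rule through the inner step: d(adapted)/d(theta) = 1 - inner_lr
        # for this quadratic loss (this is the second-order term in MAML).
        total += (1.0 - inner_lr) * loss_grad(adapted, c)
    return total / len(targets)


targets = [-1.0, 0.0, 4.0]                 # hypothetical tasks
theta, inner_lr, outer_lr = 10.0, 0.1, 0.5  # hypothetical hyperparameters

for _ in range(200):
    theta -= outer_lr * maml_meta_grad(theta, targets, inner_lr)

print(round(theta, 4))
```

For these symmetric quadratic losses the meta-learned initialization settles at the mean of the task targets, the point from which one gradient step gives the lowest average loss; with neural networks the same procedure is applied to all weights.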

Levine's group has also been a major contributor to modern imitation learning, including goal-conditioned behavior cloning (GCBC), inverse reinforcement learning, and large-scale demonstration-based robot pretraining. Many of these techniques later fed into the design of vision-language-action models such as RT-1, RT-2, and pi-0.

Scalable robotic RL: QT-Opt

In 2018 Levine and collaborators at Google introduced QT-Opt, a scalable, distributed Q-learning system for vision-based robotic grasping. QT-Opt was trained on more than 580,000 real grasp attempts collected across a fleet of robotic arms and reached a 96 percent grasp success rate on previously unseen objects, a result frequently cited as evidence that off-policy RL could scale to real-world, vision-based manipulation. The work demonstrated that combining large-scale real-robot data collection with off-policy reinforcement learning could produce closed-loop grasping policies that generalize to novel objects.

Vision-language-action models

With collaborators at Google DeepMind, Levine co-authored RT-1 ("Robotics Transformer for Real-World Control at Scale") in 2022 ^[14] and RT-2 ("Vision-Language-Action Models Transfer Web Knowledge to Robotic Control") in 2023 ^[15]. RT-1 demonstrated that a transformer trained on a large multi-task robot dataset could control real robotic arms across hundreds of tasks. RT-2 went further, fine-tuning a pretrained vision-language model so that it could output discretized action tokens, allowing it to inherit semantic generalization from internet-scale data.
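Outputting "discretized action tokens" means mapping each continuous action dimension to an integer that a language model can emit like any other token. The sketch below assumes uniform binning into 256 bins over a fixed range of [-1, 1]; the bin count, range, and function names are illustrative assumptions and are not stated in this article.

```python
def action_to_token(value, low, high, num_bins=256):
    """Map a continuous action value in [low, high] to an integer bin index."""
    clipped = min(max(value, low), high)
    fraction = (clipped - low) / (high - low)
    return min(int(fraction * num_bins), num_bins - 1)


def token_to_action(token, low, high, num_bins=256):
    """Map a bin index back to the center of its bin."""
    width = (high - low) / num_bins
    return low + (token + 0.5) * width


# Hypothetical 3-dimensional action, each dimension in [-1, 1].
action = [0.03, -0.51, 0.998]
tokens = [action_to_token(v, -1.0, 1.0) for v in action]
decoded = [token_to_action(t, -1.0, 1.0) for t in tokens]
print(tokens)  # prints [131, 62, 255]
```

Decoding returns the bin center, so the round-trip error per dimension is at most half a bin width.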

At Physical Intelligence, this line of work culminated in the pi-0 model released in October 2024 and described in the paper "pi-0: A Vision-Language-Action Flow Model for General Robot Control" (arXiv:2410.24164) ^[8]. The paper describes "a novel flow matching architecture built on top of a pre-trained vision-language model (VLM) to inherit Internet-scale semantic knowledge" ^[8]. pi-0 was trained on data from many different robot platforms, including single-arm robots, dual-arm robots, and mobile manipulators, and was demonstrated on long-horizon manipulation tasks the paper lists as "laundry folding, table cleaning, and assembling boxes" ^[8]. The follow-up pi-0.5 model, released in 2025 and described in arXiv:2504.16054 ("pi-0.5: a Vision-Language-Action Model with Open-World Generalization") ^[9], extended pi-0 with co-training on heterogeneous task data, including web data and high-level semantic prediction, in order to obtain better open-world generalization.
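Flow matching, the generative technique named in the pi-0 paper, can be sketched in one dimension. The code below is a generic illustration under stated assumptions, not the pi-0 architecture: it uses a straight-line path from noise to data, a single hypothetical target action, and an exact "oracle" velocity field in place of a trained network conditioned on images and language. The time convention and step count are also assumptions and may differ from those in the paper.

```python
import random


def flow_matching_pair(noise, data, t):
    """Point on the straight path from noise (t=0) to data (t=1), and the
    velocity a network would be regressed onto at that point."""
    x_t = (1.0 - t) * noise + t * data
    target_velocity = data - noise
    return x_t, target_velocity


def oracle_velocity(x, t, data):
    """Exact velocity field when every training sample equals `data`."""
    return (data - x) / (1.0 - t)


def sample(velocity_fn, noise, steps=10):
    """Euler integration of dx/dt = v(x, t) from t=0 to t=1."""
    x, dt = noise, 1.0 / steps
    for i in range(steps):
        x += dt * velocity_fn(x, i * dt)
    return x


random.seed(0)
target_action = 0.7                # hypothetical 1-D action
start = random.gauss(0.0, 1.0)     # sample from the noise distribution
result = sample(lambda x, t: oracle_velocity(x, t, target_action), start)
print(abs(result - target_action) < 1e-9)  # True
```

Training would regress a network onto `target_velocity` at randomly drawn `t`; at inference time a few integration steps carry a noise sample to an action, which is what makes the approach attractive for producing continuous robot commands.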

Foundation models for robotics

Levine has been a vocal advocate of the broader thesis that robotics needs its own foundation models, trained on broad and diverse data and then specialized to individual platforms. His group at Berkeley participated in the Open X-Embodiment effort, a multi-institution dataset and model release that aggregated robot demonstration data from more than twenty academic and industry labs to train general cross-embodiment policies. The work demonstrated that policies trained on combined cross-platform data outperformed policies trained on data from any single platform, providing empirical support for the foundation-model paradigm in robot learning.

Notable papers

Year	Title	Venue	Approx. citations
2013	Guided Policy Search	ICML	1,400+
2015	Trust Region Policy Optimization (co-author with J. Schulman et al.)	ICML	12,000+
2016	End-to-End Training of Deep Visuomotor Policies	JMLR	4,500+
2017	Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks (with C. Finn, P. Abbeel)	ICML	19,800+
2018	Soft Actor-Critic: Off-Policy Maximum Entropy Deep RL with a Stochastic Actor (with T. Haarnoja et al.)	ICML	16,200+
2018	Soft Actor-Critic Algorithms and Applications	arXiv	4,500+
2018	QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation	CoRL	1,300+
2020	Conservative Q-Learning for Offline Reinforcement Learning (with A. Kumar et al.)	NeurIPS	2,800+
2020	Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems	arXiv	2,000+
2021	Implicit Behavioral Cloning	CoRL	700+
2022	RT-1: Robotics Transformer for Real-World Control at Scale	RSS	1,000+
2023	RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control	CoRL	800+
2023	Open X-Embodiment: Robotic Learning Datasets and RT-X Models (multi-institution)	ICRA	600+
2024	pi-0: A Vision-Language-Action Flow Model for General Robot Control	Physical Intelligence preprint (arXiv:2410.24164)	250+
2025	pi-0.5: a Vision-Language-Action Model with Open-World Generalization	arXiv:2504.16054	growing

Citation counts are approximate and based on Google Scholar and Semantic Scholar as of mid-2026 ^[4]; values change over time. The MAML, SAC, and TRPO figures are taken directly from the per-paper counts on Levine's Google Scholar profile ^[4].

Teaching

Levine teaches CS285 (formerly CS294-112), "Deep Reinforcement Learning," at Berkeley ^[18]. The course is one of the most widely watched graduate machine learning courses outside the classroom: each year's lecture videos are uploaded to YouTube, and the course materials are open-licensed on the official course website. The syllabus covers:

Module	Topics
Foundations	Markov decision processes, value functions, policy evaluation
Policy gradients	REINFORCE, actor-critic, baselines, natural and trust region methods
Q-learning	DQN, double DQN, distributional RL
Actor-critic	DDPG, TD3, soft actor-critic
Model-based RL	Dynamics modeling, MPC, latent-state world models
Exploration	Intrinsic motivation, count-based exploration
Inverse RL and imitation	Behavioral cloning, GAIL, MaxEnt IRL
Offline RL	BCQ, CQL, IQL, behavior-regularized methods
Meta-RL and multi-task RL	MAML, PEARL
Foundation models for control	Vision-language-action models, scaling laws

In addition to CS285, Levine has co-taught and lectured in tutorials at NeurIPS, ICML, and CoRL on topics including offline RL, learned models for control, and foundation models for robotics.

Notable students and collaborators

The RAIL Lab at Berkeley has produced a large number of researchers who hold faculty positions or senior industry research roles. The table below covers a representative sample (jointly advised students are listed here when Levine was the primary or co-primary advisor).

Person	Role at graduation	Current position
Chelsea Finn	PhD, UC Berkeley (jointly with Pieter Abbeel), 2018	Associate professor, Stanford; co-founder, Physical Intelligence
Tuomas Haarnoja	PhD, UC Berkeley, 2018	Senior research scientist, Google DeepMind
Aviral Kumar	PhD, UC Berkeley, 2023	Assistant professor, Carnegie Mellon University; research scientist at Google DeepMind
Karol Hausman	Postdoc / collaborator (PhD from USC, 2018)	CEO and co-founder, Physical Intelligence
Abhishek Gupta	PhD, UC Berkeley, 2021	Assistant professor, University of Washington
Karl Pertsch	Postdoc, UC Berkeley	Research scientist, Physical Intelligence
Justin Fu	PhD, UC Berkeley	Research scientist, Google DeepMind
Coline Devin	PhD, UC Berkeley	Research scientist, Google DeepMind
Marvin Zhang	PhD, UC Berkeley	Industry research
Anusha Nagabandi	PhD, UC Berkeley	Co-founder, Covariant AI
Vitchyr Pong	PhD, UC Berkeley	Industry research
Kelvin Xu	PhD, UC Berkeley	Industry research

The RAIL group also collaborates extensively with Pieter Abbeel's BAIR group on shared infrastructure and joint students, and historically with Trevor Darrell, Ken Goldberg, and other Berkeley robotics and computer vision faculty.

Recognition and awards

Levine's honors span early-career and mid-career recognition in machine learning and robotics, capped by the 2024 Presidential Early Career Award for Scientists and Engineers (PECASE), described by the U.S. government as the highest honor it bestows on scientists and engineers in the early stages of their careers ^[16].

Year	Award
2014	Stanford Computer Science PhD (dissertation "Motor Skill Learning with Local Trajectory Methods")
2016	NSF CAREER Award
2018	Okawa Foundation Research Grant
2019	Sloan Research Fellowship
2024	Presidential Early Career Award for Scientists and Engineers (PECASE)

Levine has also been recognized through best paper and best paper finalist nominations at venues including ICRA, ICLR, NeurIPS, and CoRL, as well as on community lists of top-cited or most-influential authors in artificial intelligence.

Selected works

The following bibliography lists representative works that span the major lines of research in Levine's career. They are referenced inline in the article above by year and short title.

Sergey Levine and Vladlen Koltun. "Guided Policy Search." Proceedings of the 30th International Conference on Machine Learning (ICML), 2013.
Sergey Levine. "Motor Skill Learning with Local Trajectory Methods." PhD dissertation, Stanford University, 2014.
Sergey Levine, Chelsea Finn, Trevor Darrell, and Pieter Abbeel. "End-to-End Training of Deep Visuomotor Policies." Journal of Machine Learning Research, 2016.
Chelsea Finn, Pieter Abbeel, and Sergey Levine. "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks." ICML, 2017.
Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, and Sergey Levine. "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor." ICML, 2018.
Dmitry Kalashnikov, Alex Irpan, Peter Pastor, et al. (with Sergey Levine). "QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation." CoRL, 2018.
Aviral Kumar, Aurick Zhou, George Tucker, and Sergey Levine. "Conservative Q-Learning for Offline Reinforcement Learning." NeurIPS, 2020.
Sergey Levine, Aviral Kumar, George Tucker, and Justin Fu. "Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems." arXiv:2005.01643, 2020.
Anthony Brohan et al. "RT-1: Robotics Transformer for Real-World Control at Scale." arXiv:2212.06817 / Robotics: Science and Systems, 2023.
Anthony Brohan et al. "RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control." arXiv:2307.15818, 2023.
Kevin Black, Noah Brown, Danny Driess, Adnan Esmail, Michael Equi, Chelsea Finn, Karol Hausman, Brian Ichter, Sergey Levine, et al. "pi-0: A Vision-Language-Action Flow Model for General Robot Control." arXiv:2410.24164, 2024.
Physical Intelligence. "pi-0.5: a Vision-Language-Action Model with Open-World Generalization." arXiv:2504.16054, 2025.

References

1. "Sergey Levine," EECS at UC Berkeley faculty homepage, https://www2.eecs.berkeley.edu/Faculty/Homepages/svlevine.html
2. "Sergey Levine," personal homepage, UC Berkeley EECS, https://people.eecs.berkeley.edu/~svlevine/
3. Sergey Levine, "Motor Skill Learning with Local Trajectory Methods," PhD dissertation, Stanford University, 2014, https://people.eecs.berkeley.edu/~svlevine/papers/thesis.pdf
4. "Sergey Levine," Google Scholar profile (255,563 citations; h-index 204; i10-index 610 as of mid-2026), https://scholar.google.com/citations?user=8R35rCwAAAAJ
5. "Robotics AI startup Physical Intelligence raises $400M in Bezos, Thrive-led round," PitchBook News, November 4, 2024, https://pitchbook.com/news/articles/robotics-ai-physical-intelligence-bezos
6. "Jeff Bezos and OpenAI invest in robot startup Physical Intelligence at $2.4 billion valuation," CNBC, November 4, 2024, https://www.cnbc.com/2024/11/04/jeff-bezos-and-openai-invest-in-robot-startup-physical-intelligence.html
7. "Robotics Startup Physical Intelligence Valued at $5.6 Billion in New Funding," Bloomberg, November 20, 2025, https://www.bloomberg.com/news/articles/2025-11-20/robotics-startup-physical-intelligence-valued-at-5-6-billion-in-new-funding
8. Kevin Black et al., "pi-0: A Vision-Language-Action Flow Model for General Robot Control," arXiv:2410.24164, October 31, 2024, https://arxiv.org/abs/2410.24164
9. Physical Intelligence, "pi-0.5: a Vision-Language-Action Model with Open-World Generalization," arXiv:2504.16054, 2025, https://arxiv.org/abs/2504.16054
10. "pi-0: Our First Generalist Policy," Physical Intelligence blog, October 2024, https://www.pi.website/blog/pi0
11. Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine, "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor," arXiv:1801.01290, ICML 2018, https://arxiv.org/abs/1801.01290
12. Aviral Kumar, Aurick Zhou, George Tucker, Sergey Levine, "Conservative Q-Learning for Offline Reinforcement Learning," arXiv:2006.04779, NeurIPS 2020, https://arxiv.org/abs/2006.04779
13. Chelsea Finn, Pieter Abbeel, Sergey Levine, "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks," arXiv:1703.03400, ICML 2017, https://arxiv.org/abs/1703.03400
14. Anthony Brohan et al., "RT-1: Robotics Transformer for Real-World Control at Scale," arXiv:2212.06817, 2022, https://arxiv.org/abs/2212.06817
15. Anthony Brohan et al., "RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control," arXiv:2307.15818, 2023, https://arxiv.org/abs/2307.15818
16. "White House honors engineering faculty with early career awards," Berkeley Engineering, January 2025, https://engineering.berkeley.edu/news/2025/01/white-house-honors-engineering-faculty-with-early-career-awards-2/
17. "EECS professors win Sloan Fellowship," Berkeley Engineering, February 2019, https://engineering.berkeley.edu/news/2019/02/eecs-professors-win-sloan-fellowship/
18. CS 285 Deep Reinforcement Learning course website, UC Berkeley, https://rail.eecs.berkeley.edu/deeprlcourse/
19. "Sergey Levine Co-Founds Physical Intelligence: Pioneering AI-Powered Robots," ODSC / Open Data Science, 2024, https://opendatascience.com/sergey-levine-co-founds-physical-intelligence-pioneering-ai-powered-robots/
20. "Robotics startup Physical Intelligence raises $600M at $5.6B valuation," The Robot Report, November 21, 2025, https://www.therobotreport.com/physical-intelligence-raises-600m-advance-robot-foundation-models/
