Yoshua Bengio
Last reviewed
May 24, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v6 · 7,249 words
| Yoshua Bengio | |
|---|---|
| Born | March 5, 1964 (age 62), Paris, France |
| Nationality | Canadian, French |
| Education | B.Eng. Computer Engineering, McGill University (1986); M.Sc. Computer Science, McGill University (1988); Ph.D. Computer Science, McGill University (1991) |
| Doctoral advisor | Renato De Mori (McGill University) |
| Known for | Deep learning, neural probabilistic language models, attention mechanism, GFlowNets, AI safety advocacy |
| Institutions | Universite de Montreal (1993-present), Mila (founder and scientific advisor), LawZero (co-president and scientific director) |
| Awards | Turing Award (2018), Queen Elizabeth Prize for Engineering (2025), Princess of Asturias Award (2022), VinFuture Grand Prize (2024), Killam Prize (2019), Officer of the Order of Canada, Knight of the Legion of Honor of France |
| Notable work | Deep Learning (textbook, 2016), "A Neural Probabilistic Language Model" (2003), "Neural Machine Translation by Jointly Learning to Align and Translate" (2014), "Learning long-term dependencies with gradient descent is difficult" (1994) |
Yoshua Bengio (born March 5, 1964) is a Canadian computer scientist of Moroccan-Sephardic origin, widely recognized as one of the pioneers of deep learning and artificial intelligence. He is a Full Professor in the Department of Computer Science and Operations Research at the Universite de Montreal, the founder and scientific advisor of Mila (the Quebec Artificial Intelligence Institute), and the co-president and scientific director of LawZero, a nonprofit focused on safe AI development.[1] Along with Geoffrey Hinton and Yann LeCun, he received the 2018 ACM A.M. Turing Award for "conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing."[2]
Bengio is the most-cited computer scientist in the world by total citations and by h-index, and the most-cited living scientist across all fields by total citations.[1] In late October 2025, his Google Scholar profile passed one million career citations, a threshold previously crossed only by French philosopher Michel Foucault among scholars tracked by the service.[3] His research contributions span neural probabilistic language models, the attention mechanism for neural machine translation, generative adversarial networks, denoising autoencoders, and generative flow networks (GFlowNets). Since 2023, he has become one of the most prominent academic voices calling for the regulation and technical alignment of frontier AI, chairing the International AI Safety Report and founding LawZero to pursue a "safe-by-design" research program centred on a non-agentic "Scientist AI."[4][5]
Yoshua Bengio was born on March 5, 1964, in Paris, France, to a Sephardic Jewish family that had emigrated from Morocco. His father, Carlo Bengio, was a pharmacist and playwright who ran a Sephardic theatre company in Montreal that performed pieces in Judeo-Arabic. His mother, Celia Moreno, was an actress in the 1970s Moroccan theatre scene led by Tayeb Seddiki; she later studied economics in Paris and co-founded l'Ecran humain, a multimedia theatre troupe, in Montreal in 1980. Both parents had rejected aspects of their traditional Moroccan Jewish upbringings and embraced the 1960s counterculture's emphasis on personal freedom and social solidarity, and their interests leaned more toward the arts than the sciences.[6] During the late 1960s, the family spent a year in Morocco while Bengio's father completed military service.[6]
The family relocated to Montreal, Canada, when Yoshua was around twelve years old. He and his younger brother Samy, who would also become a prominent computer scientist (now senior director of AI and machine learning research at Apple), shared an early fascination with computing. The two brothers pooled money they had earned from newspaper delivery routes to purchase an Apple II and an Atari 800, machines that sparked their lifelong careers.[6] Yoshua taught himself the BASIC programming language on the Atari and later moved to assembly programming, while Samy concentrated on hardware.[6]
Bengio completed all of his higher education at McGill University in Montreal. He earned a Bachelor of Engineering in computer engineering in 1986, followed by a Master of Science in computer science in 1988, and a Doctor of Philosophy in computer science in 1991.[7] His doctoral research, supervised by Renato De Mori, focused on artificial neural networks and hidden Markov models for speech recognition, a topic at the time considered unfashionable by much of the academic computer science community.[7] De Mori had been working on speech recognition for years and was beginning to move from classical AI methods to statistical and connectionist approaches, which made him a rare match for Bengio's interest in neural networks during the second AI winter.[6]
From 1991 to 1992 Bengio was a postdoctoral fellow at the Massachusetts Institute of Technology, working in Michael I. Jordan's group on statistical learning, recurrent neural networks and hidden Markov models.[7] He then spent 1992 to 1993 as a postdoctoral fellow at AT&T Bell Laboratories in Holmdel, New Jersey, where he collaborated with Larry Jackel, Yann LeCun, Vladimir Vapnik and others. His work at Bell Labs combined neural networks with probabilistic models of sequences and contributed to an automatic bank-cheque-reading system later deployed commercially by AT&T.[7] Bengio joined the Universite de Montreal as a faculty member in September 1993, where he has remained for more than three decades.[8]
Bengio has been a professor in the Department of Computer Science and Operations Research (Departement d'informatique et de recherche operationnelle, DIRO) at the Universite de Montreal since 1993, becoming a Full Professor in 2002.[8] Throughout the 1990s and 2000s, he continued to develop and advocate for neural network approaches at a time when many in the AI research community had turned away from them in favor of other machine learning methods such as support vector machines and kernel methods. His persistence, alongside that of Hinton and LeCun, was instrumental in the eventual resurgence of neural networks and the deep learning revolution that would reshape the field.[2]
Bengio holds the Canada CIFAR AI Chair and co-directs CIFAR's Learning in Machines and Brains program (formerly Neural Computation and Adaptive Perception) together with Yann LeCun, a program credited as a foundational platform for the development of deep learning in Canada.[9] He served as program chair for NeurIPS 2008 and general chair for NeurIPS 2009, and in 2012 he co-founded the International Conference on Learning Representations (ICLR) with Yann LeCun. The first ICLR was held in Scottsdale, Arizona in May 2013, and the conference introduced an open peer-review model that has since become a reference for the field.[10]
In 1993, alongside his arrival at the Universite de Montreal, Bengio founded the research group originally known as the Laboratoire d'informatique des systemes adaptatifs (LISA), which would grow into Mila over the following decades.[11] The lab's early efforts centred on algorithms for adaptive systems, recurrent neural networks for sequential data, and energy-based models for unsupervised learning. In 2017, the lab was restructured as Mila, the Montreal Institute for Learning Algorithms, formed as a partnership among the Universite de Montreal, McGill University, Polytechnique Montreal, and HEC Montreal, and acquired non-profit status in 2018. The institute later adopted the broader brand name Mila, Quebec Artificial Intelligence Institute.[11] By 2022, Mila comprised roughly 1,000 students and researchers and over 100 affiliated faculty members, making it one of the world's largest academic concentrations of deep learning expertise.[11]
Bengio served as Mila's scientific director from its founding until March 28, 2025, when he transitioned to a newly created role of "Founder and Scientific Advisor" so that he could devote more time to AI safety research and international governance work.[12] Laurent Charlin served as interim scientific director until September 2, 2025, when Hugo Larochelle, a former head of Google DeepMind's Montreal lab and adjunct professor at the Universite de Montreal, was appointed Mila's permanent scientific director, succeeding Bengio.[13] Bengio continues at Mila as a Core Academic Member, member of its Scientific Council, and Canada CIFAR AI Chair.[12]
Mila has been a training ground for many of the researchers and engineers who went on to shape the modern AI industry at organizations like Google, Meta, Anthropic, OpenAI, and numerous startups, and is recognized as one of the world's largest academic research centres in deep learning and AI.[11]
In October 2016, Bengio co-founded Element AI, a Montreal-based artificial intelligence incubator, together with Jean-Francois Gagne, Anne Martel, Nicolas Chapados, Philippe Beaudoin and Jean-Sebastien Cournoyer of Real Ventures.[14] Element AI raised roughly US$102 million in a 2017 Series A round led by Data Collective with participation from Microsoft Ventures, Intel Capital and others, and became one of the highest-profile Canadian AI start-ups before being acquired by ServiceNow in November 2020. Bengio remained an advisor to ServiceNow following the acquisition.[14]
Bengio has also been a vocal architect of the broader Montreal and Quebec AI ecosystem. He contributed to the drafting of the Montreal Declaration for the Responsible Development of Artificial Intelligence, an ethics charter launched in 2018, and has frequently been described in the Canadian press as a co-architect of the federal Pan-Canadian AI Strategy administered through CIFAR.[15]
Bengio's research output has been exceptionally broad and influential. The table below summarizes several of his most important contributions.
| Contribution | Year(s) | Key collaborators | Significance |
|---|---|---|---|
| Vanishing gradients in RNNs | 1994 | Patrice Simard, Paolo Frasconi | Formalized why gradient descent fails on long-term dependencies in recurrent networks |
| Neural probabilistic language model | 2003 | Rejean Ducharme, Pascal Vincent, Christian Jauvin | Introduced word embeddings as distributed representations; foundational for later neural language models |
| Denoising autoencoders | 2008 | Pascal Vincent, Hugo Larochelle, Pierre-Antoine Manzagol | Advanced unsupervised feature learning through corruption-based training |
| Curriculum learning | 2009 | Jerome Louradour, Ronan Collobert, Jason Weston | Proposed training neural networks on progressively harder examples |
| "Learning Deep Architectures for AI" | 2009 | (single-author monograph) | Survey that defined the modern deep-learning research agenda |
| Why unsupervised pretraining helps | 2010 | Erhan, Courville, Manzagol, Vincent, S. Bengio | Empirical and theoretical analysis of how pretraining shapes optimization |
| Attention mechanism for NMT | 2014-2015 | Dzmitry Bahdanau, Kyunghyun Cho | Enabled neural networks to focus on relevant parts of input sequences; precursor to the Transformer architecture |
| Generative Adversarial Networks | 2014 | Ian Goodfellow (lead), Aaron Courville, others | Co-authored the original GAN paper at NeurIPS 2014 |
| Nature Deep Learning review | 2015 | Yann LeCun, Geoffrey Hinton | Defining short review article in Nature signalling the maturity of the field |
| Deep Learning textbook | 2016 | Ian Goodfellow, Aaron Courville | The definitive textbook on deep learning, published by MIT Press |
| Generative Flow Networks (GFlowNets) | 2021-present | Emmanuel Bengio, Moksh Jain, Nikolay Malkin, others | A framework for diverse candidate generation combining ideas from reinforcement learning and probabilistic modelling |
| Scientist AI | 2025 | Michael Cohen, Soren Mindermann, others | Proposed non-agentic, Bayesian alternative to autonomous AI agents |
One of Bengio's earliest influential papers, "Learning Long-Term Dependencies with Gradient Descent Is Difficult," appeared in IEEE Transactions on Neural Networks in 1994, co-authored with his Bell Labs collaborator Patrice Simard and with Paolo Frasconi.[16] The paper formally analysed why backpropagation through time fails to train recurrent networks on tasks that require remembering information across many time steps: the gradient of the loss with respect to early time-step parameters either decays exponentially toward zero (the vanishing gradient problem) or explodes, depending on the spectral radius of the recurrent weight matrix. This result became a standard reference for subsequent work on gated architectures, including the Long Short-Term Memory network by Hochreiter and Schmidhuber and the later gated recurrent unit (GRU), which was itself developed in Bengio's lab with Cho and Bahdanau.[16]
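The dependence on the spectral radius can be illustrated with a small numerical sketch. It is not taken from the 1994 paper: it assumes a purely linear recurrence h_t = W h_(t-1), for which the Jacobian of the final state with respect to the initial state is simply W raised to the number of time steps, and it uses a hypothetical 2x2 scaled rotation matrix so that the spectral radius is known exactly.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def frobenius_norm(m):
    return math.sqrt(sum(x * x for row in m for x in row))

def jacobian_norm(w, steps):
    """Norm of d h_T / d h_0 = W**steps for the linear recurrence h_t = W h_(t-1)."""
    n = len(w)
    jac = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        jac = matmul(w, jac)
    return frobenius_norm(jac)

def scaled_rotation(radius, angle=0.3):
    """2x2 matrix whose eigenvalues have modulus `radius` (illustrative choice)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[radius * c, -radius * s], [radius * s, radius * c]]

# Spectral radius below one: the gradient signal shrinks geometrically.
shrinking = jacobian_norm(scaled_rotation(0.9), steps=50)
# Spectral radius above one: the gradient signal grows geometrically.
growing = jacobian_norm(scaled_rotation(1.1), steps=50)
print(shrinking, growing)
```

With a nonlinear activation the Jacobian also contains the derivative of the activation at each step, so this linear case should be read only as the simplest instance of the effect.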
In 2003, Bengio, along with Rejean Ducharme, Pascal Vincent, and Christian Jauvin, published "A Neural Probabilistic Language Model" in the Journal of Machine Learning Research. The paper introduced the concept of learning distributed representations for words, now known as word embeddings, and using a neural network to predict the next word in a sequence based on the context provided by its predecessors.[17]
Before this work, language models relied primarily on n-gram statistics, which suffered from the curse of dimensionality: as vocabulary sizes and context lengths grew, the number of possible word combinations exploded, and most sequences in test data had never been seen during training. Bengio's approach addressed this by mapping each word to a continuous vector in a lower-dimensional space (typically a few dozen to a few hundred dimensions) and using a feed-forward neural network to compute the probability of the next word given the embedded context. Words that occurred in similar contexts ended up with similar vectors, so the model could generalize from seen sequences to similar unseen ones.[17]
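The forward pass described above can be sketched in a few lines. The following is an illustrative, untrained toy rather than the published model: the vocabulary, dimensions, and random weights are all hypothetical, and bias terms and the optional direct input-to-output connections are omitted for brevity.

```python
import math
import random

def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def next_word_probs(context_ids, embeddings, hidden_w, output_w):
    """Probability of each vocabulary word given the preceding context words."""
    # Look up each context word's vector and concatenate them.
    x = [value for idx in context_ids for value in embeddings[idx]]
    # One tanh hidden layer, then a softmax over the whole vocabulary.
    hidden = [math.tanh(a) for a in matvec(hidden_w, x)]
    return softmax(matvec(output_w, hidden))

rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

vocab = ["the", "cat", "dog", "sat", "ran"]      # hypothetical toy vocabulary
embed_dim, context_len, hidden_dim = 3, 2, 4     # illustrative sizes
embeddings = rand_matrix(len(vocab), embed_dim)  # one learned vector per word
hidden_w = rand_matrix(hidden_dim, context_len * embed_dim)
output_w = rand_matrix(len(vocab), hidden_dim)

context = [vocab.index("the"), vocab.index("cat")]
probs = next_word_probs(context, embeddings, hidden_w, output_w)
print(dict(zip(vocab, probs)))
```

In training, the embedding table and both weight matrices would be adjusted jointly by gradient descent on the log-likelihood of the observed next words, which is what pulls words used in similar contexts toward similar vectors.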
This paper is now regarded as one of the foundational works that led to Word2Vec (2013), GloVe, ELMo, and the contextual embeddings used in modern large language models such as BERT and the GPT series. The paper has more than 14,000 Google Scholar citations and is routinely cited as the origin of neural language modelling.[17]
Beginning in 2006, Bengio's group at the Universite de Montreal published a series of papers showing that deep multilayer networks could be trained successfully by first performing greedy layer-wise unsupervised pretraining using restricted Boltzmann machines or autoencoders, and then fine-tuning with supervised gradient descent.[18] This work, alongside parallel results from Geoffrey Hinton's group in Toronto, helped to overcome the optimization difficulties that had previously limited the depth of practical neural networks and was a major proximate cause of the deep-learning resurgence later in the decade.[2]
In 2008, Pascal Vincent, Hugo Larochelle, Bengio and Pierre-Antoine Manzagol introduced the denoising autoencoder in "Extracting and Composing Robust Features with Denoising Autoencoders" at ICML, training networks to reconstruct inputs from corrupted versions, and stacking the resulting layers into deep architectures.[19] In 2009, Bengio also published the single-author monograph "Learning Deep Architectures for AI" in Foundations and Trends in Machine Learning, which laid out the principles, motivations and open questions of the field and which has accumulated thousands of citations as a foundational survey.[20] The 2010 JMLR paper "Why Does Unsupervised Pre-training Help Deep Learning?" with Dumitru Erhan, Aaron Courville, Pierre-Antoine Manzagol, Pascal Vincent and Samy Bengio examined the effect of pretraining on optimization and generalization, framing it as a form of regularization that constrains the function space accessible to subsequent supervised training.[21]
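The denoising criterion can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: it uses masking noise, a single tied-weight sigmoid layer without biases, a squared-error loss, and hypothetical sizes and random weights.

```python
import math
import random

def corrupt(x, drop_prob, rng):
    """Masking noise: set each input to zero with probability drop_prob."""
    return [0.0 if rng.random() < drop_prob else v for v in x]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def reconstruct(x_noisy, weights):
    """Tied-weight autoencoder: encode with W, decode with W transposed."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x_noisy))) for row in weights]
    return [sigmoid(sum(weights[k][i] * hidden[k] for k in range(len(weights))))
            for i in range(len(x_noisy))]

def denoising_loss(x_clean, weights, drop_prob, rng):
    x_noisy = corrupt(x_clean, drop_prob, rng)
    x_hat = reconstruct(x_noisy, weights)
    # The target is the clean input, not the corrupted one.
    return sum((a - b) ** 2 for a, b in zip(x_clean, x_hat))

rng = random.Random(0)
x = [1.0, 0.0, 1.0, 1.0, 0.0, 1.0]                       # hypothetical input
weights = [[rng.uniform(-0.5, 0.5) for _ in x] for _ in range(3)]
loss = denoising_loss(x, weights, drop_prob=0.3, rng=rng)
print(loss)
```

Minimizing this loss over many corrupted copies of the data forces the hidden layer to capture dependencies between inputs, since a missing value can only be filled in from the values that remain.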
In September 2014, Bengio's student Dzmitry Bahdanau, along with Kyunghyun Cho and Bengio, submitted the paper "Neural Machine Translation by Jointly Learning to Align and Translate" to arXiv. The paper was presented at ICLR 2015 and introduced the attention mechanism, which allowed neural machine translation models to selectively focus on different parts of the source sentence when generating each word of the translation.[22]
Prior to this work, encoder-decoder models for machine translation compressed the entire source sentence into a fixed-length vector, which created an information bottleneck, especially for longer sentences. The attention mechanism solved this problem by letting the decoder look back at all positions in the source sentence and assign different weights to each position, depending on the current step of the translation. Mathematically, given a sequence of encoder hidden states h_j and a current decoder state s_i, the model computes alignment scores e_ij = a(s_i, h_j), normalizes them with softmax to produce attention weights alpha_ij, and forms a context vector c_i = sum_j alpha_ij h_j that is passed to the decoder.[22]
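The three steps above (scores, softmax weights, context vector) can be written out directly. The sketch below assumes the additive scoring form a(s, h) = v . tanh(W_s s + W_h h); the parameter names w_s, w_h and v, the dimensions, and the random values are hypothetical and untrained.

```python
import math
import random

def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def additive_attention(s_i, encoder_states, w_s, w_h, v):
    """Return attention weights alpha_ij and the context vector c_i."""
    proj_s = matvec(w_s, s_i)
    scores = []
    for h_j in encoder_states:
        proj_h = matvec(w_h, h_j)
        # e_ij = v . tanh(W_s s_i + W_h h_j)
        scores.append(sum(v_k * math.tanh(a + b)
                          for v_k, a, b in zip(v, proj_s, proj_h)))
    alphas = softmax(scores)
    # c_i = sum_j alpha_ij * h_j
    dim = len(encoder_states[0])
    context = [sum(alpha * h_j[d] for alpha, h_j in zip(alphas, encoder_states))
               for d in range(dim)]
    return alphas, context

rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

state_dim, enc_dim, attn_dim = 3, 4, 5      # illustrative sizes
w_s = rand_matrix(attn_dim, state_dim)
w_h = rand_matrix(attn_dim, enc_dim)
v = [rng.uniform(-1.0, 1.0) for _ in range(attn_dim)]
s_i = [rng.uniform(-1.0, 1.0) for _ in range(state_dim)]
encoder_states = rand_matrix(6, enc_dim)    # six source positions

alphas, context = additive_attention(s_i, encoder_states, w_s, w_h, v)
print(alphas, context)
```

Because the weights are recomputed at every decoding step, the context vector is no longer a single fixed summary of the source sentence, which is how the mechanism avoids the bottleneck described above.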
This innovation proved to be far more consequential than its authors initially realized. The attention mechanism became the central building block of the Transformer architecture introduced by Vaswani et al. in the 2017 paper "Attention Is All You Need," which in turn forms the basis of virtually all modern large language models, including GPT, BERT, and their successors.[22]
Bengio is one of the co-authors of the original Generative Adversarial Networks paper, "Generative Adversarial Nets," presented at NeurIPS 2014. The paper was led by his then-PhD student Ian Goodfellow, with Bengio, Aaron Courville, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley and Sherjil Ozair as co-authors, and proposed a two-player minimax game between a generator and a discriminator network.[23] According to a much-recounted anecdote, the core idea was sketched out by Goodfellow at the Montreal bar Les 3 Brasseurs while celebrating a fellow PhD student's graduation, then coded and verified the same night. The framework became one of the dominant approaches to generative modelling for nearly a decade before being partially supplanted by diffusion models.[23]
In May 2015, LeCun, Bengio and Hinton co-authored a short review article titled simply "Deep Learning" in Nature. The article provided an accessible overview of how multilayer networks discover representations of data, describing applications in image recognition, speech recognition and natural language processing, and is widely cited as the moment the field's flagship journal acknowledged deep learning as a mature scientific area.[24]
In 2016, Bengio co-authored the textbook Deep Learning with Ian Goodfellow and Aaron Courville, published by MIT Press. The book provides a comprehensive and mathematically rigorous treatment of deep learning, covering linear algebra, probability theory, information theory, numerical computation, machine learning basics, deep feedforward networks, regularization, optimization, convolutional neural networks, recurrent neural networks, and more advanced topics such as autoencoders, representation learning, and generative models.[25]
The textbook quickly became the standard reference in the field and is widely used in university courses worldwide. It is freely available online at deeplearningbook.org, a decision that has contributed to its reach and impact on the training of a new generation of AI researchers.[25]
Starting in 2021, Bengio and his collaborators introduced Generative Flow Networks (GFlowNets), a class of generative models that learn to sample composite objects in proportion to a given reward function. GFlowNets draw on ideas from reinforcement learning, deep generative models, and energy-based probabilistic modelling. Unlike standard reinforcement learning, which typically seeks a single reward-maximizing solution, GFlowNets are designed to produce a diverse set of high-reward candidates, with the marginal sampling probability of any complete object proportional to its terminal reward.[26]
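The proportionality property can be checked on a toy problem small enough to solve by enumeration. The sketch below is not a trained GFlowNet: it assumes a tree-structured construction process that builds two-bit strings one bit at a time, a hypothetical reward table, and exact flows computed by summation, whereas an actual GFlowNet learns to approximate such flows with a neural network.

```python
# Hypothetical terminal rewards for every complete object (two-bit strings).
rewards = {"00": 1.0, "01": 2.0, "10": 3.0, "11": 4.0}

def flow(prefix):
    """Total reward of all complete objects reachable from this partial object."""
    return sum(r for x, r in rewards.items() if x.startswith(prefix))

def forward_prob(prefix, bit):
    """Policy: choose the next bit in proportion to the flow it leads to."""
    return flow(prefix + bit) / flow(prefix)

def terminal_prob(x):
    """Probability that the sequential policy ends at the complete object x."""
    p = 1.0
    for i in range(len(x)):
        p *= forward_prob(x[:i], x[i])
    return p

total = flow("")  # normalizing constant: the sum of all rewards
probs = {x: terminal_prob(x) for x in rewards}
for x in rewards:
    print(x, probs[x], rewards[x] / total)
```

Each object is sampled with probability equal to its reward divided by the total, so lower-reward candidates still appear at a predictable rate instead of being discarded in favour of the single best one.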
The foundational paper, "GFlowNet Foundations," was co-authored by Bengio along with Salem Lahlou, Tristan Deleu, Edward J. Hu, Mo Tiwari, and Emmanuel Bengio, and was published in the Journal of Machine Learning Research in 2023.[27]
GFlowNets have been applied to a variety of problems in scientific discovery, including the design of novel anti-microbial peptides, the discovery of DNA sequences with high binding activity, and the search for proteins with desirable fluorescence properties. They have also been applied to causal discovery and Bayesian structure learning, in part as a building block for Bengio's later AI-safety work on the Scientist AI framework.[26]
In March 2019, the Association for Computing Machinery (ACM) announced that Yoshua Bengio, Geoffrey Hinton, and Yann LeCun would share the 2018 A.M. Turing Award "for conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing." The award, often described as the "Nobel Prize of computing," carries a US$1 million prize, which the three laureates split equally.[2]
The ACM cited the trio's independent yet complementary work spanning decades. Bengio's contributions were recognized particularly for his work on sequence-to-sequence learning, neural machine translation, attention mechanisms, and distributed representations of words, as well as for his broader efforts to establish the mathematical and conceptual foundations of deep learning.[2] He delivered his Turing Lecture at the Heidelberg Laureate Forum on September 23, 2019.[28]
The three researchers are commonly known as the "Godfathers of AI" or "Godfathers of Deep Learning," a title that reflects their decades-long commitment to neural network research through periods when the approach was considered a dead end by much of the mainstream AI community.
The following table lists a selection of Bengio's major awards and honors.
| Award | Year | Awarding body | Notes |
|---|---|---|---|
| Marie-Victorin Prize | 2017 | Government of Quebec | Quebec's top honour for pure and applied science |
| Officer of the Order of Canada | 2017 | Government of Canada | For contributions to AI and deep learning |
| A.M. Turing Award | 2018 | Association for Computing Machinery | Shared with Geoffrey Hinton and Yann LeCun |
| IEEE Neural Networks Pioneer Award | 2019 | IEEE Computational Intelligence Society | |
| Killam Prize in Natural Sciences | 2019 | Canada Council for the Arts | One of Canada's most prestigious research prizes |
| Fellow of the Royal Society of London | 2020 | Royal Society | |
| Princess of Asturias Award for Technical and Scientific Research | 2022 | Princess of Asturias Foundation | Shared with Hinton, LeCun and Demis Hassabis |
| Knight of the Legion of Honor | 2022 | Republic of France | For contributions to science |
| VinFuture Grand Prize | 2024 | VinFuture Foundation | Shared with Hinton, LeCun, Huang, and Fei-Fei Li for deep-learning breakthroughs |
| Queen Elizabeth Prize for Engineering | 2025 | QEPrize Foundation | Shared with Hinton, LeCun, Hopfield, Huang, Dally, and Li; carries a 500,000 GBP prize |
| Honorary doctorate (Doctor of Science) | 2025 | McGill University | Awarded at Spring Convocation 2025 |
| Canada CIFAR AI Chair | Ongoing | CIFAR | For leadership in AI research |
| Fellow of the Royal Society of Canada | 2014 | Royal Society of Canada | |
In 2022, Bengio shared the Princess of Asturias Award for Technical and Scientific Research with Geoffrey Hinton, Yann LeCun and Demis Hassabis. The Spanish award jury cited the four for the development of deep learning, the techniques behind contemporary advances in speech recognition, computer vision and natural language processing.[29]
In December 2024, Bengio was one of five recipients of the VinFuture Grand Prize, a US$3 million award given by the Vietnamese VinFuture Foundation. He shared the prize with Hinton, LeCun, Jensen Huang and Fei-Fei Li for "transformational contributions to the advancement of deep learning."[30]
In February 2025, Bengio was named one of seven recipients of the Queen Elizabeth Prize for Engineering, recognised for seminal contributions to the advancement of modern machine learning. He shared the 500,000 GBP prize with Geoffrey Hinton, John Hopfield, Yann LeCun, Jensen Huang, Bill Dally, and Fei-Fei Li. The prize was presented by King Charles III at St James's Palace in May 2025.[31]
On October 24, 2025, the Universite de Montreal announced that Bengio had passed one million career citations on Google Scholar, making him the first computer scientist and the first living scientist of any field to cross that threshold according to the service. The only other researcher tracked by Google Scholar with more than a million citations is the French philosopher and historian Michel Foucault, who is deceased. By the time of the announcement Bengio's profile listed roughly 1.1 million citations and an h-index above 240.[3]
Since approximately 2023, Bengio has become one of the most vocal and prominent advocates for AI safety, regulation, and international governance. His shift in focus from pure research to safety and policy has been widely noted in the scientific and mainstream press.[32]
On March 29, 2023, Bengio became one of the most prominent signatories of "Pause Giant AI Experiments: An Open Letter" published by the Future of Life Institute, which called on "all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4." The letter, which ultimately collected more than 30,000 signatures including Stuart Russell, Elon Musk, Steve Wozniak, and Yuval Noah Harari, framed the pause as a window in which to develop independent oversight, audit mechanisms, and provenance standards.[33]
In May 2023, Bengio was among the initial signatories of the Statement on AI Risk issued by the Center for AI Safety (CAIS). The one-sentence statement reads: "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war." Other signatories included Geoffrey Hinton, the CEOs of OpenAI, Google DeepMind, and Anthropic, and hundreds of AI researchers.[34]
On July 25, 2023, Bengio testified before the U.S. Senate Judiciary Subcommittee on Privacy, Technology and the Law on regulation of AI. He told senators that "there is a significant probability that superhuman AI is just a few years away, outpacing our ability to comprehend the various risks and establish sufficient guardrails."[35] Bengio organized his recommendations around four levers: access (limiting who can use the most powerful systems), alignment (ensuring AI behaves according to human intentions), raw intellectual power (constraining compute and algorithmic capability), and scope of actions (restricting how AI agents can affect the world). He also argued that the AI industry cannot be expected to regulate itself and pointed to Canada's Artificial Intelligence and Data Act (AIDA), then under consideration in Parliament, as a possible model.[35] He subsequently participated in Senator Chuck Schumer's eighth AI Insight Forum on AI risk on December 6, 2023, where he warned that "humanity is on a trajectory of accelerating AI advances" and that the timelines for human-level AI had collapsed dramatically since 2019.[36]
On July 9, 2024, Bengio published "Reasoning through arguments against taking AI safety seriously" on his personal blog. The essay laid out and methodically rebutted eight clusters of arguments commonly used to dismiss concern about advanced AI, ranging from the view that artificial general intelligence is centuries away to the position that geopolitical competition makes safety investment counterproductive. Bengio's central contention is that "nobody currently knows how such an AGI or artificial superintelligence could be made to behave morally, or at least behave as intended," and that even moderately low probabilities of catastrophe demand serious policy responses.[37] The essay has been widely circulated in the AI-safety research community and was reprinted by LessWrong and the Future of Life Institute.[37]
In May 2024, Bengio was the lead author of "Managing extreme AI risks amid rapid progress," published in Science. The paper, co-authored with Geoffrey Hinton, Andrew Yao, Dawn Song, Stuart Russell, Daniel Kahneman, Sheila McIlraith, Soren Mindermann, David Krueger and more than 20 other researchers, argued that AI progress could outstrip current safeguards and proposed reorienting research toward proactive risk management, mandatory model evaluations, and government oversight.[38]
Following the November 2023 AI Safety Summit at Bletchley Park, Bengio was appointed chair of the International AI Safety Report, an independent scientific assessment of advanced AI risks commissioned by participating governments. He was appointed directly by the United Kingdom's Secretary of State for Science, Innovation and Technology and, in the same period, was named to the United Nations Scientific Advisory Board for Independent Advice on Breakthroughs in Science and Technology.[39] An interim version was delivered at the AI Seoul Summit on May 22, 2024, and the first full report (arXiv:2501.17805) was published on January 29, 2025, in advance of the AI Action Summit in Paris.[40] The report was the work of 96 contributors and an Expert Advisory Panel nominated by 30 countries plus the UN, OECD, and EU.[40] It catalogued misuse risks such as AI-enabled cyber-attacks, disinformation, sexualised deepfakes and uplift to bioweapon development; malfunction risks such as bias and loss of control; and systemic risks such as labour displacement and infrastructure concentration.[40] An interim Key Update appeared in October 2025, and the second full edition was released on February 3, 2026, written by more than 100 independent experts and again chaired by Bengio. The 2026 edition documented rapid capability gains in agentic and reasoning models, growing concentration of AI infrastructure, and a rising volume of incidents involving deepfakes and other AI-generated harms.[4][41]
Bengio has consistently argued that the AI industry cannot be trusted to regulate itself. He has called for mandatory safety testing for frontier AI models before deployment, the establishment of international regulatory bodies analogous to the International Atomic Energy Agency, and restrictions on the development of autonomous AI agents that could take consequential actions in the real world without adequate human oversight.[38] In testimony to the Canadian House of Commons industry committee, he urged rapid passage of the Artificial Intelligence and Data Act portion of Bill C-27, called for the legislation's definition of "high-impact" systems to include those posing national-security and societal threats, and proposed a mandatory national registry of companies developing powerful AI models that would require disclosures before training is even complete.[42]
In a June 2025 interview, Bengio expressed concern that some advanced AI systems were already beginning to display traits such as deception, reward hacking, and situational awareness, describing these as early indicators of goal misalignment and potentially dangerous behaviours.[43]
On June 3, 2025, Bengio launched LawZero, a nonprofit research organization incubated at Mila and dedicated to developing "safe-by-design" AI systems.[5][44] The launch was accompanied by approximately US$30 million in initial philanthropic funding from donors including the Gates Foundation, Schmidt Sciences, the Future of Life Institute, Open Philanthropy (Coefficient Giving), Founders Pledge, the Silicon Valley Community Foundation, and Skype founding engineer Jaan Tallinn.[45] Bengio is co-president and scientific director of LawZero, while Sam Ramadori, formerly chief executive of BrainBox AI, serves as co-president and executive director.[46]
LawZero's research program is built around the concept of "Scientist AI," a class of systems designed to be non-agentic, memoryless, and stateless. Rather than building AI systems that act in the world and pursue goals autonomously, Bengio's team is working on systems trained to understand the world and provide truthful probabilistic answers, akin to "a selfless idealized and platonic scientist."[5] In Bengio's formulation, a Scientist AI maintains a world model and an inference machine that together compute Bayesian posterior probabilities over hypotheses, given evidence, with the chain of reasoning treated as a structured latent variable. The Scientist AI can act as a guardrail on other AI agents by estimating the probability that a proposed action would cause harm and refusing to authorize actions whose harm probability exceeds a chosen threshold.[5][47]
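The guardrail idea described above can be illustrated with a toy calculation. The following is a minimal sketch, not LawZero's implementation: the `Hypothesis` class, the function names, and every number are hypothetical and chosen only for illustration. It assumes a small, explicitly enumerated set of hypotheses, applies Bayes' rule to obtain a posterior given the evidence, marginalizes the probability of harm over that posterior, and refuses any action whose estimated harm probability exceeds a chosen threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """One candidate explanation of the world (illustrative, not from the paper)."""
    name: str
    prior: float                # P(h)
    evidence_likelihood: float  # P(evidence | h)
    harm_given_action: float    # P(harm | action, h)


def posterior(hypotheses):
    """Bayes' rule: P(h | evidence) is proportional to P(evidence | h) * P(h)."""
    weights = [h.prior * h.evidence_likelihood for h in hypotheses]
    total = sum(weights)
    if total <= 0:
        raise ValueError("evidence has zero probability under every hypothesis")
    return [w / total for w in weights]


def harm_probability(hypotheses):
    """Marginalize harm over the posterior: sum of P(harm | action, h) * P(h | evidence)."""
    post = posterior(hypotheses)
    return sum(p * h.harm_given_action for p, h in zip(post, hypotheses))


def authorize(hypotheses, threshold):
    """Authorize the action only if the estimated harm probability is within the threshold."""
    return harm_probability(hypotheses) <= threshold


# Made-up numbers: the evidence favours the "risky" reading of the situation,
# so the posterior shifts toward it even though its prior is low.
example = [
    Hypothesis("benign", prior=0.9, evidence_likelihood=0.2, harm_given_action=0.01),
    Hypothesis("risky", prior=0.1, evidence_likelihood=0.8, harm_given_action=0.5),
]

if __name__ == "__main__":
    print(f"estimated harm probability: {harm_probability(example):.4f}")
    print("authorized at threshold 0.05:", authorize(example, 0.05))
    print("authorized at threshold 0.20:", authorize(example, 0.20))
```

With these illustrative numbers the estimated harm probability is about 0.16, so the action is refused at a threshold of 0.05 and authorized at 0.20. A real system would not enumerate hypotheses by hand; the sketch only shows how a posterior over hypotheses and a threshold combine into an authorize-or-refuse decision.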
The foundational paper, "Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?" (arXiv:2502.15657), was submitted on February 21, 2025, with Bengio as lead author alongside Michael Cohen, Damiano Fornasiere, Joumana Ghosn, Pietro Greiner, Matt MacDermott, Soren Mindermann, Adam Oberman, Jesse Richardson, Oliver Richardson, Marc-Antoine Rondeau, Pierre-Luc St-Charles, and David Williams-King.[47] The paper argues that current commercial trajectories toward superintelligent autonomous agents pose catastrophic risks and that non-agentic Scientist AIs are a tractable alternative that can both accelerate scientific discovery and serve as oversight over more powerful agents. The design borrows heavily from Bengio's GFlowNet work, which provides a mechanism for sampling diverse latent reasoning chains weighted by their posterior probability.[47]
On January 15, 2026, LawZero announced a high-profile board and global advisory council made up of seven leaders from technology, government and academia. The board is chaired by Maria Eitel, founder of the Nike Foundation and Chair Emeritus of the Girl Effect, and includes Mariano-Florentino "Tino" Cuellar, president of the Carnegie Endowment for International Peace and former justice of the Supreme Court of California, with historian Yuval Noah Harari among the global advisors.[48] On March 19, 2026, former New Zealand Prime Minister Dame Jacinda Ardern joined LawZero's Global Advisory Council.[49]
Bengio has described LawZero as an attempt to demonstrate that it is possible to build powerful AI systems that are inherently safer than current approaches, without sacrificing usefulness.[5]
As of mid-2026, Bengio divides his time between his professorship at the Universite de Montreal, his role as founder and scientific advisor at Mila, his leadership of LawZero, and his ongoing work chairing the International AI Safety Report.[1][12] His research group continues to publish actively on GFlowNets, causal reasoning, Bayesian neural networks and the theoretical and engineering foundations of safe AI.[4]
In November 2025, Bengio's Google Scholar profile passed the one-million citation milestone, a figure unmatched by any other computer scientist.[3] He remains the most-cited living scientist across all fields by total citation count.[1]
Bengio's public advocacy continues to focus on the need for proactive governance of AI. He has argued that the rapid pace of AI development, particularly in agentic and autonomous systems, creates risks that cannot be addressed after the fact. He has expressed cautious optimism that LawZero's Scientist AI approach could offer a technical path toward AI systems that are both powerful and controllable, while emphasizing that technical solutions alone are insufficient without corresponding regulatory frameworks.[48]
Bengio has supervised more than seventy doctoral students and dozens of postdoctoral fellows over his career at the Universite de Montreal. Many have become independent researchers and leaders at academic and industrial AI laboratories. Notable former students and trainees include Ian Goodfellow (Apple, formerly Google Brain and OpenAI), Aaron Courville (Universite de Montreal, Mila), Hugo Larochelle (current Mila scientific director), Pascal Vincent (Meta AI Research, Mila), Kyunghyun Cho (New York University, Genentech), Dzmitry Bahdanau (ServiceNow Research), Nicolas Le Roux, Olivier Delalleau, Razvan Pascanu (Google DeepMind), David Krueger (University of Cambridge, then Universite de Montreal) and Soren Mindermann (LawZero and Mila), along with Bengio's son Emmanuel Bengio, a long-time GFlowNet collaborator. Bengio is also a mentor in the MATS AI-safety research program.[50]
Bengio is a private individual who rarely discusses his personal life publicly. He lives in Montreal, Quebec, Canada. His younger brother, Samy Bengio, is a senior director of AI and machine learning research at Apple, where he leads the Machine Learning Research team. Samy was a Distinguished Scientist at Google Brain for nearly fourteen years before resigning in 2021 in the wake of the high-profile dismissal of researcher Timnit Gebru. The two brothers have collaborated on research, co-authored papers (including the 2010 JMLR analysis of unsupervised pretraining), and given joint interviews about their parallel careers.[6][51]
| Title | Year | Venue | Co-authors |
|---|---|---|---|
| "Learning Long-Term Dependencies with Gradient Descent Is Difficult" | 1994 | IEEE Transactions on Neural Networks | Simard, Frasconi |
| "A Neural Probabilistic Language Model" | 2003 | Journal of Machine Learning Research | Ducharme, Vincent, Jauvin |
| "Extracting and Composing Robust Features with Denoising Autoencoders" | 2008 | ICML | Vincent (lead), Larochelle, Manzagol |
| "Curriculum Learning" | 2009 | ICML | Louradour, Collobert, Weston |
| "Learning Deep Architectures for AI" | 2009 | Foundations and Trends in Machine Learning | (single author) |
| "Why Does Unsupervised Pre-training Help Deep Learning?" | 2010 | Journal of Machine Learning Research | Erhan (lead), Courville, Manzagol, Vincent, S. Bengio |
| "Generative Adversarial Nets" | 2014 | NeurIPS | Goodfellow (lead), Pouget-Abadie, Mirza, Xu, Warde-Farley, Ozair, Courville |
| "Neural Machine Translation by Jointly Learning to Align and Translate" | 2014 (arXiv), 2015 (ICLR) | ICLR 2015 | Bahdanau, Cho |
| "Deep Learning" | 2015 | Nature | LeCun (lead), Hinton |
| Deep Learning (textbook) | 2016 | MIT Press | Goodfellow, Courville |
| "GFlowNet Foundations" | 2023 | Journal of Machine Learning Research | Lahlou, Deleu, Hu, Tiwari, E. Bengio |
| "Managing Extreme AI Risks amid Rapid Progress" | 2024 | Science | Hinton, Yao, Song, Russell, Kahneman, et al. |
| International AI Safety Report 2025 | 2025 | arXiv:2501.17805 | 95 co-authors, Expert Advisory Panel of 30 nations |
| "Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?" | 2025 | arXiv:2502.15657 | Cohen, Mindermann, Fornasiere, Ghosn, et al. |
| International AI Safety Report 2026 | 2026 | DSIT / International AI Safety Report | Over 100 co-authors |