Yoshua Bengio (born March 5, 1964) is a Canadian computer scientist of Moroccan-French origin, widely recognized as one of the pioneers of deep learning and artificial intelligence. He is a Full Professor in the Department of Computer Science and Operations Research at the Université de Montréal, the founder and scientific advisor of Mila (the Quebec Artificial Intelligence Institute), and the co-president and scientific director of LawZero, a nonprofit focused on safe AI development.[1] Along with Geoffrey Hinton and Yann LeCun, he received the 2018 ACM A.M. Turing Award for conceptual and engineering breakthroughs that have made deep neural networks a critical component of modern computing.[2]
Bengio is the world's most-cited computer scientist by both total citations and h-index, and, by total citations, the most-cited living scientist across all fields. In November 2025, he became the first AI researcher to surpass one million citations on Google Scholar.[3] His research contributions span neural probabilistic language models, the attention mechanism for neural machine translation, generative adversarial networks, denoising autoencoders, and generative flow networks (GFlowNets). In recent years, he has become one of the most prominent voices calling for regulation and safety measures in AI development.
Yoshua Bengio was born on March 5, 1964, in Paris, France, to a Sephardic Jewish family that had emigrated from Morocco. His father, Carlo Bengio, was a pharmacist and playwright who ran a Sephardic theater company in Montreal that performed pieces in Judeo-Arabic. His mother, Celia Moreno, was an actress in the 1970s Moroccan theater scene led by Tayeb Seddiki; she later studied economics in Paris and co-founded l'Écran humain, a multimedia theater troupe, in Montreal in 1980. Both parents were shaped by the countercultural movements of the 1960s, and their interests leaned more toward the arts than the sciences.[4]
The family relocated to Montreal, Canada, when Yoshua was around twelve years old. He and his younger brother Samy, who would also become a prominent computer scientist (now senior director of AI and ML research at Apple), shared an early fascination with computing. The two brothers pooled money they had earned from newspaper delivery routes to purchase an Apple II and an Atari 800, machines that sparked a lifelong passion for computing.[4]
Bengio completed all of his higher education at McGill University in Montreal. He earned a Bachelor of Engineering in computer engineering in 1986, followed by a Master of Science in computer science in 1988, and a Doctor of Philosophy in computer science in 1991.[5] His doctoral research focused on artificial neural networks, a field that was at the time considered unfashionable by much of the academic computer science community.
After completing postdoctoral work at MIT and AT&T Bell Laboratories, Bengio joined the Université de Montréal as a professor in 1993, where he has remained for more than three decades.[6]
Bengio has been a Full Professor of Computer Science at the Université de Montréal since 1993. Throughout the 1990s and 2000s, he continued to develop and advocate for neural network approaches at a time when many in the AI research community had turned away from them in favor of other machine learning methods such as support vector machines and kernel methods. His persistence, alongside that of Hinton and LeCun, was instrumental in the eventual resurgence of neural networks and the deep learning revolution that would reshape the field.
Bengio holds a Canada CIFAR AI Chair, recognizing his leadership in AI research within the Canadian academic ecosystem.[7]
Bengio founded Mila, originally known as the Montreal Institute for Learning Algorithms, which grew into the Quebec Artificial Intelligence Institute. Under his direction, Mila became one of the world's largest academic research centers in deep learning and AI, attracting hundreds of researchers and graduate students from around the globe. The institute has been a training ground for many of the researchers and engineers who went on to shape the modern AI industry at organizations like Google, Meta, and numerous startups.
Bengio served as Mila's scientific director for many years. By 2025, he had transitioned to the role of scientific advisor, reflecting his shift in focus toward AI safety research and the founding of LawZero.[8]
Bengio's research output has been exceptionally broad and influential. The table below summarizes several of his most important contributions.
| Contribution | Year(s) | Key Collaborators | Significance |
|---|---|---|---|
| Neural probabilistic language model | 2003 | Réjean Ducharme, Pascal Vincent, Christian Jauvin | Introduced word embeddings as distributed representations; foundational for all later language models |
| Attention mechanism for NMT | 2014-2015 | Dzmitry Bahdanau, Kyunghyun Cho | Enabled neural networks to focus on relevant parts of input sequences; precursor to the Transformer architecture |
| Curriculum learning | 2009 | Jérôme Louradour, Ronan Collobert, Jason Weston | Proposed training neural networks on progressively harder examples |
| Denoising autoencoders | 2008 | Pascal Vincent, Hugo Larochelle, others | Advanced unsupervised feature learning through corruption-based training |
| Generative adversarial networks (GANs) | 2014 | Ian Goodfellow and others | Co-authored the original paper introducing adversarial training of generative models |
| Deep Learning textbook | 2016 | Ian Goodfellow, Aaron Courville | The definitive textbook on deep learning, published by MIT Press |
| Generative Flow Networks (GFlowNets) | 2021-present | Emmanuel Bengio, Moksh Jain, Nikolay Malkin, others | A novel framework for diverse candidate generation combining ideas from reinforcement learning and probabilistic modeling |
In 2003, Bengio, along with Réjean Ducharme, Pascal Vincent, and Christian Jauvin, published "A Neural Probabilistic Language Model" in the Journal of Machine Learning Research. This paper introduced the concept of learning distributed representations for words, now known as word embeddings, and using a neural network to predict the next word in a sequence based on the context provided by its predecessors.[9]
Before this work, language models relied primarily on n-gram statistics, which suffered from the curse of dimensionality: as vocabulary sizes and context lengths grew, the number of possible word combinations exploded, and most sequences in test data had never been seen during training. Bengio's approach addressed this by mapping each word to a continuous vector in a lower-dimensional space, allowing the model to generalize from seen sequences to similar unseen ones.
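The architecture described above can be sketched in a few lines. This is an illustrative NumPy forward pass, not the original implementation; all layer sizes and weight values here are toy assumptions.

```python
# Minimal sketch of the 2003 neural probabilistic language model: each
# context word is mapped to a learned embedding, the embeddings are
# concatenated, and a small network outputs a distribution over the
# next word. Sizes are illustrative, and weights are random (untrained).
import numpy as np

rng = np.random.default_rng(0)
V, d, n, h = 50, 16, 3, 32   # vocab size, embedding dim, context length, hidden units

C = rng.normal(0, 0.1, (V, d))       # word embedding table (the paper's C matrix)
H = rng.normal(0, 0.1, (n * d, h))   # hidden layer weights
U = rng.normal(0, 0.1, (h, V))       # output weights

def next_word_probs(context):
    """P(w_t | previous n words), for a context of n word indices."""
    x = C[context].reshape(-1)        # concatenate the n context embeddings
    a = np.tanh(x @ H)                # hidden representation
    logits = a @ U
    e = np.exp(logits - logits.max()) # softmax over the vocabulary
    return e / e.sum()

p = next_word_probs([3, 17, 42])
print(p.shape)   # (50,) -- one probability per vocabulary word
```

Because the context is represented by continuous vectors rather than discrete word identities, words with similar embeddings yield similar predictions, which is how the model generalizes to word sequences never seen in training.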
This paper is now regarded as one of the foundational works that led to Word2Vec (2013), GloVe, and eventually the contextual embeddings used in modern large language models such as BERT and the GPT series.
In September 2014, Bengio's student Dzmitry Bahdanau, along with Kyunghyun Cho and Bengio, submitted the paper "Neural Machine Translation by Jointly Learning to Align and Translate" to arXiv. The paper was presented at ICLR 2015 and introduced the attention mechanism, which allowed neural machine translation models to selectively focus on different parts of the source sentence when generating each word of the translation.[10]
Prior to this work, encoder-decoder models for machine translation compressed the entire source sentence into a fixed-length vector, which created an information bottleneck, especially for longer sentences. The attention mechanism solved this problem by letting the decoder look back at all positions in the source sentence and assign different weights to each position, depending on the current step of the translation.
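The weighting step described above can be illustrated with the additive scoring function of the Bahdanau-style attention. This is a hedged sketch of one decoder step with toy dimensions and random weights, not the parameters of the 2015 paper.

```python
# One attention step: score every encoder position against the current
# decoder state, normalize the scores with a softmax, and take the
# weighted sum of encoder states as the context vector.
import numpy as np

rng = np.random.default_rng(1)
T, enc_dim, dec_dim, attn_dim = 6, 8, 8, 10   # source length and toy sizes

h = rng.normal(size=(T, enc_dim))    # encoder states, one per source word
s = rng.normal(size=dec_dim)         # current decoder hidden state
W_a = rng.normal(size=(dec_dim, attn_dim))
U_a = rng.normal(size=(enc_dim, attn_dim))
v_a = rng.normal(size=attn_dim)

# Additive score: e_j = v_a . tanh(W_a s + U_a h_j), one per source position
scores = np.tanh(s @ W_a + h @ U_a) @ v_a
weights = np.exp(scores - scores.max())
weights /= weights.sum()             # attention weights over the source, sum to 1

context = weights @ h                # weighted sum of encoder states
print(weights.shape, context.shape)  # (6,) (8,)
```

The decoder recomputes these weights at every output step, so each translated word can draw on a different part of the source sentence instead of a single fixed-length summary.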
This innovation proved to be far more consequential than its authors initially realized. The attention mechanism became the central building block of the Transformer architecture introduced by Vaswani et al. in the 2017 paper "Attention Is All You Need," which in turn forms the basis of virtually all modern large language models, including GPT, BERT, and their successors.
In 2016, Bengio co-authored the textbook Deep Learning with Ian Goodfellow and Aaron Courville, published by MIT Press. The book provides a comprehensive and mathematically rigorous treatment of deep learning, covering linear algebra, probability theory, information theory, numerical computation, machine learning basics, deep feedforward networks, regularization, optimization, convolutional neural networks, recurrent neural networks, and more advanced topics such as autoencoders, representation learning, and generative models.[11]
The textbook quickly became the standard reference in the field and is widely used in university courses worldwide. It is freely available online at deeplearningbook.org, a decision that has contributed to its enormous reach and impact on the training of a new generation of AI researchers.
Starting in 2021, Bengio and his collaborators introduced Generative Flow Networks (GFlowNets), a novel class of generative models that learn to sample composite objects proportionally to a given reward function. GFlowNets draw on ideas from reinforcement learning, deep generative models, and energy-based probabilistic modeling. Unlike standard reinforcement learning, which seeks to find the single highest-reward solution, GFlowNets are designed to produce a diverse set of high-reward candidates.[12]
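The sampling objective above can be illustrated on a toy enumerable space. The rewards below are made-up values, and real GFlowNets learn a sequential constructive policy rather than enumerating candidates; this sketch only shows the target distribution they aim for.

```python
# Toy illustration of the GFlowNet objective (not its training algorithm):
# sample each candidate x with probability proportional to its reward R(x),
# so multiple high-reward modes are covered rather than only the argmax.
import numpy as np

rng = np.random.default_rng(2)
rewards = np.array([1.0, 4.0, 4.0, 0.5, 0.5])   # hypothetical R(x) for 5 objects

p = rewards / rewards.sum()                     # target: p(x) proportional to R(x)
samples = rng.choice(len(rewards), size=10_000, p=p)

counts = np.bincount(samples, minlength=len(rewards)) / len(samples)
print(np.round(counts, 2))   # close to [0.1, 0.4, 0.4, 0.05, 0.05]
```

A reward-maximizing policy would collapse onto one of the two equally rewarded objects; sampling proportionally to reward keeps both, which is the diversity property that makes GFlowNets attractive for candidate generation in scientific discovery.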
The foundational paper, "GFlowNet Foundations," was co-authored by Bengio along with Salem Lahlou, Tristan Deleu, Edward J. Hu, Mo Tiwari, and Emmanuel Bengio, and was published in the Journal of Machine Learning Research in 2023.[13]
GFlowNets have been applied to a variety of problems in scientific discovery, including the design of novel anti-microbial peptides, the discovery of DNA sequences with high binding activity, and the search for proteins with desirable fluorescence properties. They have also been applied to causal discovery and Bayesian structure learning.
In March 2019, the Association for Computing Machinery (ACM) announced that Yoshua Bengio, Geoffrey Hinton, and Yann LeCun would share the 2018 A.M. Turing Award "for conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing." The award, often described as the "Nobel Prize of computing," carries a $1 million prize, which the three laureates split equally.[2]
The ACM cited the trio's independent yet complementary work spanning decades. Bengio was recognized in particular for his work on sequence-to-sequence learning, neural machine translation, attention mechanisms, and distributed representations of words, as well as for his broader efforts to establish the mathematical and conceptual foundations of deep learning.
The three researchers are commonly known as the "Godfathers of AI" or "Godfathers of Deep Learning," a title that reflects their decades-long commitment to neural network research through periods when the approach was considered a dead end by much of the mainstream AI community.
The following table lists a selection of Bengio's major awards and honors.
| Award | Year | Awarding Body | Notes |
|---|---|---|---|
| A.M. Turing Award | 2018 | Association for Computing Machinery | Shared with Geoffrey Hinton and Yann LeCun |
| Queen Elizabeth Prize for Engineering | 2025 | QEPrize Foundation | Shared with Hinton, LeCun, Hopfield, Huang, Dally, and Li; carries a 500,000 GBP prize |
| Officer of the Order of Canada | 2017 | Government of Canada | For contributions to AI and deep learning |
| Knight of the Legion of Honor | 2022 | Republic of France | For contributions to science |
| Fellow of the Royal Society of London | 2020 | Royal Society | |
| Fellow of the Royal Society of Canada | 2014 | Royal Society of Canada | |
| Canada CIFAR AI Chair | Ongoing | CIFAR | For leadership in AI research |
In February 2025, Bengio was named one of seven recipients of the Queen Elizabeth Prize for Engineering, recognized for seminal contributions to the advancement of modern machine learning. He shared the 500,000 GBP prize with Geoffrey Hinton, John Hopfield, Yann LeCun, Jensen Huang, Bill Dally, and Fei-Fei Li. The prize was presented by King Charles III at St James's Palace.[14]
Since approximately 2023, Bengio has become one of the most vocal and prominent advocates for AI safety, regulation, and international governance. His shift in focus from pure research to safety and policy has been widely noted in the scientific and mainstream press.
In May 2023, Bengio was among the initial signatories of the Statement on AI Risk issued by the Center for AI Safety (CAIS). The one-sentence statement reads: "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war." Other signatories included Geoffrey Hinton, the CEOs of OpenAI, Google DeepMind, and Anthropic, and hundreds of AI researchers.[15]
Bengio testified before the US Senate Judiciary Subcommittee on Privacy, Technology, and the Law, and took part in Senator Chuck Schumer's AI Insight Forums. In his written testimony, Bengio warned that "there is a significant probability that superhuman AI is just a few years away, outpacing our ability to comprehend the various risks and establish sufficient guardrails, particularly against the more catastrophic scenarios."[16]
He told senators that sophisticated AI systems lower the barrier to entry for malicious actors to create bioweapons, chemical weapons, and malware, and that these risks could grow more dangerous as human oversight diminishes. He called for governments to limit access to the most powerful AI systems and to ban applications that cannot be convincingly demonstrated to be safe. He highlighted Canada's Artificial Intelligence and Data Act (AIDA) as a potential model for other countries to follow.
Bengio served as chair of the International AI Safety Report, an independent scientific assessment of AI risks commissioned by participating governments following the 2023 AI Safety Summit at Bletchley Park, UK. The first full version was published in January 2025, and the 2026 edition, also chaired by Bengio, highlighted the rapid improvement of general-purpose AI capabilities, uneven global adoption rates, and the growing number of incidents related to deepfakes and other AI-generated harms.[17]
Bengio has consistently argued that the AI industry cannot be trusted to regulate itself. He has called for mandatory safety testing for frontier AI models before deployment, the establishment of international regulatory bodies analogous to the International Atomic Energy Agency, and restrictions on the development of autonomous AI agents that could take consequential actions in the real world without adequate human oversight.
In a June 2025 interview, Bengio expressed concern that some advanced AI systems were already beginning to display traits such as deception, reward hacking, and situational awareness, describing these as early indicators of goal misalignment and potentially dangerous behaviors.[18]
In June 2025, Bengio launched LawZero, a nonprofit research organization dedicated to developing "safe-by-design" AI systems. The nonprofit was backed by approximately $30 million in initial funding from the Gates Foundation, Coefficient Giving (formerly Open Philanthropy), the Future of Life Institute, and other donors focused on existential risk reduction.[19]
LawZero's research program is built around the concept of a "Scientist AI," which differs fundamentally from the agentic AI systems being developed by commercial labs. Rather than building AI systems designed to act in the world and pursue goals autonomously, Bengio's team is working on systems trained to understand the world and provide truthful answers based on transparent, probabilistic reasoning grounded in the scientific method and formal logic.[20]
In January 2026, LawZero announced the appointment of a high-profile board and global advisory council. The board includes NIKE Foundation founder Maria Eitel as chair, Mariano-Florentino Cuéllar (president of the Carnegie Endowment for International Peace), and historian Yuval Noah Harari. The advisory council further reflects the global and interdisciplinary scope of Bengio's ambitions for the organization.[21]
Bengio has described LawZero as an attempt to demonstrate that it is possible to build powerful AI systems that are inherently safer than current approaches, without sacrificing usefulness.
As of early 2026, Bengio divides his time between his professorship at the Université de Montréal, his role as scientific advisor at Mila, his leadership of LawZero, and his ongoing work chairing the International AI Safety Report. His research group continues to publish actively on GFlowNets, causal reasoning, and the theoretical foundations of safe AI.
In November 2025, Bengio reached the milestone of one million citations on Google Scholar, a figure unmatched by any other computer scientist.[3] He remains the most-cited living scientist across all fields by total citation count.
Bengio's public advocacy continues to focus on the need for proactive governance of AI. He has argued that the rapid pace of AI development, particularly in agentic systems and autonomous AI, creates risks that cannot be addressed after the fact. He has expressed cautious optimism that LawZero's "Scientist AI" approach could offer a technical path toward AI systems that are both powerful and controllable, while emphasizing that technical solutions alone are insufficient without corresponding regulatory frameworks.[21]
Bengio is a private individual who rarely discusses his personal life publicly. He lives in Montreal, Quebec, Canada. His brother, Samy Bengio, is a senior director of AI and ML research at Apple and a well-known researcher in his own right, with extensive work on neural networks, speech recognition, and machine learning.[4]
The following table lists a selection of Bengio's major publications.
| Title | Year | Venue | Co-Authors |
|---|---|---|---|
| "A Neural Probabilistic Language Model" | 2003 | Journal of Machine Learning Research | Ducharme, Vincent, Jauvin |
| "Curriculum Learning" | 2009 | ICML | Louradour, Collobert, Weston |
| "Neural Machine Translation by Jointly Learning to Align and Translate" | 2014 (arXiv), 2015 (ICLR) | ICLR 2015 | Bahdanau, Cho |
| Deep Learning (textbook) | 2016 | MIT Press | Goodfellow, Courville |
| "Generative Adversarial Nets" | 2014 | NeurIPS | Goodfellow, Pouget-Abadie, Mirza, Xu, Warde-Farley, Ozair, Courville |
| "GFlowNet Foundations" | 2023 | Journal of Machine Learning Research | Lahlou, Deleu, Hu, Tiwari, E. Bengio |