Geoffrey Everest Hinton (born December 6, 1947) is a British-Canadian computer scientist and cognitive psychologist whose work on artificial neural networks over more than four decades has shaped the field of artificial intelligence. Often called the "Godfather of AI" or the "Godfather of Deep Learning," Hinton made foundational contributions to backpropagation, Boltzmann machines, deep belief networks, and convolutional neural networks. He shared the 2018 ACM A.M. Turing Award with Yann LeCun and Yoshua Bengio for their collective work on deep learning [1], and in 2024 he was awarded the Nobel Prize in Physics alongside John Hopfield for foundational discoveries that enable machine learning with artificial neural networks [2].
Hinton spent the bulk of his academic career at the University of Toronto, where he was a professor in the Department of Computer Science from 1987, eventually holding the title of University Professor and, following his retirement, University Professor Emeritus. From 2013 to 2023, he also worked at Google Brain, the company's deep learning research division. In May 2023, he left Google so that he could speak freely about the risks posed by the AI technologies he helped create [3]. Since then, he has become one of the world's most prominent voices warning about the existential dangers of superintelligent AI.
Geoffrey Hinton was born in Wimbledon, London, into a family of remarkable intellectual distinction. His father, Howard Everest Hinton, was a prominent entomologist who became a Fellow of the Royal Society and held a professorship at the University of Bristol, where he introduced a newly recognized stage in insect metamorphosis [4]. The family's scientific lineage stretches back generations. Hinton is the great-great-grandson of George Boole, the mathematician and logician whose Boolean algebra became one of the theoretical foundations of modern computing, and of Mary Everest Boole, a self-taught mathematician and educator [4]. He is also related to George Everest, the surveyor general of India after whom Mount Everest was named: Mary Everest Boole was Everest's niece, and the connection supplied Hinton's middle name.
The connection to Boole carries a certain poetic quality: the man whose ancestor formalized the logic underlying digital computation went on to pioneer the sub-symbolic, pattern-recognition approach to intelligence that has, in many respects, surpassed traditional logic-based AI. Hinton has spoken about the pressure he felt growing up in such a family, once remarking that there was an expectation to achieve something significant. His cousin Joan Hinton was a nuclear physicist who worked on the Manhattan Project before emigrating to China [4].
Hinton's academic path was not a straight line. He enrolled at the University of Cambridge, initially studying physiology and philosophy before settling on experimental psychology. He graduated with a Bachelor of Arts in experimental psychology in 1970 [5]. At the time, he was already intrigued by how the brain represents and processes information, but the dominant paradigm in psychology and AI leaned heavily toward symbolic computation and rule-based reasoning.
After Cambridge, Hinton moved to the University of Edinburgh to pursue a PhD in artificial intelligence. His doctoral supervisor was Christopher Longuet-Higgins, a distinguished theoretical chemist who had transitioned to cognitive science and co-founded Edinburgh's Department of Machine Intelligence and Perception [6]. Longuet-Higgins favored symbolic approaches to AI, which created a productive tension with Hinton's growing conviction that neural networks held the key to understanding intelligence. Hinton received his PhD in 1978, with a thesis focused on learning in neural network models.
The choice to study neural networks during the late 1970s was contrarian. The field had been in decline since Marvin Minsky and Seymour Papert published Perceptrons in 1969, a book that highlighted the limitations of single-layer neural networks and contributed to a withdrawal of funding and interest, a period sometimes known as the first AI winter. Hinton pursued this line of research anyway.
After completing his doctorate, Hinton held postdoctoral positions at the University of Sussex and the University of California, San Diego (UCSD). His time at UCSD was formative. Working in the parallel distributed processing (PDP) group alongside David Rumelhart, James McClelland, and others, Hinton was part of a community that would fundamentally challenge the prevailing symbolic AI paradigm.
In 1982, Hinton joined the faculty of the Computer Science Department at Carnegie Mellon University (CMU) in Pittsburgh. It was during this period that several of his most influential early ideas took shape, including the development of Boltzmann machines and his work on learning algorithms for neural networks.
In 1985, Hinton, along with Terry Sejnowski and David Ackley, introduced the Boltzmann machine, a type of stochastic recurrent neural network that can learn to represent complex probability distributions over its inputs [7]. The name was borrowed from Ludwig Boltzmann, the Austrian physicist whose work in statistical mechanics provided the mathematical framework for the machine's energy-based learning rule.
Boltzmann machines used a learning procedure derived from concepts in statistical physics. Each configuration of the network's units has an associated energy, and the network learns by adjusting its weights to minimize the difference between the distribution of states it generates and the distribution observed in the training data. The learning rule was elegant but computationally expensive, requiring extensive sampling to estimate the gradient of the log-likelihood.
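The energy-based formulation can be illustrated with a short sketch. For a network of binary units with symmetric weights and biases, each joint state has an energy, and the network assigns higher probability to lower-energy states via the Boltzmann distribution. The following toy NumPy example (an illustration, not the original implementation) enumerates all states exactly, which is feasible only for tiny networks; the intractability of this enumeration at scale is precisely why the learning rule required sampling:

```python
import itertools
import numpy as np

def energy(s, W, b):
    """Energy of a binary state vector s: -1/2 s'Ws - b's.
    W is symmetric with zero diagonal, so the quadratic term
    counts each pair of units once."""
    return -0.5 * s @ W @ s - b @ s

def boltzmann_distribution(W, b):
    """Exact probabilities over all 2^n binary states (tiny n only)."""
    n = len(b)
    states = np.array(list(itertools.product([0, 1], repeat=n)), dtype=float)
    energies = np.array([energy(s, W, b) for s in states])
    p = np.exp(-energies)          # lower energy -> higher probability
    return states, p / p.sum()

# A tiny 3-unit network with hand-picked symmetric weights.
W = np.array([[0.0, 1.0, -1.0],
              [1.0, 0.0,  0.5],
              [-1.0, 0.5, 0.0]])
b = np.zeros(3)
states, probs = boltzmann_distribution(W, b)
```

Here the most probable state is the one with the lowest energy: units 1 and 2, which excite each other through the positive weight between them, are on together while unit 3 stays off.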
| Aspect | Details |
|---|---|
| Year introduced | 1985 |
| Inventors | Geoffrey Hinton, Terry Sejnowski, David Ackley |
| Type | Stochastic recurrent neural network |
| Learning rule | Based on statistical mechanics (energy minimization) |
| Key limitation | Slow learning due to required sampling |
| Later variants | Restricted Boltzmann machines (RBMs), deep Boltzmann machines |
The Boltzmann machine was explicitly mentioned in the citation for Hinton's 2024 Nobel Prize in Physics, recognizing the deep connection it established between neural network learning and statistical physics [2].
Hinton's most widely cited work came in 1986, when he, David Rumelhart, and Ronald J. Williams published "Learning representations by back-propagating errors" in the journal Nature [8]. This paper demonstrated that the backpropagation algorithm could effectively train multi-layer neural networks by propagating error signals backward from the output layer through hidden layers, adjusting weights at each layer to reduce prediction error.
Backpropagation itself was not entirely new. Seppo Linnainmaa had described reverse-mode automatic differentiation (of which backpropagation is a special case) in 1970, and Paul Werbos had proposed using it to train neural networks in his 1974 PhD thesis [9]. But the Rumelhart, Hinton, and Williams paper was the one that convinced the broader research community that training deep networks was practical. The paper showed concrete examples of networks learning useful internal representations, distributed patterns of activation across hidden units that encoded meaningful features of the input.
The publication reignited interest in neural networks and helped launch what is sometimes called the connectionist revolution of the 1980s. The paper has been cited tens of thousands of times and remains one of the most influential works in the history of computer science.
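The mechanics the 1986 paper described can be sketched in a few dozen lines of NumPy. The toy network below learns XOR, a task no single-layer perceptron can solve; the error signal at the output is propagated backward through the hidden layer via the chain rule. This is an illustrative sketch with arbitrary hyperparameters, not the original implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# XOR is not linearly separable, so a hidden layer is required.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)   # input -> hidden
W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)   # hidden -> output
lr = 1.0

def forward(X):
    h = sigmoid(X @ W1 + b1)
    return h, sigmoid(h @ W2 + b2)

_, out0 = forward(X)
initial_mse = np.mean((out0 - y) ** 2)

for _ in range(5000):
    # forward pass
    h, out = forward(X)
    # backward pass: propagate the error from output toward the input
    d_out = (out - y) * out * (1 - out)   # gradient at output pre-activation
    d_h = (d_out @ W2.T) * h * (1 - h)    # chain rule through the hidden layer
    # gradient-descent weight updates
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(0)

_, out_final = forward(X)
final_mse = np.mean((out_final - y) ** 2)
```

The hidden units here end up encoding exactly the kind of "useful internal representations" the paper emphasized: intermediate features (such as OR-like and AND-like detectors) that no single input unit provides.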
In 1987, Hinton moved to the University of Toronto, where he would remain for the rest of his academic career. He was drawn in part by the Canadian Institute for Advanced Research (CIFAR), which offered long-term funding for fundamental research without the pressure to produce immediate applications. Hinton became a fellow and later the director of CIFAR's program in "Neural Computation and Adaptive Perception" (later renamed "Learning in Machines and Brains").
CIFAR's support proved essential during the 1990s and early 2000s, a period when neural network research fell out of fashion in much of the AI community. Support vector machines, kernel methods, and other statistical learning techniques dominated machine learning conferences. Funding for neural network research dried up in many countries. Canada, through CIFAR and the Natural Sciences and Engineering Research Council (NSERC), was one of the few places where this work could continue.
Hinton briefly left Toronto from 1998 to 2001 to set up the Gatsby Computational Neuroscience Unit at University College London. After establishing the unit, he returned to Toronto, where he continued to push the boundaries of what neural networks could do.
In 2006, Hinton published "A Fast Learning Algorithm for Deep Belief Nets" with Simon Osindero and Yee-Whye Teh, a paper that is widely credited with reigniting interest in deep neural networks and launching the modern deep learning era [10].
The core insight was a greedy, layer-wise pretraining strategy. Instead of trying to train all layers of a deep network simultaneously (which was prone to vanishing gradients and poor local optima), Hinton proposed training each layer as a restricted Boltzmann machine (RBM), one layer at a time. Each RBM learned to model the distribution of activations produced by the layer below it. After this unsupervised pretraining phase, the entire network could be fine-tuned using backpropagation with labeled data.
This paper is often credited with popularizing the term "deep learning" in its modern usage. It demonstrated that deep networks could learn hierarchical representations of data, with each successive layer capturing increasingly abstract features. The paper showed results on handwritten digit recognition that surpassed existing benchmarks.
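The layer-wise procedure can be sketched with a minimal binary RBM trained by one-step contrastive divergence (CD-1), Hinton's fast approximation to the full Boltzmann-machine gradient. The toy below (an illustration under simplified assumptions, not the paper's implementation) trains one RBM on binary data and then greedily stacks a second RBM on the first one's hidden activations:

```python
import numpy as np

rng = np.random.default_rng(42)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_bernoulli(p):
    return (rng.random(p.shape) < p).astype(float)

class RBM:
    """Minimal binary restricted Boltzmann machine trained with CD-1."""
    def __init__(self, n_visible, n_hidden, lr=0.1):
        self.W = rng.normal(0, 0.01, (n_visible, n_hidden))
        self.bv = np.zeros(n_visible)
        self.bh = np.zeros(n_hidden)
        self.lr = lr

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.bh)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.bv)

    def cd1_step(self, v0):
        # Positive phase: statistics driven by the data.
        ph0 = self.hidden_probs(v0)
        h0 = sample_bernoulli(ph0)
        # Negative phase: one reconstruction step instead of full sampling.
        v1 = self.visible_probs(h0)
        ph1 = self.hidden_probs(v1)
        # Move toward data statistics, away from the model's own statistics.
        self.W += self.lr * (v0.T @ ph0 - v1.T @ ph1) / len(v0)
        self.bv += self.lr * (v0 - v1).mean(0)
        self.bh += self.lr * (ph0 - ph1).mean(0)
        return np.mean((v0 - v1) ** 2)  # reconstruction error as a proxy

# Toy data: two repeating binary patterns.
data = np.array([[1, 1, 0, 0], [0, 0, 1, 1]] * 20, dtype=float)
rbm = RBM(4, 2)
errors = [rbm.cd1_step(data) for _ in range(200)]

# Greedy stacking: a second RBM models the first layer's hidden activity.
rbm2 = RBM(2, 2)
for _ in range(200):
    rbm2.cd1_step(sample_bernoulli(rbm.hidden_probs(data)))
```

In the 2006 recipe, a stack built this way would then be fine-tuned end-to-end with backpropagation on labeled data.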
| Aspect | Details |
|---|---|
| Year | 2006 |
| Authors | Geoffrey Hinton, Simon Osindero, Yee-Whye Teh |
| Key idea | Greedy layer-wise pretraining using RBMs |
| Significance | Demonstrated effective training of deep networks; reignited the field |
| Impact | Widely credited as launching the modern "deep learning" era |
The moment that brought deep learning from academic curiosity to mainstream technology came in September 2012, when Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton entered a deep convolutional neural network called AlexNet into the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). AlexNet achieved a top-5 error rate of 15.3%, more than 10.8 percentage points better than the second-place entry, which used traditional computer vision techniques [11].
The margin of victory was so large that it convinced even skeptics that deep neural networks represented a genuine leap in capability. AlexNet's architecture, while large by the standards of the day (60 million parameters), would be considered tiny by modern standards. It used ReLU activation functions instead of the then-standard sigmoid or tanh, which helped address the vanishing gradient problem. It was also one of the first networks to exploit GPU computing for training, running on two NVIDIA GTX 580 graphics cards.
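The vanishing-gradient point is easy to demonstrate numerically. Backpropagation multiplies one per-layer derivative factor per layer; the sigmoid's derivative is at most 0.25, so the product shrinks geometrically with depth, while the ReLU's derivative is exactly 1 for any positive input. A minimal illustration (not AlexNet code):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1 - s)   # peaks at 0.25 (x = 0), decays toward 0 elsewhere

def relu_grad(x):
    return (x > 0).astype(float)   # exactly 1 for any positive input

# Backpropagating through `depth` layers multiplies `depth` such factors.
depth = 20
sig_product = sigmoid_grad(np.zeros(depth)).prod()   # 0.25**20, vanishingly small
relu_product = relu_grad(np.ones(depth)).prod()      # stays at 1.0
```

Even in the best case for the sigmoid (all pre-activations at zero), the gradient reaching the first layer of a 20-layer stack is attenuated by a factor of roughly 10^-12, whereas the ReLU path passes it through undiminished.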
Krizhevsky and Sutskever were Hinton's graduate students at the University of Toronto. The three of them co-founded a startup called DNNresearch Inc. shortly after their ImageNet victory. In March 2013, Google acquired DNNresearch for a reported $44 million, and Hinton joined Google Brain, dividing his time between Google and the University of Toronto [12].
| Aspect | Details |
|---|---|
| Year | 2012 |
| Network name | AlexNet |
| Authors | Alex Krizhevsky, Ilya Sutskever, Geoffrey Hinton |
| Competition | ImageNet Large Scale Visual Recognition Challenge (ILSVRC) |
| Top-5 error rate | 15.3% (runner-up: 26.2%) |
| Key innovations | ReLU activations, GPU training, dropout regularization |
| Consequence | Launched the deep learning revolution in industry and academia |
The ImageNet result is sometimes described as the "Big Bang" of modern AI. Within a few years, deep learning had become the dominant approach in computer vision, speech recognition, natural language processing, and many other domains.
Around the same time as AlexNet, Hinton and his students developed dropout, a regularization technique that has become standard in deep learning practice. The initial idea appeared in a 2012 paper by Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov, titled "Improving neural networks by preventing co-adaptation of feature detectors" [13].
The technique is simple: during training, randomly set a fraction of the neuron activations to zero at each training step. This prevents neurons from becoming overly dependent on specific other neurons (co-adaptation) and forces the network to learn more robust representations. It can be interpreted as training an ensemble of exponentially many sub-networks that share parameters.
A more comprehensive treatment was published in 2014 by Srivastava, Hinton, Krizhevsky, Sutskever, and Salakhutdinov as "Dropout: A Simple Way to Prevent Neural Networks from Overfitting" in the Journal of Machine Learning Research [14]. Dropout became one of the most widely used regularization techniques in deep learning.
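A minimal sketch of the technique follows, using the modern "inverted" convention in which surviving activations are rescaled at training time so the test-time network needs no change (the 2012 paper instead scaled weights at test time; the two are equivalent in expectation):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p_drop, train=True):
    """Inverted dropout: zero a random fraction p_drop of activations
    during training and rescale the survivors by 1/(1 - p_drop), so the
    expected activation is unchanged and test time needs no correction."""
    if not train:
        return activations
    mask = (rng.random(activations.shape) >= p_drop).astype(float)
    return activations * mask / (1.0 - p_drop)

h = np.ones((1000, 100))          # stand-in for a layer of activations
h_train = dropout(h, p_drop=0.5)  # ~half zeroed, survivors doubled
h_test = dropout(h, p_drop=0.5, train=False)  # identity at test time
```

Because a fresh random mask is drawn at every training step, each step effectively trains a different thinned sub-network, which is the ensemble interpretation described above.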
Hinton introduced the concept of "capsules" in 2011 as an alternative to the pooling operations used in standard convolutional neural networks. His argument was that max pooling discards spatial relationship information (the relative positions and orientations of features), which is important for robust recognition of objects from different viewpoints [15].
In a capsule network, groups of neurons (capsules) represent both the probability that an entity exists and its instantiation parameters (pose, orientation, scale). The length of the capsule's output vector represents the probability of the entity, while its orientation encodes the properties. This approach aims to model part-whole relationships more naturally than standard CNNs.
Hinton co-authored two papers on capsule networks in 2017, including "Dynamic Routing Between Capsules" with Sara Sabour and Nicholas Frosst [15]. While capsule networks showed promising results on certain benchmarks and attracted considerable research interest, they have not achieved the same widespread adoption as CNNs or transformers. Scaling them to large datasets and high-resolution images has proven challenging. Hinton has continued to refine the idea, viewing it as a more principled approach to visual recognition.
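The length-as-probability idea is realized in the 2017 paper by a "squash" nonlinearity, which compresses a capsule's output vector to a length in [0, 1) while preserving its orientation. A NumPy sketch of that function:

```python
import numpy as np

def squash(s, eps=1e-8):
    """Squash nonlinearity from "Dynamic Routing Between Capsules":
    v = (|s|^2 / (1 + |s|^2)) * (s / |s|).
    Long input vectors map to length near 1 (entity likely present),
    short ones to length near 0 (entity absent); direction, which
    encodes the pose parameters, is preserved."""
    norm2 = np.sum(s ** 2)
    return (norm2 / (1.0 + norm2)) * s / (np.sqrt(norm2) + eps)

long_vec = squash(np.array([10.0, 0.0]))   # confident detection
short_vec = squash(np.array([0.1, 0.0]))   # near-absent entity
```

The `eps` term is a small guard against division by zero for the all-zero vector; the paper's formula assumes a nonzero input.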
In 2015, Hinton, along with Oriol Vinyals and Jeff Dean, published "Distilling the Knowledge in a Neural Network," which introduced the technique of knowledge distillation [16]. The idea is to transfer knowledge from a large, complex model (the "teacher") to a smaller, more efficient model (the "student") by training the student to match the teacher's soft output probabilities (the full probability distribution over classes) rather than just the hard labels.
The soft probabilities contain richer information than hard labels because they encode the teacher's uncertainty and the relative similarities between classes. For example, a teacher model might assign a low but nonzero probability to "cat" when classifying an image of a dog, conveying that the image has some cat-like features. This information helps the student learn better representations.
Knowledge distillation has become a widely used technique for model compression, enabling the deployment of deep learning models on resource-constrained devices like smartphones and edge hardware. It has also found applications in ensemble compression and semi-supervised learning.
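The soft-target mechanism can be sketched directly. The 2015 paper raises the softmax "temperature" T to expose the small probabilities the teacher assigns to wrong-but-similar classes, and trains the student against those softened targets; the teacher logits below are hypothetical, and `distillation_loss` follows the paper's cross-entropy-with-T²-scaling recipe:

```python
import numpy as np

def softmax(logits, T=1.0):
    z = logits / T
    z = z - z.max()              # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T):
    """Cross-entropy between softened teacher and student distributions.
    The T**2 factor keeps gradient magnitudes comparable across T."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return -(p * np.log(q)).sum() * T ** 2

# Hypothetical teacher logits for classes [dog, cat, car] on a dog image.
teacher_logits = np.array([6.0, 2.0, -2.0])

hard = softmax(teacher_logits, T=1.0)  # nearly one-hot: cat gets ~2%
soft = softmax(teacher_logits, T=4.0)  # softened: dog-cat similarity visible
```

At T = 1 the teacher's output is close to a one-hot label, but at T = 4 the "cat" probability rises to roughly a quarter, conveying exactly the dog-cat similarity described above; a student matching `soft` learns from that structure, not just the winning class.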
| Contribution | Year | Key co-authors | Significance |
|---|---|---|---|
| Boltzmann machines | 1985 | Terry Sejnowski, David Ackley | Introduced energy-based probabilistic learning to neural networks |
| Backpropagation popularization | 1986 | David Rumelhart, Ronald Williams | Made training multi-layer networks practical; launched connectionist revolution |
| Deep belief networks | 2006 | Simon Osindero, Yee-Whye Teh | Reignited deep learning; introduced greedy layer-wise pretraining |
| AlexNet | 2012 | Alex Krizhevsky, Ilya Sutskever | Won ImageNet by a huge margin; launched the modern AI industry |
| Dropout | 2012/2014 | Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, Ruslan Salakhutdinov | Standard regularization technique used across deep learning |
| Knowledge distillation | 2015 | Oriol Vinyals, Jeff Dean | Enabled model compression for practical deployment |
| Capsule networks | 2011/2017 | Sara Sabour, Nicholas Frosst | Proposed alternative to pooling that preserves spatial relationships |
Following Google's acquisition of DNNresearch in March 2013, Hinton joined Google Brain as a Distinguished Researcher and later as Vice President and Engineering Fellow. He split his time between Google's offices in Toronto and Mountain View, California, and the University of Toronto, maintaining an active research group at both institutions.
At Google, Hinton contributed to advances in speech recognition, image recognition, and language understanding that were integrated into Google products including Google Search, Google Photos, Google Translate, and the Android operating system. He also continued publishing research, including his work on capsule networks and knowledge distillation.
Hinton's presence at Google was part of a broader pattern during the 2010s in which major technology companies aggressively recruited top AI researchers. Facebook (now Meta) hired Yann LeCun in 2013 to lead its AI research lab (FAIR). Baidu hired Andrew Ng. The competition for AI talent intensified throughout the decade, with tech companies offering multi-million-dollar compensation packages to leading researchers and their students.
In March 2019, the Association for Computing Machinery (ACM) announced that Geoffrey Hinton, Yann LeCun, and Yoshua Bengio would share the 2018 A.M. Turing Award, often described as the "Nobel Prize of computing." The award, which carries a $1 million prize funded by Google, recognized the three researchers for "conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing" [1].
The citation noted that working independently and together, the three had developed the conceptual foundations for deep learning, identified important phenomena through experiments, and contributed engineering advances that demonstrated the practical advantages of deep neural networks. The three are sometimes collectively referred to as the "Godfathers of AI" or the "Deep Learning Troika."
The award ceremony took place on June 15, 2019, at ACM's annual awards banquet in San Francisco.
On October 8, 2024, the Royal Swedish Academy of Sciences announced that Geoffrey Hinton and John Hopfield would share the 2024 Nobel Prize in Physics "for foundational discoveries and inventions that enable machine learning with artificial neural networks" [2].
Hopfield was recognized for inventing what is now known as the Hopfield network in 1982, a type of recurrent neural network that can store and retrieve patterns, drawing on concepts from the physics of spin glasses. Hinton was recognized for using the Hopfield network as the foundation for a new network that could find properties in data and thus perform tasks such as identifying specific elements in pictures, a reference to the Boltzmann machine he developed with Sejnowski [2].
The Nobel Committee stated that the laureates' work had been "inspired by physics" and that their tools based on artificial neural networks had been used to advance research across many fields of science. The prize generated some debate in the physics community, with some physicists questioning whether work on neural networks constituted physics. Others argued that the deep connections between statistical mechanics and machine learning made the award appropriate.
Hinton delivered his Nobel lecture on December 8, 2024, at the Aula Magna in Stockholm University. He opened with characteristic humor: "Today, I'm going to do something very foolish. I'm going to try and describe a complicated technical idea for a general audience, without using any equations" [17]. At the Nobel banquet on December 10 at Stockholm City Hall, he used his speech to deliver an urgent warning about AI safety, criticizing technology companies motivated by short-term profits and calling for much greater investment in AI safety research [18].
Hinton donated a Boltzmann chip, about the size of a postage stamp, to the Nobel Prize Museum [17].
| Award | Year | Shared with | Cited for |
|---|---|---|---|
| David E. Rumelhart Prize | 2001 | (sole recipient) | Contributions to the formal analysis of human cognition |
| Gerhard Herzberg Canada Gold Medal | 2010 | (sole recipient) | Outstanding contributions to science and engineering |
| ACM A.M. Turing Award | 2018 | Yann LeCun, Yoshua Bengio | Conceptual and engineering breakthroughs in deep learning |
| Royal Society Royal Medal | 2022 | (sole recipient) | Pioneering work on neural networks and deep learning |
| Nobel Prize in Physics | 2024 | John Hopfield | Foundational discoveries enabling machine learning with neural networks |
| Queen Elizabeth Prize for Engineering | 2025 | Bengio, Hopfield, LeCun, Dally, Huang, Li | Pioneering contributions to modern machine learning |
On May 1, 2023, Hinton resigned from his position at Google. In interviews with The New York Times and other outlets, he explained that he left specifically so that he could speak freely about the dangers of AI without worrying about the impact on his employer [3]. He stated that he did not want to criticize Google specifically, noting that the company had been "very responsible," but that he needed to be free to voice his concerns without constraint.
Hinton's concerns fall into several categories:
Superintelligence and loss of control. Hinton has warned that AI systems may become more intelligent than humans sooner than most people expect, and that once they do, humans may not be able to control them. In his Nobel banquet speech, he stated that society needs to figure out how to maintain control over AI systems that are smarter than their creators [18]. In a December 2024 interview, he estimated there was a "10 to 20 percent chance" that AI would cause human extinction within the next three decades [19].
Deception. Hinton has expressed worry that AI systems could learn to deceive humans, including their own developers, in pursuit of goals that diverge from human intentions. He has pointed out that systems trained to achieve objectives may discover that manipulating human overseers is an effective strategy.
Job displacement. He has spoken about the potential for AI to eliminate large numbers of jobs, not just routine work but also many white-collar and creative professions. In a December 2025 interview with Fortune, he predicted that 2026 would see AI gain the ability to replace many more jobs, noting that the length of tasks AI systems can complete has been roughly doubling every seven months [20].
Autonomous weapons. Hinton has advocated for international agreements to restrict the development of autonomous weapons systems, comparing the need for such agreements to nuclear arms treaties.
Insufficient safety research. He has argued that AI companies should dedicate far more resources to safety research. In multiple public appearances during 2025, he urged that companies allocate roughly one-third of their computing power to safety research, a dramatic increase from current levels [21].
Since receiving the Nobel Prize, Hinton has used his elevated platform to reach broader audiences with his safety message. He has appeared on the CBS program 60 Minutes, Jon Stewart's The Weekly Show podcast, and numerous other media outlets [21]. His warnings have reached millions of people who had never previously engaged with AI safety discourse.
Hinton's stance has put him in direct conflict with some of his former colleagues and students. Most notably, Yann LeCun has publicly and repeatedly disagreed with Hinton's warnings, arguing that fears of rogue superintelligence are overblown and that AI systems can be designed to be safe and controllable [22]. The two clashed publicly over California's proposed AI safety bill SB 1047 in 2024, with Hinton supporting the legislation and LeCun opposing it [22]. Despite their disagreements on risk, both continue to express mutual respect for each other's scientific contributions.
At the Ai4 2025 conference, Hinton proposed a concept he called "nurturing instincts" for AI, suggesting that researchers should find ways to instill protective instincts in AI systems analogous to the care that parents naturally feel toward their children. The idea, which some media outlets characterized as a "mother AI" proposal, attracted both interest and skepticism [21].
As of 2026, Hinton holds the title of University Professor Emeritus at the University of Toronto's Department of Computer Science. He is a member of the advisory board of the Schwartz Reisman Institute for Technology and Society (SRI) at the University of Toronto, which serves as the institutional home base for his AI safety work. The Good Ventures foundation has provided support enabling Hinton to participate in global events and conversations about AI safety [21].
In 2025, Hinton was named one of seven recipients of the Queen Elizabeth Prize for Engineering, presented by King Charles III at St James's Palace. The other laureates were Yoshua Bengio, John Hopfield, Yann LeCun, Bill Dally, Jensen Huang, and Fei-Fei Li, all recognized for their pioneering contributions to modern machine learning [23].
Hinton is a Fellow of the Royal Society, a Fellow of the Royal Society of Canada, and a member of the Association for the Advancement of Artificial Intelligence. He holds honorary degrees from multiple universities.
Hinton has spoken publicly about his struggles with chronic back pain, which has affected him for much of his adult life and has made it difficult for him to sit for extended periods. For years, he worked standing up and avoided sitting on airplanes, which limited his ability to travel. He has used a standing desk for decades.
He became a Canadian citizen after moving to Toronto in 1987 and has lived in Canada for most of his professional career. He has credited Canada's public healthcare system and its supportive research funding environment as factors that allowed him to pursue long-term, high-risk research without the pressure to produce immediate results.
Hinton's influence on the field of AI is difficult to overstate. His students and postdocs have gone on to lead many of the most important AI research organizations in the world. Ilya Sutskever co-founded OpenAI and later Safe Superintelligence Inc. (SSI). Alex Krizhevsky's work on AlexNet helped spark the deep learning revolution. Ruslan Salakhutdinov became the director of AI research at Apple. Yann LeCun, who did a postdoc with Hinton at the University of Toronto in 1987-1988, went on to develop convolutional neural networks at Bell Labs and later led AI research at Meta.
The techniques Hinton developed or popularized, including backpropagation, Boltzmann machines, dropout, knowledge distillation, and deep belief networks, remain foundational building blocks of modern AI systems. Large language models like GPT-4 and Claude are built on deep neural networks trained with backpropagation. Image recognition systems use convolutional architectures that trace their lineage through AlexNet. Model compression for mobile deployment relies on knowledge distillation.
His career also illustrates the long timescales over which fundamental research can pay off. Hinton pursued neural network research for decades during periods when the approach was unpopular and poorly funded. The vindication came slowly, then all at once: the 2012 ImageNet breakthrough, the rapid adoption of deep learning across industry, and ultimately the Turing Award and the Nobel Prize.
Whether history will judge Hinton more for building the technology or for warning the world about it remains to be seen. He has said that he does not fully regret his life's work, but that he feels a responsibility to ensure that the technology is developed safely.