Karén Simonyan
Last reviewed
Jun 8, 2026
Sources
11 citations
Review status
Source-backed
Revision
v1 · 1,483 words
Karén Simonyan (his academic papers appear under the spelling Karen Simonyan) is a British computer scientist and artificial intelligence researcher who serves as Chief Scientist of Microsoft AI. He is best known as the co-author, with Andrew Zisserman, of the 2014 VGGNet architecture, one of the most influential and widely cited works in modern computer vision, and for a string of landmark systems built during his years at Google DeepMind, including WaveNet, AlphaGo Zero, and AlphaZero. In 2022 he co-founded Inflection AI with Mustafa Suleyman and Reid Hoffman, and in 2024 he moved with most of that team to Microsoft, where he leads the science behind the company's in-house MAI models. [1][2]
Simonyan studied at the University of Oxford, where he completed a doctorate (DPhil) in 2013 within the Visual Geometry Group (VGG). His thesis, "Large-Scale Learning of Discriminative Image Representations," was supervised by Andrew Zisserman and Antonio Criminisi, the latter then at Microsoft Research in Cambridge. His early research concerned image representations and descriptors, the building blocks that visual recognition systems use to compare and classify images. He stayed on at Oxford as a postdoctoral researcher after completing his doctorate, a period in which he produced much of the work for which he first became known. [9][10]
In 2014 Simonyan and Zisserman published "Very Deep Convolutional Networks for Large-Scale Image Recognition," the paper that introduced the architecture universally known as VGG or VGGNet. Its central finding was that pushing a convolutional neural network to 16 or 19 weight layers, while using only small 3x3 convolution filters stacked on top of one another, produced large gains in accuracy. The design was strikingly simple and uniform, which made it easy to understand, reproduce, and adapt to new tasks. Entered in the ImageNet Large Scale Visual Recognition Challenge 2014, the team's networks took first place in the localization track and second place in classification. The paper was released on arXiv in September 2014 and presented at the International Conference on Learning Representations in 2015. It went on to become one of the most cited papers in all of computer science, with well over 100,000 citations, and the pre-trained VGG-16 and VGG-19 models became standard backbones and feature extractors used throughout the field. [3][5]
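The appeal of stacking small filters can be checked with a little arithmetic. The sketch below is illustrative only: the helper names and the channel width of 256 are assumptions chosen for the example, and biases are ignored. It compares three stacked stride-1 3x3 convolutions with a single 7x7 convolution, which cover the same receptive field.

```python
def stacked_receptive_field(num_layers, kernel=3):
    """Effective receptive field of `num_layers` stacked stride-1 convolutions."""
    rf = 1
    for _ in range(num_layers):
        rf += kernel - 1  # each stride-1 layer widens the field by (kernel - 1)
    return rf


def conv_weights(kernel, channels, num_layers=1):
    """Weight count (biases ignored) for conv layers with `channels` inputs and outputs."""
    return num_layers * kernel * kernel * channels * channels


CHANNELS = 256  # assumed width, chosen only for illustration

three_3x3_field = stacked_receptive_field(3)                # same coverage as one 7x7 filter
three_3x3_weights = conv_weights(3, CHANNELS, num_layers=3)
one_7x7_weights = conv_weights(7, CHANNELS)

print(three_3x3_field)     # 7
print(three_3x3_weights)   # 1769472
print(one_7x7_weights)     # 3211264
```

Under these assumptions the stacked design covers the same 7x7 window with roughly 45 percent fewer weights, and it places a nonlinearity after each of the three layers rather than after one.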
The same period produced several other influential results. With Zisserman he introduced "Two-Stream Convolutional Networks for Action Recognition in Videos" at NeurIPS 2014, which split video understanding into a spatial stream over still frames and a temporal stream over optical flow, an approach that shaped years of subsequent work on action recognition. He co-authored an early interpretability paper, "Deep Inside Convolutional Networks," which used gradient-based saliency maps to visualize what an image classifier responds to, as well as the BMVC 2014 best-paper winner "Return of the Devil in the Details." Around this time Simonyan co-founded Vision Factory, an Oxford spin-out applying deep learning to visual recognition. The company was acquired by Google DeepMind in October 2014. [4][9][10]
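The idea behind a gradient-based saliency map can be sketched in a few lines. This is a toy illustration, not the paper's implementation: `toy_class_score` is a hypothetical stand-in for a trained classifier's class score, and the gradient is estimated with finite differences rather than the backpropagation pass a real network would use. Each pixel's saliency is taken as the largest absolute gradient across its color channels.

```python
import math


def toy_class_score(image):
    """Hypothetical class score over image[y][x][channel]; stands in for a real network."""
    # Depends strongly on the red channel of pixel (0, 1) and weakly on blue of (1, 0).
    return math.tanh(2.0 * image[0][1][0] - 0.5 * image[1][0][2])


def saliency_map(score_fn, image, eps=1e-5):
    """Per-pixel saliency: max over channels of |d score / d pixel value|."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    saliency = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for c in range(channels):
                original = image[y][x][c]
                image[y][x][c] = original + eps
                score_up = score_fn(image)
                image[y][x][c] = original - eps
                score_down = score_fn(image)
                image[y][x][c] = original  # restore the pixel before moving on
                grad = (score_up - score_down) / (2 * eps)  # central difference
                saliency[y][x] = max(saliency[y][x], abs(grad))
    return saliency


# A 2x2 RGB image with arbitrary values, used only to exercise the function.
image = [[[0.1, 0.2, 0.3] for _ in range(2)] for _ in range(2)]
smap = saliency_map(toy_class_score, image)
for row in smap:
    print(row)
```

In this toy setup the pixel at row 0, column 1 receives the highest saliency because the score depends most strongly on it, while pixels the score never reads receive zero.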
Through the Vision Factory acquisition, Simonyan joined DeepMind in late 2014. He became a Principal Research Scientist and created and led the laboratory's deep learning scaling team. Over the following years he contributed to a series of widely used systems. He was a co-author on Spatial Transformer Networks (2015), a module that lets a network learn to spatially warp its inputs, and on WaveNet (2016), the raw-audio generative model that set a new standard for text-to-speech and later powered Google Assistant voices. He contributed to the AlphaGo Zero and AlphaZero systems (2017), which reached superhuman play in Go, chess, and shogi purely through self-play reinforcement learning, and to BigGAN (2018), then the state of the art in high-fidelity image generation. The output of his scaling team also fed into later DeepMind milestones, including the StarCraft II agent AlphaStar and the protein-structure-prediction system AlphaFold. [1][9][11]
In 2022 Simonyan left DeepMind to co-found Inflection AI with Mustafa Suleyman, a DeepMind co-founder, and Reid Hoffman, the co-founder of LinkedIn. Simonyan served as the company's Chief Scientist, leading model development, while Suleyman was chief executive. Inflection positioned itself around the idea of personal AI. In May 2023 it launched Pi, short for "personal intelligence," a chatbot designed for empathetic, conversational interaction rather than purely transactional question answering. The underlying models, branded Inflection-1, Inflection-2, and the March 2024 Inflection-2.5, were built to be competitive with the leading large language models of the time. In June 2023 the startup announced $1.3 billion in new funding led by Microsoft, NVIDIA, and existing investors, valuing it at around $4 billion and making it one of the best-capitalized AI startups in the world. [7][8]
In March 2024 Microsoft announced that Suleyman would join as Executive Vice President and CEO of a new consumer-facing division called Microsoft AI, with Simonyan joining as Chief Scientist and reporting to him. The pair brought most of Inflection's roughly 70 employees with them, in an unusual arrangement under which Microsoft hired the team and licensed Inflection's technology rather than buying the company outright. Reporting indicated Microsoft paid Inflection roughly $650 million in licensing and related fees, and Inflection continued as an independent business under a new chief executive. [6][7]
At Microsoft, Simonyan leads the research behind the division's in-house MAI models. In August 2025 Microsoft AI unveiled its first two: MAI-Voice-1, an expressive speech-generation model fast enough to produce a minute of audio in under a second on a single GPU, and MAI-1-preview, the company's first foundation model trained end to end in house. MAI-1-preview uses a mixture-of-experts design and was trained on roughly 15,000 NVIDIA H100 GPUs, debuting on the public LMArena leaderboard before rolling into Copilot experiences. The MAI family expanded through 2026 with additional text, image, voice, and reasoning models such as MAI-Image-2, MAI-Transcribe-1, and MAI-Thinking-1. [2]
On November 6, 2025, Microsoft announced the MAI Superintelligence Team, a new group led by Suleyman and dedicated to what the company calls humanist superintelligence, defined as advanced AI capabilities that work in service of people rather than as an end in themselves. Simonyan is the team's chief scientist. The unit draws researchers from DeepMind, Google, Meta, OpenAI, and Anthropic alongside the former Inflection contingent, and Microsoft framed it as a deliberately measured stance, with Suleyman stating that the company rejects "narratives about a race to AGI." As of 2026 Simonyan continues as Chief Scientist of Microsoft AI. [1][2]
| Year | Work | Simonyan's role |
|---|---|---|
| 2014 | VGGNet ("Very Deep Convolutional Networks") | Co-author; deep network of stacked 3x3 filters, ImageNet 2014 winner in localization |
| 2014 | Two-Stream Convolutional Networks | Co-author; spatial and temporal streams for video action recognition |
| 2014 | "Deep Inside Convolutional Networks" | Co-author; gradient saliency maps for model interpretability |
| 2015 | Spatial Transformer Networks | Co-author (DeepMind) |
| 2016 | WaveNet | Co-author; generative raw-audio model for speech synthesis |
| 2017 | AlphaGo Zero and AlphaZero | Contributor; self-play reinforcement learning |
| 2018 | BigGAN | Co-author; high-fidelity image generation |
| 2023 | Inflection-1 and Pi | Chief Scientist; led model development |
| 2025 | MAI-1-preview and MAI-Voice-1 | Chief Scientist; first Microsoft AI in-house models |
Simonyan's research is among the most cited in machine learning. His Google Scholar profile lists more than 300,000 citations, driven above all by the VGG paper, which ranks among the most cited works in computer science. His Oxford-era papers collected honors including the BMVC 2014 best-paper prize, and several of the architectures and techniques he helped create, from VGG and two-stream networks to WaveNet and AlphaZero, are now taught as standard material in deep learning courses. [5][9]