# Oriol Vinyals

> Source: https://aiwiki.ai/wiki/oriol_vinyals
> Updated: 2026-06-09
> Categories: AI Research, Google DeepMind, People
> From AI Wiki (https://aiwiki.ai), a free encyclopedia of artificial intelligence. Quote with attribution.

**Oriol Vinyals** (born 1983, Sabadell, Catalonia, Spain) is a Spanish machine learning researcher who serves as Vice President of Research at [Google DeepMind](/wiki/google_deepmind) and co-technical lead of the [Gemini](/wiki/gemini) family of multimodal foundation models.[^1][^2] He is a co-author of the 2014 paper "Sequence to Sequence Learning with Neural Networks" that helped popularize neural encoder-decoder architectures for [machine translation](/wiki/machine_translation), and he led the AlphaStar project that achieved Grandmaster level in StarCraft II, reported on the cover of *Nature* in 2019.[^3][^4] Before joining DeepMind he was a researcher in the [Google Brain](/wiki/google_brain) team, and across his career he has contributed to seq2seq, Pointer Networks, Matching Networks, image captioning, [knowledge distillation](/wiki/knowledge_distillation), [AlphaStar](/wiki/alphastar), [AlphaFold](/wiki/alphafold), [AlphaCode](/wiki/alphacode), Flamingo, and the [Gemini](/wiki/gemini) series.[^1][^2] His Google Scholar profile lists more than 430,000 citations and an h-index of 114 as of 2026.[^2]

| Item | Details |
| --- | --- |
| Born | 1983, Sabadell, Catalonia, Spain[^5] |
| Nationality | Spanish[^5] |
| Education | Telecommunications engineering and mathematics, Universitat Politècnica de Catalunya (CFIS); MS in computer science, UC San Diego; PhD in EECS, UC Berkeley (2013)[^5][^6] |
| Doctoral advisor | Nelson Morgan (UC Berkeley EECS)[^6] |
| Dissertation | "Beyond Deep Learning: Scalable Methods and Models for Learning" (filed 12 December 2013)[^6] |
| Employer | [Google DeepMind](/wiki/google_deepmind) (Vice President of Research; co-technical lead, [Gemini](/wiki/gemini))[^1][^7] |
| Notable works | Seq2Seq (2014); Show and Tell (2015); Pointer Networks (2015); Matching Networks (2016); [AlphaStar](/wiki/alphastar) (2019); Flamingo (2022); [Gemini](/wiki/gemini) (2023 onward)[^2][^3][^4][^8][^9][^10][^11] |
| Honorary doctorate | Universitat Politècnica de Catalunya, conferred 26 November 2025[^5] |

## Early life and education

Vinyals was born in 1983 in Sabadell, a city in the Vallès Occidental county of Catalonia, Spain.[^5] He studied a double degree in telecommunications engineering and mathematics at the Universitat Politècnica de Catalunya (UPC) in Barcelona, enrolling in the interdisciplinary CFIS programme together with coursework at the Barcelona School of Telecommunications Engineering (ETSETB) and the School of Mathematics and Statistics (FME); he completed his undergraduate thesis at Carnegie Mellon University in 2006.[^5]

He then moved to the United States. He earned a master's degree in computer science at the University of California, San Diego, before transferring to the University of California, [Berkeley](/wiki/uc_berkeley) for doctoral work in Electrical Engineering and Computer Sciences (EECS).[^1][^5] His dissertation, "Beyond Deep Learning: Scalable Methods and Models for Learning," was advised by speech researcher Nelson Morgan and was filed in late 2013.[^6] The thesis focused on faster and more robust optimization for deep architectures, simpler deep models, theoretical bounds against shallow sparse-coding networks, and applications to acoustic modeling and object recognition.[^6] During his PhD he held research internships at Microsoft Research and Google.[^5]

## Career

### Google Brain (from 2013)

After defending his PhD, Vinyals joined the [Google Brain](/wiki/google_brain) team in Mountain View as a research scientist focused on deep learning for text, speech and vision.[^1][^5] During his Brain years he co-authored several early neural sequence modeling papers, including the seq2seq, Show and Tell, and Pointer Networks papers.[^3][^8][^9] His Google Research profile lists him as principal scientist and team lead of the Deep Learning group, and notes that his early Brain work fed directly into TensorFlow, Google Translate, text-to-speech, and speech recognition products.[^1]

### Google DeepMind (2016 onward)

He moved to [DeepMind](/wiki/google_deepmind) in London in 2016, continuing as a research scientist and rising to Vice President of Research and head of the Deep Learning group.[^5] At DeepMind he led the AlphaStar effort on StarCraft II from roughly 2017 to 2019 and was a senior contributor to AlphaCode, AlphaFold, and Flamingo.[^2][^4][^10] Following the April 2023 merger of [Google Brain](/wiki/google_brain) and [DeepMind](/wiki/deepmind) into Google DeepMind, he became co-technical lead of the [Gemini](/wiki/gemini) project alongside [Noam Shazeer](/wiki/noam_shazeer) and [Jeff Dean](/wiki/jeff_dean), reporting up to DeepMind chief executive [Demis Hassabis](/wiki/demis_hassabis).[^5][^7]

He was named one of MIT Technology Review's "35 Innovators Under 35" in 2016 for the seq2seq line of work, and he served as program chair for the International Conference on Learning Representations (ICLR) in 2017 and 2018.[^1] On 26 November 2025 the Universitat Politècnica de Catalunya conferred an honorary doctorate on him at a ceremony held in the Vèrtex building on the North Diagonal Campus; he delivered a masterclass titled "From AI to AGI: The Quest for True Intelligence" two days later.[^5]

## Selected research contributions

### Sequence to sequence learning (2014)

Vinyals, with [Ilya Sutskever](/wiki/ilya_sutskever) and Quoc V. Le, published "Sequence to Sequence Learning with Neural Networks" at NeurIPS 2014.[^3] The paper showed that one multilayer [LSTM](/wiki/lstm) could encode a variable-length input sequence into a fixed-dimensional vector and a second multilayer LSTM could decode the output sequence from that vector, with both trained end-to-end by backpropagation.[^3] On the WMT'14 English-to-French translation benchmark the model reached a BLEU score of 34.8 on the entire test set, surpassing a phrase-based SMT baseline at 33.3, and, when used to rerank the SMT n-best list, reached 36.5 BLEU.[^3] The paper also reported that reversing the word order of each source sentence, while leaving the target unchanged, markedly improved learning by introducing many short-term dependencies between source and target words.[^3] On Google Scholar the paper has accumulated more than 32,000 citations.[^2] The seq2seq formulation became the template for neural [machine translation](/wiki/machine_translation) and, later, for many tasks that frame language generation as producing an output sequence conditioned on an input.
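The effect of source reversal on token distances can be checked with a little arithmetic. The sketch below is an illustration, not the paper's code. It assumes a toy setting in which source and target both have length `n`, the source is fed first and the target is decoded immediately afterwards, and word `i` of the source aligns with word `i` of the target. The helper name `alignment_lags` is hypothetical.

```python
def alignment_lags(n, reversed_source):
    """Return the time lag between each source word and its aligned target word.

    Toy assumption: source word i aligns with target word i, and the
    target sequence starts right after the n source tokens.
    """
    lags = []
    for i in range(n):
        # Position at which source word i is fed to the encoder.
        src_pos = (n - 1 - i) if reversed_source else i
        # Position at which target word i is produced by the decoder.
        tgt_pos = n + i
        lags.append(tgt_pos - src_pos)
    return lags


plain = alignment_lags(4, reversed_source=False)   # [4, 4, 4, 4]
flipped = alignment_lags(4, reversed_source=True)  # [1, 3, 5, 7]
print(plain, flipped)
```

Under these assumptions the mean lag is the same in both cases (4), but the smallest lag drops from 4 to 1, so the first target words sit right next to the source words they depend on.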

### Show and Tell: a neural image caption generator (2014-2015)

In November 2014 Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan posted "Show and Tell: A Neural Image Caption Generator," presented at CVPR 2015.[^8] The system combined a [convolutional neural network](/wiki/convolutional_neural_network) image encoder pretrained on ImageNet with an [LSTM](/wiki/lstm) language decoder, trained to maximize the likelihood of the reference caption.[^8] The model achieved BLEU-1 scores of 59 on Pascal (against a prior best of 25) and 66 on Flickr30k, and a BLEU-4 of 27.7 on the COCO captioning benchmark.[^8] Show and Tell is often cited as the canonical early example of using a sequence decoder conditioned on a [perception encoder](/wiki/perception_encoder), a recipe that recurs in modern multimodal systems.

### Pointer Networks (2015)

With Meire Fortunato and Navdeep Jaitly, Vinyals introduced "Pointer Networks" at NeurIPS 2015.[^9] Where standard [attention](/wiki/attention) blends encoder hidden states into a context vector, the Pointer Network uses attention as a pointer that selects an element of the input sequence as the output, accommodating variable-sized output dictionaries.[^9] The authors demonstrated the architecture on three combinatorial problems: planar convex hulls, Delaunay triangulations, and the symmetric planar Travelling Salesman Problem.[^9] Pointer Networks have since been re-used in extractive summarization, code generation, and neural combinatorial optimization.
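The pointing mechanism can be sketched in a few lines of plain Python. The scoring rule `u_j = v . tanh(W1 e_j + W2 d)` follows the form given in the Pointer Networks paper. The toy vectors, the identity weight matrices, and the function names are assumptions made for illustration; a real model would learn `W1`, `W2`, and `v`.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_distribution(encoder_states, decoder_state, W1, W2, v):
    """Return a probability distribution over input positions.

    The output length always equals the number of encoder states,
    which is how the output dictionary follows the input size.
    """
    proj_d = matvec(W2, decoder_state)
    scores = []
    for e in encoder_states:
        proj_e = matvec(W1, e)
        hidden = [math.tanh(a + b) for a, b in zip(proj_e, proj_d)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    return softmax(scores)


# Toy example: three input elements, 2-dimensional states (illustrative values).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [1.0, 1.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
probs = pointer_distribution(encoder_states, decoder_state, identity, identity, [1.0, 1.0])
pointed_index = max(range(len(probs)), key=probs.__getitem__)
print(probs, pointed_index)
```

The output is an index into the input rather than a token from a fixed vocabulary, so the same parameters apply unchanged to inputs of any length.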

### Matching Networks for one-shot learning (2016)

In 2016 Vinyals, Charles Blundell, Timothy Lillicrap, Koray Kavukcuoglu, and Daan Wierstra published "Matching Networks for One Shot Learning" at NeurIPS.[^11] The system learns an embedding plus a nearest-neighbor-style classifier that maps a small labeled support set and an unlabeled query to a label without fine-tuning.[^11] Training is episodic: each episode presents a small support set and held-out queries, so training conditions match those at meta-test time, an idea that is now standard in few-shot learning. The paper reported improvements in one-shot accuracy on miniImageNet and Omniglot over previous baselines and helped open the meta-learning research line that followed.[^11]
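The nearest-neighbor-style classifier described above can be sketched as attention over the support set. This is a minimal illustration that assumes the embeddings are already computed (in the paper they come from learned networks) and that attention is a softmax over cosine similarities. The function names and toy vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def matching_predict(support, query, num_classes):
    """Classify a query embedding from a labeled support set.

    support: list of (embedding, label) pairs with integer labels.
    Returns class probabilities: attention weights summed per label.
    """
    sims = [cosine(query, emb) for emb, _ in support]
    m = max(sims)
    exps = [math.exp(s - m) for s in sims]
    total = sum(exps)
    attention = [e / total for e in exps]

    probs = [0.0] * num_classes
    for weight, (_, label) in zip(attention, support):
        probs[label] += weight
    return probs


# One-shot, two-way toy episode: one support example per class.
support = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
probs = matching_predict(support, [0.9, 0.1], num_classes=2)
print(probs)
```

No parameters are updated at prediction time. Swapping in a different support set changes the classifier, which is what makes the approach usable for new classes without fine-tuning.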

### Knowledge distillation (2015)

Vinyals co-authored, with Geoffrey Hinton and Jeff Dean, "Distilling the Knowledge in a Neural Network," which formalized [knowledge distillation](/wiki/knowledge_distillation) as training a smaller student model to match the softened output probabilities of a larger teacher.[^2] The work, listed among Vinyals's most-cited papers with more than 32,000 citations, became the template for compressing neural models for deployment.[^2]
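The "softened output probabilities" mentioned above come from dividing the logits by a temperature before the softmax. The sketch below shows that step and a cross-entropy between the softened teacher and student distributions. The logit values and function names are illustrative assumptions, and a practical training loss would typically also include a term on the true labels, which is omitted here.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax of logits / temperature; higher temperature gives a flatter distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the softened teacher and student distributions."""
    teacher_probs = softmax_with_temperature(teacher_logits, temperature)
    student_probs = softmax_with_temperature(student_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(teacher_probs, student_probs))


teacher_logits = [4.0, 1.0, 0.0]  # illustrative values
sharp = softmax_with_temperature(teacher_logits, temperature=1.0)
soft = softmax_with_temperature(teacher_logits, temperature=4.0)

matched = distillation_loss(teacher_logits, teacher_logits)
mismatched = distillation_loss([0.0, 1.0, 4.0], teacher_logits)
print(sharp, soft, matched, mismatched)
```

Raising the temperature exposes the relative probabilities the teacher assigns to the incorrect classes, and the loss is smallest when the student reproduces the teacher's softened distribution.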

### AlphaStar (2017-2019)

Vinyals led the AlphaStar effort at DeepMind, which set out to play the real-time strategy game StarCraft II at human-professional level.[^4] AlphaStar combined supervised learning from human replays with multi-agent reinforcement learning in a "league" where new agents trained against a diverse population, including specialized exploiter agents designed to expose weaknesses.[^4] The system played the full game over the live online ladder for all three races (Protoss, Terran, Zerg) and reached Grandmaster level, placing within the top 0.2% of human players.[^4] The result, "Grandmaster level in StarCraft II using multi-agent reinforcement learning," appeared as the cover article of *Nature* on 30 October 2019, with Vinyals as first and corresponding author.[^4]
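One way to picture league training is as opponent sampling that favors the opponents a learning agent currently struggles against. The sketch below is a simplified illustration and not the published implementation: the weighting `(1 - win_rate) ** p`, the agent names, and the win rates are all assumptions chosen for the example.

```python
import random


def opponent_weights(win_rates, p=2.0):
    """Map each opponent to a sampling probability.

    win_rates: dict of opponent name -> learner's estimated win rate in [0, 1].
    Opponents the learner beats less often receive more weight.
    """
    raw = {name: (1.0 - rate) ** p for name, rate in win_rates.items()}
    total = sum(raw.values())
    if total == 0.0:
        # The learner beats everyone: fall back to uniform sampling.
        return {name: 1.0 / len(raw) for name in raw}
    return {name: value / total for name, value in raw.items()}


def sample_opponent(win_rates, rng, p=2.0):
    """Draw one opponent name according to the difficulty-based weights."""
    weights = opponent_weights(win_rates, p)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical league snapshot.
win_rates = {"main_v1": 0.9, "main_v2": 0.5, "exploiter_v1": 0.2}
weights = opponent_weights(win_rates)
opponent = sample_opponent(win_rates, random.Random(0))
print(weights, opponent)
```

With these numbers the exploiter, which the learner beats only 20% of the time, is sampled most often, so training time concentrates on the weaknesses it exposes.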

### Flamingo (2022)

In April 2022 Vinyals was a senior contributor on "Flamingo: a Visual Language Model for Few-Shot Learning," led by Jean-Baptiste Alayrac and colleagues at DeepMind.[^10] Flamingo bridged a frozen pretrained vision encoder and a frozen large language model (Chinchilla, with up to 70B parameters) using a Perceiver-style resampler and gated cross-attention layers, producing an 80B-parameter visual language model capable of handling arbitrarily interleaved images, videos, and text.[^10] The model set few-shot state-of-the-art results on captioning, visual question answering, and video understanding benchmarks at the time, often beating prior fully-finetuned task-specific models with only a handful of in-context examples.[^10] Flamingo is widely regarded as a key precursor of the multimodal language models that followed.
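The gating idea can be sketched with a single text vector attending over a few visual tokens. This is a heavily simplified illustration: it assumes a tanh gate on a learnable scalar `alpha` that starts at zero, uses single-head dot-product attention with no learned projections, and uses made-up vectors. The function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(text_vec, visual_tokens):
    """Attend from one text vector over visual tokens (no learned projections)."""
    scale = math.sqrt(len(text_vec))
    scores = [sum(t * v for t, v in zip(text_vec, tok)) / scale for tok in visual_tokens]
    weights = softmax(scores)
    dim = len(visual_tokens[0])
    return [sum(w * tok[d] for w, tok in zip(weights, visual_tokens)) for d in range(dim)]


def gated_cross_attention(text_vec, visual_tokens, alpha):
    """Residual update: text + tanh(alpha) * cross-attention output."""
    attended = cross_attend(text_vec, visual_tokens)
    gate = math.tanh(alpha)
    return [t + gate * a for t, a in zip(text_vec, attended)]


text_vec = [1.0, 0.0]
visual_tokens = [[0.0, 2.0], [2.0, 0.0]]

at_init = gated_cross_attention(text_vec, visual_tokens, alpha=0.0)
after_training = gated_cross_attention(text_vec, visual_tokens, alpha=1.0)
print(at_init, after_training)
```

With `alpha = 0` the layer passes the text representation through unchanged, so the frozen language model initially behaves as it did before the visual layers were added; visual information enters gradually as `alpha` moves away from zero.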

### Gemini (2023 onward)

Following the merger of [Google Brain](/wiki/google_brain) and [DeepMind](/wiki/deepmind) into [Google DeepMind](/wiki/google_deepmind) in April 2023, Vinyals was named co-technical lead of [Gemini](/wiki/gemini), Google's family of natively multimodal models, with [Noam Shazeer](/wiki/noam_shazeer) and [Jeff Dean](/wiki/jeff_dean).[^5][^7] The Gemini 1.0 generation, comprising Ultra, Pro and Nano variants, was announced by Sundar Pichai and [Demis Hassabis](/wiki/demis_hassabis) on 6 December 2023; the models were positioned as natively multimodal across text, code, audio, image, and video.[^12]

Three days after launch, Vinyals publicly addressed criticism that Google's "Hands-on with Gemini" demonstration video had been heavily edited, posting on X that "all the user prompts and outputs in the video are real, shortened for brevity," and explaining that the clip was meant to inspire developers about what could be built on Gemini's API rather than to depict a real-time interaction.[^13] He has continued to communicate Gemini progress publicly via talks and the *Google DeepMind: The Podcast* series, including episodes on Gemini 2.0 and agentic AI.[^7]

## Frequent collaborators

Vinyals has a long pattern of co-authorship with a small group of senior deep learning researchers. On the seq2seq paper he worked with [Ilya Sutskever](/wiki/ilya_sutskever) and Quoc V. Le.[^3] On knowledge distillation he worked with [Geoffrey Hinton](/wiki/geoffrey_hinton) and [Jeff Dean](/wiki/jeff_dean).[^2] His Gemini co-leadership is shared with [Jeff Dean](/wiki/jeff_dean) and [Noam Shazeer](/wiki/noam_shazeer), with [Demis Hassabis](/wiki/demis_hassabis) as DeepMind chief.[^5][^7] On the image captioning side his co-authors included Samy Bengio, Alexander Toshev, and Dumitru Erhan.[^8] AlphaStar was a large team effort; on the corresponding *Nature* paper Vinyals shares authorship with Igor Babuschkin, David Silver and many others.[^4]

## Bibliometric profile

Vinyals's Google Scholar profile, verified with a Google email address, lists him at Google DeepMind, with an h-index of 114 (101 since 2021), an i10-index of 207, and total citations of 430,981 (327,923 since 2021) as of mid-2026.[^2] Among his most-cited works the profile lists, in descending order: TensorFlow (2015), AlphaFold (2021), Distilling the Knowledge in a Neural Network (2015), Sequence to Sequence Learning with Neural Networks (2014), Representation Learning with Contrastive Predictive Coding (2018), Gemini (2023), and Flamingo (2022).[^2]

## Public communication and outreach

Beyond publishing papers, Vinyals is active as a speaker and in conference organization. He served as program chair for ICLR in 2017 and 2018 and as area chair for [NeurIPS](/wiki/neurips) and ICML.[^1] He has been profiled in the IEEE Signal Processing Society's "Industry Leaders" series, which covers his UPC background and dissertation work on speech recognition.[^14] On the Google DeepMind podcast he has discussed the evolution of Gemini and agentic AI.[^7] He maintains an active presence on X under the handle @OriolVinyalsML, where he discusses model releases and pre-training versus post-training tradeoffs.[^13]

## Honors and recognition

- 2016: MIT Technology Review "Innovators Under 35".[^1]
- 2017 and 2018: Program chair, International Conference on Learning Representations (ICLR).[^1]
- 2019: Cover paper in *Nature* on AlphaStar's Grandmaster-level StarCraft II performance.[^4]
- 2025: Honorary doctorate, Universitat Politècnica de Catalunya, ceremony held 26 November 2025.[^5]

## Significance

Vinyals's career arc maps cleanly onto the deep learning era. The seq2seq paper helped end the dominance of phrase-based statistical translation and supplied the encoder-decoder template that, refined with attention by Bahdanau and others and then with the [Transformer](/wiki/transformer), underpins most modern language and multimodal systems.[^3] Show and Tell and Flamingo connect computer vision to language generation through the same encoder-decoder pattern, and the resulting line of vision-language work directly preceded today's [multimodal models](/wiki/multimodal_model).[^8][^10] Pointer Networks and Matching Networks broadened what neural sequence models could do, covering combinatorial outputs and few-shot classification.[^9][^11] [AlphaStar](/wiki/alphastar) was a landmark demonstration that multi-agent reinforcement learning could reach professional human level in a real-time strategy game with imperfect information.[^4] As co-technical lead of [Gemini](/wiki/gemini) he is now responsible for translating these research threads into a commercial multimodal model family.[^5][^7]

## Criticism

The most prominent public controversy around Vinyals concerns Google's "Hands-on with Gemini" promotional video accompanying the 6 December 2023 Gemini 1.0 launch, which critics argued misleadingly implied real-time conversational interaction with images and video.[^13] Vinyals personally addressed the criticism on X, conceding that the user prompts and outputs were "shortened for brevity" and that the video used still images plus text rather than the implied real-time video and voice interface; he framed the demo as inspiration for developers rather than a faithful recording.[^13] The episode prompted broader media discussion of whether the Gemini launch had been over-hyped.

## Related work

- [Gemini](/wiki/gemini) family of multimodal models, of which Vinyals is co-technical lead.[^5][^7]
- [AlphaStar](/wiki/alphastar), the StarCraft II project he led.[^4]
- [Knowledge distillation](/wiki/knowledge_distillation), for which he was a co-author on the canonical 2015 paper.[^2]
- [Ilya Sutskever](/wiki/ilya_sutskever) and [Jeff Dean](/wiki/jeff_dean), long-running collaborators.[^2][^3]
- [Google DeepMind](/wiki/google_deepmind), the post-merger research organization where he works.[^5]
- [Machine translation](/wiki/machine_translation), the application that motivated the seq2seq paper.[^3]

## See also

- [Gemini](/wiki/gemini)
- [Google DeepMind](/wiki/google_deepmind)
- [AlphaStar](/wiki/alphastar)
- [AlphaFold](/wiki/alphafold)
- [AlphaCode](/wiki/alphacode)
- [Demis Hassabis](/wiki/demis_hassabis)
- [Jeff Dean](/wiki/jeff_dean)
- [Ilya Sutskever](/wiki/ilya_sutskever)
- [Noam Shazeer](/wiki/noam_shazeer)
- [Geoffrey Hinton](/wiki/geoffrey_hinton)
- [Knowledge distillation](/wiki/knowledge_distillation)
- [LSTM](/wiki/lstm)
- [Transformer](/wiki/transformer)
- [Vision language model](/wiki/vision_language_model)
- [Multimodal model](/wiki/multimodal_model)
- [Bahdanau attention](/wiki/bahdanau_attention)
- [Chinchilla](/wiki/chinchilla)
- [University of California, Berkeley](/wiki/uc_berkeley)
- [Google Brain](/wiki/google_brain)
- [NeurIPS](/wiki/neurips)
- [Machine translation](/wiki/machine_translation)

## References

[^1]: Google Research, "Oriol Vinyals", Google Research staff page, 2024. https://research.google/people/oriolvinyals/. Accessed 2026-05-21.
[^2]: Google Scholar, "Oriol Vinyals", citations profile (Google DeepMind), 2026. https://scholar.google.com/citations?user=NkzyCvUAAAAJ&hl=en. Accessed 2026-05-21.
[^3]: Ilya Sutskever, Oriol Vinyals, Quoc V. Le, "Sequence to Sequence Learning with Neural Networks", arXiv:1409.3215 (NeurIPS 2014), 2014-09-10. https://arxiv.org/abs/1409.3215. Accessed 2026-05-21.
[^4]: Oriol Vinyals, Igor Babuschkin, Wojciech M. Czarnecki et al., "Grandmaster level in StarCraft II using multi-agent reinforcement learning", *Nature* 575 (7782): 350-354, 2019-10-30. https://www.nature.com/articles/s41586-019-1724-z. Accessed 2026-05-21.
[^5]: Universitat Politècnica de Catalunya, "The UPC will confer an honorary doctoral degree on mathematician and telecommunications engineer Oriol Vinyals", UPC press room, 2025. https://www.upc.edu/en/press-room/news/the-upc-will-confer-an-honorary-doctoral-degree-on-mathematician-and-telecommunications-engineer-oriol-vinyals. Accessed 2026-05-21.
[^6]: Oriol Vinyals, "Beyond Deep Learning: Scalable Methods and Models for Learning", PhD dissertation, EECS Department, UC Berkeley, Technical Report EECS-2013-202, 2013-12-12. https://www2.eecs.berkeley.edu/Pubs/TechRpts/2013/EECS-2013-202.html. Accessed 2026-05-21.
[^7]: Google DeepMind, "Google DeepMind: The Podcast, Gemini 2.0 and the evolution of agentic AI with Oriol Vinyals", deepmind.google podcast, 2024. https://deepmind.google/discover/the-podcast/gemini-20-and-the-evolution-of-agentic-ai-with-oriol-vinyals/. Accessed 2026-05-21.
[^8]: Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan, "Show and Tell: A Neural Image Caption Generator", arXiv:1411.4555 (CVPR 2015), 2014-11-17. https://arxiv.org/abs/1411.4555. Accessed 2026-05-21.
[^9]: Oriol Vinyals, Meire Fortunato, Navdeep Jaitly, "Pointer Networks", arXiv:1506.03134 (NeurIPS 2015), 2015-06-09. https://arxiv.org/abs/1506.03134. Accessed 2026-05-21.
[^10]: Jean-Baptiste Alayrac, Jeff Donahue, Pauline Luc et al., "Flamingo: a Visual Language Model for Few-Shot Learning", arXiv:2204.14198 (NeurIPS 2022), 2022-04-29. https://arxiv.org/abs/2204.14198. Accessed 2026-05-21.
[^11]: Oriol Vinyals, Charles Blundell, Timothy Lillicrap, Koray Kavukcuoglu, Daan Wierstra, "Matching Networks for One Shot Learning", arXiv:1606.04080 (NeurIPS 2016), 2016-06-13. https://arxiv.org/abs/1606.04080. Accessed 2026-05-21.
[^12]: Will Douglas Heaven, "Google DeepMind's new Gemini model looks amazing, but could signal peak AI hype", MIT Technology Review, 2023-12-06. https://www.technologyreview.com/2023/12/06/1084471/google-deepminds-new-gemini-model-looks-amazing-but-could-signal-peak-ai-hype/. Accessed 2026-05-21.
[^13]: Maximilian Schreiner, "Gemini co-lead Oriol Vinyals addresses criticism of Google DeepMind's staged multimodal demo", The Decoder, 2023-12-09. https://the-decoder.com/gemini-co-lead-oriol-vinyals-addresses-criticism-of-google-deepminds-staged-multimodal-demo/. Accessed 2026-05-21.
[^14]: IEEE Signal Processing Society, "Industry Leaders in Signal Processing and Machine Learning: Dr. Oriol Vinyals", SPS Newsletter, 2022-02. https://signalprocessingsociety.org/newsletter/2022/02/industry-leaders-signal-processing-and-machine-learning-dr-oriol-vinyals. Accessed 2026-05-21.

