# Douwe Kiela

> Source: https://aiwiki.ai/wiki/douwe_kiela
> Updated: 2026-06-08
> Categories: AI Companies, People
> License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
> From AI Wiki (https://aiwiki.ai), the free encyclopedia of artificial intelligence. Reuse freely with attribution to "AI Wiki (aiwiki.ai)".

Douwe Kiela (born 1986) is a Dutch-American computer scientist and entrepreneur best known for his work on [retrieval-augmented generation](/wiki/retrieval-augmented_generation) (RAG) and on multimodal [natural language processing](/wiki/natural_language_processing). He co-founded [Contextual AI](/wiki/contextual_ai) in 2023 and served as its chief executive officer until May 2026, when he joined [Google DeepMind](/wiki/google_deepmind) under a technology-licensing and hiring agreement between the two companies. Earlier he was a research lead at Facebook AI Research, the laboratory later known as [Meta AI](/wiki/meta_ai), where in 2020 he was the senior author of the paper that introduced RAG. He subsequently served as head of research at [Hugging Face](/wiki/hugging_face), and he is also an adjunct professor in the Symbolic Systems program at [Stanford University](/wiki/stanford_university).[1][2]

## Early life and education

Kiela was born in 1986 in Amsterdam, the Netherlands.[1] He studied at Utrecht University, earning a Bachelor of Science in Liberal Arts and Sciences with a double major in cognitive artificial intelligence and philosophy. He then completed a Master of Science in logic, awarded cum laude, at the University of Amsterdam's Institute for Logic, Language and Computation.[1][2]

He went on to earn an MPhil and a PhD in computer science from the [University of Cambridge](/wiki/university_of_cambridge), where his doctoral research focused on multimodal semantics: grounding the meaning of words in perceptual information such as images and sound rather than in text alone. During this period he published work on grounding semantics in visual, auditory and even olfactory perception, often in collaboration with the Cambridge computational linguist Stephen Clark.[2][3] This early interest in connecting language models to information beyond text recurred throughout his later research and commercial work.

## Research at Facebook AI Research

Kiela joined Facebook AI Research (FAIR) in 2016, initially as a postdoctoral researcher and later as a research scientist and research lead based in New York.[1][2] His work spanned multimodal learning, sentence and word representation, dialogue, and the evaluation of language systems. He was a co-author of "Supervised Learning of Universal Sentence Representations from Natural Language Inference Data" (2017), which introduced the widely used InferSent sentence-embedding method, and of "Personalizing Dialogue Agents" (2018), the paper behind the PersonaChat dataset for persona-grounded conversation.[3]

### Retrieval-augmented generation

Kiela's most influential contribution came in 2020, when he supervised the FAIR team that developed retrieval-augmented generation. In the paper "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks," first posted in May 2020 and published at NeurIPS 2020, the authors combined a pretrained sequence-to-sequence generator with a neural retriever that pulls relevant passages from an external document collection at inference time.[4] By separating a model's parametric memory (the knowledge stored in its weights) from a non-parametric knowledge store (the document collection), RAG let systems draw on up-to-date sources, reduce factual errors and update their knowledge without retraining. Patrick Lewis was the first author and Kiela was the senior, last-listed author who led the project; the co-authors included Ethan Perez, Aleksandra Piktus and Sebastian Riedel.[4] RAG became one of the most cited ideas in modern NLP and a standard building block for enterprise AI systems, and it was central to the company Kiela later founded.
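The retrieve-then-generate pattern can be illustrated with a minimal sketch. This is not the paper's implementation: a toy word-overlap retriever stands in for the neural retriever, and the generator is reduced to a prompt-building step. All names (`DOCS`, `retrieve`, `build_prompt`) and the sample sentences are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy document store; a real system would index a large corpus.
DOCS = [
    "Retrieval-augmented generation pairs a retriever with a generator.",
    "InferSent learns sentence embeddings from natural language inference data.",
    "Dynabench collects adversarial examples with humans in the loop.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query (toy retriever)."""
    q = Counter(tokenize(query))
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]


def build_prompt(query, passages):
    """Assemble the text a generator would condition on (generator itself omitted)."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


question = "What does Dynabench collect?"
top = retrieve(question, DOCS, k=1)
prompt = build_prompt(question, top)

# Updating knowledge without retraining: extend the document store only.
updated_docs = DOCS + ["Contextual AI was co-founded in 2023."]
fresh = retrieve("When was Contextual AI founded?", updated_docs, k=1)
```

The last two lines show the property emphasized in the paper's framing: new information becomes available by adding a document to the store, while the retriever and generator code stay unchanged.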

### Multimodal benchmarks and Dynabench

Kiela also led work on evaluation. He was the first author of "The Hateful Memes Challenge" (NeurIPS 2020), which released a dataset and competition for detecting hateful multimodal memes. The dataset was deliberately constructed with "benign confounders," examples in which the text or image alone looks harmless, so that models relying on a single modality would fail and genuine multimodal reasoning was required.[5] In the same period he led the creation of Dynabench, a platform for dynamic, human-in-the-loop adversarial benchmarking in which annotators try to fool current models, producing progressively harder test sets. The accompanying paper, "Dynabench: Rethinking Benchmarking in NLP," appeared at NAACL 2021 with Kiela as first author.[6]
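The effect of benign confounders can be sketched with a toy dataset. The example below is an illustration under stated assumptions, not the construction used in the challenge: the placeholder values (`text_a`, `image_x`) and the helper `best_accuracy` are hypothetical. Each text and each image appears once with a hateful label and once with a benign label, so the label depends only on the combination.

```python
from collections import Counter, defaultdict

# Hypothetical toy memes: (text, image, is_hateful).
# Swapping either the text or the image flips the label.
MEMES = [
    ("text_a", "image_x", True),
    ("text_a", "image_y", False),  # same text, benign image confounder
    ("text_b", "image_x", False),  # same image, benign text confounder
    ("text_b", "image_y", True),
]


def best_accuracy(examples, feature):
    """Accuracy of the best classifier that sees only feature(example).

    For each observed feature value the best classifier predicts the
    majority label, so we count the majority label per value.
    """
    labels_by_value = defaultdict(Counter)
    for example in examples:
        labels_by_value[feature(example)][example[2]] += 1
    correct = sum(max(counts.values()) for counts in labels_by_value.values())
    return correct / len(examples)


text_only = best_accuracy(MEMES, lambda ex: ex[0])
image_only = best_accuracy(MEMES, lambda ex: ex[1])
multimodal = best_accuracy(MEMES, lambda ex: (ex[0], ex[1]))

print(text_only, image_only, multimodal)  # 0.5 0.5 1.0
```

On this toy data no single-modality classifier can beat chance, while a classifier that sees both modalities can be perfect, which is the behavior the confounders are meant to enforce.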

## Hugging Face and Stanford

After leaving FAIR, Kiela became head of research at Hugging Face, the open-source machine-learning company.[1][2] There he contributed to BLOOM, the 176-billion-parameter open-access multilingual [large language model](/wiki/large_language_model) released in 2022 by the BigScience collaboration.[3] Alongside his industry roles he holds an appointment as an adjunct professor in Stanford University's Symbolic Systems program, where he has lectured on augmented language models and benchmarking and has contributed to teaching the program's natural language understanding course (CS224U).[2][7]

## Contextual AI

In 2023 Kiela left Hugging Face to co-found Contextual AI with Amanpreet Singh, a fellow researcher he had worked with at both Meta and Hugging Face.[1][8] The company emerged from stealth on June 7, 2023, with a 20-million-dollar seed round led by Bain Capital Ventures, with participation from Lightspeed, Greycroft, SV Angel and angel investors.[8] Its stated mission was to build language models purpose-built for enterprises, addressing problems such as hallucination, stale knowledge and data privacy that make general-purpose chatbots hard to deploy on sensitive corporate data.

Contextual AI positioned its technology as "RAG 2.0," an end-to-end approach in which the retriever and generator are trained together as a single grounded system rather than assembled from separate off-the-shelf components.[9] In August 2024 the company raised an 80-million-dollar Series A led by Greycroft, at a post-money valuation that PitchBook estimated at about 609 million dollars. New investors included Jeff Bezos's Bezos Expeditions, [Nvidia](/wiki/nvidia)'s NVentures, HSBC Ventures and Snowflake Ventures; the Bezos investment led the press to describe the firm as "Bezos-backed."[10] In January 2025 Contextual AI made its platform generally available, letting enterprises build specialized RAG agents, and it released a "grounded language model" that the company said reached 88 percent on the FACTS factuality benchmark, ahead of comparable systems from Google, [Anthropic](/wiki/anthropic) and [OpenAI](/wiki/openai).[9] Customers cited by the company included the chipmaker [Qualcomm](/wiki/qualcomm) and other large enterprises.

## Google DeepMind

In May 2026 Google DeepMind reached an agreement with Contextual AI under which it hired more than 20 of the startup's researchers, including Kiela, and licensed its technology on a non-exclusive basis. Bloomberg, which reported the deal on May 19, 2026, put its value at roughly 80 to 100 million dollars. Kiela joined DeepMind as a research scientist director, and Jay Chen became interim chief executive of Contextual AI, which remained an independent company.[11] Commentators described the arrangement as another example of the "acqui-hire" structure used by large technology firms, which brings in talent and technology licenses while avoiding a formal acquisition that would draw antitrust scrutiny; Google's earlier deals with Character.AI and Windsurf followed a similar pattern.[11]

## Recognition

Kiela is among the most cited researchers in natural language processing. As of 2026 his Google Scholar profile listed more than 55,000 citations, an h-index of 71 and an i10-index of 119. The most cited entry was the RAG paper, which alone had been cited more than 22,000 times.[3] He is a frequent speaker and podcast guest on retrieval, grounding and enterprise AI, and his career is often cited as an example of a research idea from an industrial lab being carried directly into a startup and then back into a major laboratory.
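For readers unfamiliar with the two metrics, the sketch below shows how they are defined: the h-index is the largest h such that h papers each have at least h citations, and the i10-index counts papers with at least 10 citations. The citation counts used here are made-up illustrative numbers, not Kiela's data, and the function names are hypothetical.

```python
def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations):
    """Number of papers with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


# Made-up per-paper citation counts for illustration only.
sample_counts = [120, 45, 12, 10, 9, 3, 0]

print(h_index(sample_counts))    # 5
print(i10_index(sample_counts))  # 4
```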

## Selected publications

Publication details are drawn from Google Scholar.[3]

| Year | Publication | Venue | Known for |
| --- | --- | --- | --- |
| 2017 | Supervised Learning of Universal Sentence Representations (InferSent) | EMNLP | General-purpose sentence embeddings from NLI data |
| 2018 | Personalizing Dialogue Agents (PersonaChat) | ACL | Persona-grounded conversational dataset |
| 2020 | Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks | NeurIPS | Introduced RAG |
| 2020 | The Hateful Memes Challenge | NeurIPS | Multimodal hate-speech detection benchmark |
| 2021 | Dynabench: Rethinking Benchmarking in NLP | NAACL | Dynamic adversarial benchmarking |
| 2022 | BLOOM | Preprint | Open-access 176B multilingual LLM |

## References

1. Wikipedia, "Douwe Kiela" (accessed 2026).
2. Douwe Kiela, personal website, douwekiela.github.io (accessed 2026).
3. Google Scholar, author profile: Douwe Kiela (accessed 2026).
4. Patrick Lewis, Ethan Perez, Aleksandra Piktus, et al. (Douwe Kiela, senior author), "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks," NeurIPS 2020 (arXiv:2005.11401).
5. Douwe Kiela, Hamed Firooz, Aravind Mohan, et al., "The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes," NeurIPS 2020 (arXiv:2005.04790).
6. Douwe Kiela, et al., "Dynabench: Rethinking Benchmarking in NLP," NAACL 2021.
7. Stanford University, Symbolic Systems Program, "SSP Forum: Douwe Kiela on augmented language models"; Stanford CS224U Natural Language Understanding course materials.
8. "Contextual AI Emerges From Stealth to Build the Next Generation of Language Models, for the Enterprise," BusinessWire, June 7, 2023; Contextual AI, "Announcing our $20m seed round."
9. Contextual AI, "Introducing RAG 2.0"; SiliconANGLE, "Contextual AI launches RAG 2.0 platform to aid in the development of domain-specific AI agents," January 15, 2025; VentureBeat coverage of the grounded language model and FACTS benchmark.
10. "Contextual AI Raises $80 Million for Model-Enhancing Technique," Reuters, August 1, 2024; Greycroft, "Expanding our investment in Contextual AI"; PitchBook valuation estimate.
11. "Google Hires Staff From Bezos-Backed Contextual AI in Licensing Deal," Bloomberg, May 19, 2026.