Fei-Fei Li (born 1976) is a Chinese-American computer scientist, the inaugural Sequoia Professor of Computer Science at Stanford University, and co-director of the Stanford Institute for Human-Centered Artificial Intelligence (HAI). She is best known for creating ImageNet, the large-scale visual database that catalyzed the deep learning revolution in computer vision during the 2010s. Often referred to as the "Godmother of AI," Li has shaped the trajectory of artificial intelligence through her research, institution-building, public advocacy, and entrepreneurial ventures. In 2024, she co-founded World Labs, a startup focused on spatial intelligence that reached a valuation of roughly $5 billion by early 2026.
Fei-Fei Li was born in Beijing, China, in 1976 and spent much of her childhood in Chengdu, the capital of Sichuan province [1]. When she was twelve, her father emigrated to Parsippany, New Jersey, to seek better opportunities for the family. Four years later, in 1992, sixteen-year-old Li and her mother followed him to the United States, arriving with limited English and very little money [2].
The transition was difficult. Li worked in restaurants and at her parents' dry-cleaning business after school to help the family stay afloat, all while learning English and keeping up with her coursework at Parsippany High School [2]. Despite these challenges, she excelled academically: her teachers recognized her aptitude in mathematics and science, and she graduated in 1995. Her immigrant experience would later become a central theme in her memoir and in her advocacy for making AI more inclusive.
Li earned a Bachelor of Arts in physics from Princeton University in 1999 [3]. At Princeton, she developed an interest in the computational aspects of perception and cognition, which would guide the rest of her career. She received a Paul and Daisy Soros Fellowship for New Americans, a scholarship supporting immigrants and children of immigrants pursuing graduate education in the United States [4].
In 2000, Li began doctoral studies at the California Institute of Technology (Caltech), working at the intersection of computer science, electrical engineering, and cognitive neuroscience. Her PhD research explored how the human visual system processes and categorizes scenes, bridging computational models with biological insights. She received her PhD in electrical engineering from Caltech in 2005 [1].
| Detail | Information |
|---|---|
| Full Name | Fei-Fei Li |
| Born | 1976, Beijing, China |
| Nationality | Chinese-American |
| Undergraduate | Princeton University (B.A. Physics, 1999) |
| Doctorate | California Institute of Technology (Ph.D. Electrical Engineering, 2005) |
| Institution | Stanford University |
| Known For | ImageNet, Stanford HAI, World Labs |
| Notable Role | Google Cloud Chief Scientist of AI/ML (2017-2018) |
| Google Scholar Citations | 340,000+ |
| Publications | 400+ scientific articles |
After completing her PhD, Li held positions at the University of Illinois at Urbana-Champaign and Princeton University before joining Stanford University in 2009 as an assistant professor [1]. She rose through the ranks to become the Sequoia Professor of Computer Science, one of the most prestigious endowed chairs in the department.
At Stanford, Li directs the Stanford Vision and Learning Lab (SVL), which has produced influential research in image recognition, object detection, visual reasoning, and embodied AI. She also served as the director of the Stanford Artificial Intelligence Lab (SAIL) from 2013 to 2018, one of the oldest and most respected AI research labs in the world [5].
The Stanford Vision and Learning Lab, under Li's direction, has been one of the most productive computer vision research groups in the world. The lab has contributed to a wide range of research areas:
| Research Area | Key Contributions | Notable Projects |
|---|---|---|
| Image classification | Foundational work on large-scale visual recognition | ImageNet, visual categorization models |
| Object detection | Real-time object localization and identification | Faster detection architectures |
| Image captioning | Generating natural language descriptions of images | DenseCap, Visual Genome |
| Visual question answering | Answering questions about image content | VQA models and benchmarks |
| Video understanding | Temporal reasoning in video sequences | Activity recognition, action detection |
| Embodied AI | Intelligent agents in simulated environments | BEHAVIOR benchmark |
| 3D scene understanding | Spatial reasoning and scene reconstruction | 3D scene graphs |
The lab's work on the Visual Genome dataset, published in 2017, created a structured knowledge base connecting objects, attributes, and relationships within images. This dataset has been widely used for training models that go beyond simple classification to reason about the relationships between objects in a scene [5].
In 2019, Li co-founded the Stanford Institute for Human-Centered Artificial Intelligence (HAI) alongside philosopher John Etchemendy. The institute promotes a vision of AI development that places human well-being at the center of technological progress. HAI conducts interdisciplinary research, publishes an annual AI Index report tracking global AI trends, and engages with policymakers on AI governance [5]. Li serves as a founding co-director and has used the platform to advocate for thoughtful regulation and broad participation in shaping AI's future.
The HAI AI Index report has become one of the most widely cited sources of data on AI trends, tracking metrics such as research publication volume, AI investment, technical performance benchmarks, and policy developments across countries.
Li's most transformative contribution to the field is ImageNet, a massive visual database she conceived in 2006 and formally launched in 2009. The project arose from her conviction that the field of computer vision was being held back not by insufficient algorithms but by a lack of high-quality, large-scale training data [6].
In the mid-2000s, the standard benchmark datasets in computer vision contained only a few thousand images spread across a handful of categories. Li believed this was fundamentally inadequate. She drew inspiration from cognitive science research showing that children learn to recognize objects by being exposed to millions of examples in varied contexts over the course of development. If human visual intelligence required massive exposure to visual data, she reasoned, machines would need the same [6].
At the time, this was a contrarian view. Most computer vision researchers focused on developing better algorithms and evaluated them on small datasets like Caltech-101 (which contained about 9,000 images across 101 categories). The idea that simply providing more data would drive progress was met with skepticism from many in the field.
ImageNet organized over 14 million images into more than 20,000 categories based on the WordNet lexical hierarchy. The scale of the labeling task was staggering, and Li's team developed an innovative approach using Amazon Mechanical Turk (MTurk) to crowdsource annotations.
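The hierarchical organization borrowed from WordNet can be illustrated with a minimal sketch. The tiny hypernym table below is hypothetical, standing in for the full WordNet database that ImageNet actually uses; only the idea of walking a category up to its root is the point.

```python
# Tiny illustrative fragment of a WordNet-style hypernym ("is-a") hierarchy.
# Real ImageNet synsets come from the full WordNet lexical database; these
# entries are a hand-picked toy example, not actual synset identifiers.
HYPERNYM = {
    "Siberian husky": "working dog",
    "working dog": "dog",
    "dog": "canine",
    "canine": "carnivore",
    "carnivore": "mammal",
    "mammal": "animal",
}

def hypernym_path(synset):
    """Walk up the hierarchy from a category to its most general ancestor."""
    path = [synset]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path

# A leaf category like "Siberian husky" inherits its place in the tree
# from increasingly general parents, ending at a root such as "animal".
print(hypernym_path("Siberian husky"))
```

In ImageNet, each of the 20,000+ categories is a WordNet synset positioned in exactly this kind of tree, so images labeled at a leaf are implicitly also examples of every ancestor category.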
| ImageNet Construction Detail | Specification |
|---|---|
| Total images | 14+ million |
| Categories (synsets) | 20,000+ |
| Annotation method | Amazon Mechanical Turk crowdsourcing |
| Workers involved | ~49,000 from 167 countries |
| Candidate images filtered | 160+ million |
| Labels per image | 3 (for quality assurance) |
| Ontology source | WordNet lexical database |
| Time to initial launch | ~2.5 years (2006-2009) |
The crowdsourcing process was carefully designed to ensure quality. Workers ("Turkers") were shown candidate images and asked to decide whether each image represented a given word from the WordNet ontology. Multiple workers labeled each image, and statistical methods were used to resolve disagreements and filter out low-quality annotations. Li's team implemented several quality control measures, including gold-standard images with known correct labels that were interspersed among the labeling tasks to identify unreliable workers [6].
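The quality-control scheme described above, with multiple labels per image, majority voting, and gold-standard trap images, can be sketched in a few lines. All names, data structures, and thresholds here are illustrative assumptions, not details of the actual ImageNet pipeline.

```python
from collections import Counter

# Hypothetical sketch of crowdsourced label aggregation: several workers
# judge each candidate image, gold-standard images with known answers flag
# unreliable workers, and a majority vote decides the final label.
GOLD = {"img_gold_1": "yes", "img_gold_2": "no"}  # known correct answers

def reliable_workers(votes, min_accuracy=0.75):
    """Keep workers whose answers on gold-standard images are accurate enough."""
    stats = {}  # worker -> (correct, total) on gold images
    for (worker, image), answer in votes.items():
        if image in GOLD:
            correct, total = stats.get(worker, (0, 0))
            stats[worker] = (correct + (answer == GOLD[image]), total + 1)
    return {w for w, (c, t) in stats.items() if t and c / t >= min_accuracy}

def aggregate(votes, min_votes=3):
    """Majority-vote a label for each non-gold image, using trusted workers only."""
    trusted = reliable_workers(votes)
    per_image = {}
    for (worker, image), answer in votes.items():
        if image not in GOLD and worker in trusted:
            per_image.setdefault(image, []).append(answer)
    labels = {}
    for image, answers in per_image.items():
        if len(answers) >= min_votes:  # require enough independent judgments
            winner, count = Counter(answers).most_common(1)[0]
            if count > len(answers) / 2:  # strict majority among trusted votes
                labels[image] = winner
    return labels
```

Here `votes` maps a `(worker, image)` pair to that worker's yes/no judgment of whether the image depicts the target synset; a worker who fails the interspersed gold-standard checks is simply excluded from the vote.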
The effort was enormous. Between July 2008 (when the project had zero labeled images) and December 2008, the team categorized three million images across more than 6,000 synsets. By the time of the formal launch in 2009, ImageNet contained 3.2 million labeled images. The dataset continued to grow in subsequent years, eventually reaching over 14 million images [6].
The budget for the project was modest by academic standards, with much of the annotation cost kept low through the Mechanical Turk platform. Li later described the decision to use crowdsourcing as born partly of necessity: traditional methods of labeling by graduate students would have taken decades.
Starting in 2010, Li organized the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), an annual competition that tasked researchers with building the best image classifiers. The challenge became a defining benchmark in AI. In 2012, a team led by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton entered AlexNet, a convolutional neural network that slashed the error rate nearly in half compared to previous methods [7]. This result is widely considered the moment that deep learning proved its superiority for visual tasks, triggering a wave of investment and research that transformed the entire AI field.
The ripple effects of ImageNet extended far beyond image classification. The dataset and its associated challenge demonstrated the power of combining large datasets with deep neural networks, a paradigm that now underlies advances in natural language processing, speech recognition, robotics, and many other domains. Li's insistence on scale and data quality proved prescient, and ImageNet remains one of the most cited datasets in AI research.
| Year | ImageNet Milestone |
|---|---|
| 2006 | Fei-Fei Li conceives the ImageNet project |
| 2009 | ImageNet formally launched with 3.2 million labeled images |
| 2010 | First ILSVRC competition held |
| 2012 | AlexNet wins ILSVRC, sparking the deep learning revolution |
| 2014 | GoogLeNet and VGGNet push top-5 error rates to roughly 7% |
| 2015 | ResNet achieves superhuman performance (3.57% top-5 error rate) |
| 2017 | Final ILSVRC competition held; task considered largely solved |
During a sabbatical from Stanford, Li served as Vice President at Google and Chief Scientist of AI/ML at Google Cloud from January 2017 to September 2018 [8]. In this role, she led efforts to democratize AI tools for enterprise customers, making machine learning capabilities more accessible to businesses without deep technical expertise. She oversaw the development of Google Cloud's AutoML products, which allow users to train custom machine learning models with minimal coding.
Her tenure at Google also highlighted the tensions that arise when academic ideals meet corporate imperatives. Li was involved in internal discussions about Google's AI ethics and public responsibility, experiences that reinforced her commitment to human-centered approaches to AI development [1].
Beyond ImageNet, Li's research portfolio spans several major areas of AI.
Li is one of the most cited computer scientists of her generation, with over 340,000 citations on Google Scholar and more than 400 published scientific articles. Her most influential publications include:
| Publication | Year | Key Contribution | Citations |
|---|---|---|---|
| "ImageNet: A Large-Scale Hierarchical Image Database" (with J. Deng et al.) | 2009 | Introduced the ImageNet dataset and benchmark | 40,000+ |
| "One-Shot Learning of Object Categories" (with R. Fergus, P. Perona) | 2006 | Pioneered learning from very few examples | 4,000+ |
| "ImageNet Large Scale Visual Recognition Challenge" (with O. Russakovsky et al.) | 2015 | Defined the ILSVRC benchmark and surveyed progress | 20,000+ |
| "Visualizing and Understanding Recurrent Networks" (with A. Karpathy) | 2015 | Analyzed how recurrent networks process sequential data | 3,000+ |
| "DenseCap: Fully Convolutional Localization Networks for Dense Captioning" (with J. Johnson et al.) | 2016 | Dense image captioning at region level | 1,500+ |
| "Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations" (with R. Krishna et al.) | 2017 | Structured visual knowledge base | 5,000+ |
Li's lab has produced foundational work on image classification, object detection, image captioning, and visual question answering. Her group contributed to the development of models that can describe the content of an image in natural language, bridging computer vision and natural language processing [5].
More recently, Li's research has expanded into embodied AI, the idea that intelligent systems must interact with the physical world to develop genuine understanding. Her Stanford lab developed the BEHAVIOR benchmark, which evaluates AI agents' ability to plan and execute 1,000 everyday household activities in simulated environments [9]. The People, AI & Robots (PAIR) group within her lab focuses on generalizable robot perception and control.
Li has also pursued applications of computer vision in healthcare through the Partnership in AI-Assisted Care (PAC), a collaboration between Stanford's School of Medicine and Computer Science department. This work uses computer vision and machine learning to monitor patients in clinical settings, detect falls, and support clinicians with real-time information. The system uses ambient sensors rather than wearable devices, preserving patient dignity while providing continuous monitoring [5].
Beginning in 2024, Li articulated a new research direction she calls "spatial intelligence," which concerns AI systems that can perceive, generate, reason within, and interact with three-dimensional environments. This concept underpins her startup World Labs and represents what she sees as the next major frontier after the breakthroughs in language and 2D vision [10].
Li has described spatial intelligence as the bridge between the 2D understanding that current AI systems excel at and the 3D physical world that humans navigate effortlessly. In her framing, language models understand words, vision models understand images, but spatial intelligence enables understanding of environments, volumes, physics, and embodied interaction.
In 2024, Li co-founded World Labs alongside Justin Johnson, Christoph Lassner, and Ben Mildenhall [11]. The company's mission is to build AI systems with spatial intelligence: the capacity to understand, generate, and interact with 3D worlds.
World Labs develops what it calls Large World Models (LWMs), a new class of AI systems that go beyond the text-based large language models (LLMs) that have dominated recent AI progress. While LLMs process and generate language, LWMs are designed to perceive and reason about three-dimensional space, enabling applications in architecture, urban planning, gaming, film production, and scientific simulation [11].
In November 2025, World Labs launched its first commercial product, Marble, which generates 3D virtual worlds from image or text prompts. Marble allows users to create explorable three-dimensional scenes from a single photograph or a text description, representing a practical application of the spatial intelligence research Li has championed [12].
The company attracted significant investor interest. In February 2026, World Labs raised $1 billion in a new funding round, signaling a major shift in venture capital attention from text-based AI to spatial and 3D AI systems. Reports from January 2026 placed the company's valuation at approximately $5 billion [12].
| Detail | World Labs |
|---|---|
| Founded | 2024 |
| Co-founders | Fei-Fei Li, Justin Johnson, Christoph Lassner, Ben Mildenhall |
| Focus | Spatial intelligence, Large World Models |
| First product | Marble (launched November 2025) |
| Funding (Feb 2026) | $1 billion raised |
| Estimated Valuation | ~$5 billion (Jan 2026) |
| Key investors | Andreessen Horowitz, Radical Ventures, and others |
In November 2023, Li published her memoir, The Worlds I See: Curiosity, Exploration, and Discovery at the Dawn of AI, with Flatiron Books, a Macmillan imprint [13]. The book traces her journey from a teenage immigrant struggling to learn English in New Jersey to one of the most influential figures in artificial intelligence. It recounts the creation of ImageNet, her time at Google, and her growing conviction that AI must be developed with humanity's broader interests in mind.
The memoir received praise for its candid portrayal of the personal sacrifices behind scientific achievement and for making complex technical concepts accessible to a general audience. NPR, Kirkus Reviews, and other outlets highlighted the book's thoughtful exploration of AI ethics and the immigrant experience in American science [14].
Throughout her career, Li has been a vocal proponent of diversity and inclusion in technology. In 2017, she co-founded AI4ALL, a nonprofit organization dedicated to increasing diversity in the AI field, particularly among women, underrepresented minorities, and students from low-income backgrounds [15]. AI4ALL runs summer programs, supports AI education in underserved communities, and works to ensure that the people building AI systems reflect the diversity of the people who will be affected by them.
Li has spoken publicly about how her own experience as an immigrant woman in a male-dominated field shaped her understanding of the biases that can creep into technology when its creators lack diverse perspectives. She has testified before the U.S. Congress on the importance of inclusive AI development and has engaged with policymakers around the world [1].
Li has received extensive recognition for her contributions to computer science and AI. The following table lists selected major awards:
| Year | Award or Honor | Awarding Organization |
|---|---|---|
| 2006 | Microsoft Research New Faculty Fellowship | Microsoft |
| 2009 | NSF CAREER Award | National Science Foundation |
| 2011 | Sloan Research Fellowship | Alfred P. Sloan Foundation |
| 2016 | IAPR J.K. Aggarwal Prize | International Association for Pattern Recognition |
| 2016 | IEEE PAMI Mark Everingham Award | IEEE |
| 2018 | ACM Fellow | Association for Computing Machinery |
| 2019 | IEEE PAMI Longuet-Higgins Prize | IEEE |
| 2019 | National Geographic Society Further Award | National Geographic |
| 2020 | Elected to National Academy of Engineering | NAE |
| 2020 | Elected to National Academy of Medicine | NAM |
| 2021 | Elected to American Academy of Arts and Sciences | AAAS |
| 2022 | IEEE PAMI Thomas Huang Memorial Prize | IEEE |
| 2023 | Intel Lifetime Achievement Award | Intel |
| 2024 | VinFuture Grand Prize | VinFuture Foundation |
| 2025 | Queen Elizabeth Prize for Engineering | QEPrize Foundation |
| 2025 | Time "Architects of AI" (Person of the Year) | Time Magazine |
| 2025 | Yale Honorary Degree (Doctor of Engineering & Technology) | Yale University |
The Queen Elizabeth Prize for Engineering, awarded in 2025, recognized Li alongside six other innovators for "groundbreaking contributions to modern machine learning." The seven laureates share the £500,000 prize. Li was presented the award by King Charles III at a ceremony where her contributions through ImageNet were specifically cited as instrumental to the deep learning revolution [16].
As of early 2026, Li continues to hold her position as Sequoia Professor at Stanford and co-director of HAI, though she has been on partial academic leave since January 2024 to focus on World Labs [1]. Her current work is centered on advancing spatial intelligence, both through World Labs' commercial products and through her academic research at Stanford.
At Stanford, her lab continues to work on embodied AI, healthcare applications, and the BEHAVIOR benchmark for evaluating AI agents in simulated environments. Through HAI, she remains engaged in AI policy discussions, including questions about AI governance, safety, and the societal impact of increasingly capable AI systems.
World Labs, following its $1 billion funding round in February 2026, is scaling its team and developing commercial applications of Large World Models. Li has described spatial intelligence as the necessary next step for AI to move from understanding text and flat images to comprehending and interacting with the physical world as humans do [10].
Li is married to Silvio Savarese, a computer scientist who also works on computer vision and AI. They have children together. Li has spoken about the challenges of balancing a demanding academic and entrepreneurial career with family life, a theme she explores in her memoir [13].