Fei-Fei Li (born 1976) is a Chinese-American computer scientist, the inaugural Sequoia Professor of Computer Science at Stanford University, and co-director of the Stanford Institute for Human-Centered Artificial Intelligence (HAI). She is best known for creating ImageNet, the large-scale visual database that catalyzed the deep learning revolution in computer vision during the 2010s. Often referred to as the "Godmother of AI," Li has shaped the trajectory of artificial intelligence through her research, institution-building, public advocacy, and entrepreneurial ventures. In 2024, she co-founded World Labs, a startup focused on spatial intelligence that reached a valuation of roughly $5 billion by early 2026.
Fei-Fei Li was born in Beijing, China, in 1976 and spent much of her childhood in Chengdu, the capital of Sichuan province [1]. When she was twelve, her father emigrated to Parsippany, New Jersey, to seek better opportunities for the family. Four years later, in 1992, sixteen-year-old Li and her mother followed him to the United States, arriving with limited English and very little money [2].
The transition was difficult. Li worked in restaurants and at her parents' dry-cleaning business after school to help the family stay afloat, all while learning English and keeping up with her coursework at Parsippany High School [2]. Despite these challenges, she excelled academically: her teachers recognized her aptitude in mathematics and science, and she graduated in 1995. Her immigrant experience would later become a central theme in her memoir and in her advocacy for making AI more inclusive.
Li earned a Bachelor of Arts in physics from Princeton University in 1999 [3]. At Princeton, she developed an interest in the computational aspects of perception and cognition, which would guide the rest of her career. She received a Paul and Daisy Soros Fellowship for New Americans, a scholarship supporting immigrants and children of immigrants pursuing graduate education in the United States [4].
In 2000, Li began doctoral studies at the California Institute of Technology (Caltech), working at the intersection of computer science, electrical engineering, and cognitive neuroscience. Her PhD research explored how the human visual system processes and categorizes scenes, bridging computational models with biological insights. She received her PhD in electrical engineering from Caltech in 2005 [1].
| Detail | Information |
|---|---|
| Full Name | Fei-Fei Li |
| Born | 1976, Beijing, China |
| Nationality | Chinese-American |
| Undergraduate | Princeton University (B.A. Physics, 1999) |
| Doctorate | California Institute of Technology (Ph.D. Electrical Engineering, 2005) |
| Institution | Stanford University |
| Known For | ImageNet, Stanford HAI, World Labs |
| Notable Role | Google Cloud Chief Scientist of AI/ML (2017-2018) |
| Google Scholar Citations | 340,000+ |
| Publications | 400+ scientific articles |
After completing her PhD, Li held positions at the University of Illinois at Urbana-Champaign and Princeton University before joining Stanford University in 2009 as an assistant professor [1]. She rose through the ranks to become the Sequoia Professor of Computer Science, one of the most prestigious endowed chairs in the department.
At Stanford, Li directs the Stanford Vision and Learning Lab (SVL), which has produced influential research in image recognition, object detection, visual reasoning, and embodied AI. She also served as the director of the Stanford Artificial Intelligence Lab (SAIL) from 2013 to 2018, one of the oldest and most respected AI research labs in the world [5].
The Stanford Vision and Learning Lab, under Li's direction, has been one of the most productive computer vision research groups in the world. The lab has contributed to a wide range of research areas:
| Research Area | Key Contributions | Notable Projects |
|---|---|---|
| Image classification | Foundational work on large-scale visual recognition | ImageNet, visual categorization models |
| Object detection | Real-time object localization and identification | Faster detection architectures |
| Image captioning | Generating natural language descriptions of images | DenseCap, Visual Genome |
| Visual question answering | Answering questions about image content | VQA models and benchmarks |
| Video understanding | Temporal reasoning in video sequences | Activity recognition, action detection |
| Embodied AI | Intelligent agents in simulated environments | BEHAVIOR benchmark |
| 3D scene understanding | Spatial reasoning and scene reconstruction | 3D scene graphs |
The lab's work on the Visual Genome dataset, published in 2017, created a structured knowledge base connecting objects, attributes, and relationships within images. This dataset has been widely used for training models that go beyond simple classification to reason about the relationships between objects in a scene [5].
In 2019, Li co-founded the Stanford Institute for Human-Centered Artificial Intelligence (HAI) alongside philosopher John Etchemendy. The institute promotes a vision of AI development that places human well-being at the center of technological progress. HAI conducts interdisciplinary research, publishes an annual AI Index report tracking global AI trends, and engages with policymakers on AI governance [5]. Li serves as a founding co-director and has used the platform to advocate for thoughtful regulation and broad participation in shaping AI's future.
The HAI AI Index report has become one of the most widely cited sources of data on AI trends, tracking metrics such as research publication volume, AI investment, technical performance benchmarks, and policy developments across countries.
Li's most transformative contribution to the field is ImageNet, a massive visual database she conceived in 2006 and formally launched in 2009. The project arose from her conviction that the field of computer vision was being held back not by insufficient algorithms but by a lack of high-quality, large-scale training data [6].
In the mid-2000s, the standard benchmark datasets in computer vision contained only a few thousand images spread across a handful of categories. Li believed this was fundamentally inadequate. She drew inspiration from cognitive science research showing that children learn to recognize objects by being exposed to millions of examples in varied contexts over the course of development. If human visual intelligence required massive exposure to visual data, she reasoned, machines would need the same [6].
At the time, this was a contrarian view. Most computer vision researchers focused on developing better algorithms and evaluated them on small datasets like Caltech-101 (which contained about 9,000 images across 101 categories). The idea that simply providing more data would drive progress was met with skepticism from many in the field.
ImageNet organized over 14 million images into more than 20,000 categories based on the WordNet lexical hierarchy. The scale of the labeling task was staggering, and Li's team developed an innovative approach using Amazon Mechanical Turk (MTurk) to crowdsource annotations.
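The hierarchical organization borrowed from WordNet can be illustrated with a minimal sketch. The tiny hypernym table below is hypothetical, standing in for the full WordNet database that ImageNet actually uses; only the idea of walking a category up to its root is the point.

```python
# Tiny illustrative fragment of a WordNet-style hypernym ("is-a") hierarchy.
# Real ImageNet synsets come from the full WordNet lexical database; these
# entries are a hand-picked toy example, not actual synset identifiers.
HYPERNYM = {
    "Siberian husky": "working dog",
    "working dog": "dog",
    "dog": "canine",
    "canine": "carnivore",
    "carnivore": "mammal",
    "mammal": "animal",
}

def hypernym_path(synset):
    """Walk up the hierarchy from a category to its most general ancestor."""
    path = [synset]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path

# A leaf category like "Siberian husky" inherits its place in the tree
# from increasingly general parents, ending at a root such as "animal".
print(hypernym_path("Siberian husky"))
```

In ImageNet, each of the 20,000+ categories is a WordNet synset positioned in exactly this kind of tree, so images labeled at a leaf are implicitly also examples of every ancestor category.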
| ImageNet Construction Detail | Specification |
|---|---|
| Total images | 14+ million |
| Categories (synsets) | 20,000+ |
| Annotation method | Amazon Mechanical Turk crowdsourcing |
| Workers involved | ~49,000 from 167 countries |
| Candidate images filtered | 160+ million |
| Labels per image | 3 (for quality assurance) |
| Ontology source | WordNet lexical database |
| Time to initial launch | ~2.5 years (2006-2009) |
The crowdsourcing process was carefully designed to ensure quality. Workers ("Turkers") were shown candidate images and asked to decide whether each image represented a given word from the WordNet ontology. Multiple workers labeled each image, and statistical methods were used to resolve disagreements and filter out low-quality annotations. Li's team implemented several quality control measures, including gold-standard images with known correct labels that were interspersed among the labeling tasks to identify unreliable workers [6].
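The quality-control scheme described above, with multiple labels per image, majority voting, and gold-standard trap images, can be sketched in a few lines. All names, data structures, and thresholds here are illustrative assumptions, not details of the actual ImageNet pipeline.

```python
from collections import Counter

# Hypothetical sketch of crowdsourced label aggregation: several workers
# judge each candidate image, gold-standard images with known answers flag
# unreliable workers, and a majority vote decides the final label.
GOLD = {"img_gold_1": "yes", "img_gold_2": "no"}  # known correct answers

def reliable_workers(votes, min_accuracy=0.75):
    """Keep workers whose answers on gold-standard images are accurate enough."""
    stats = {}  # worker -> (correct, total) on gold images
    for (worker, image), answer in votes.items():
        if image in GOLD:
            correct, total = stats.get(worker, (0, 0))
            stats[worker] = (correct + (answer == GOLD[image]), total + 1)
    return {w for w, (c, t) in stats.items() if t and c / t >= min_accuracy}

def aggregate(votes, min_votes=3):
    """Majority-vote a label for each non-gold image, using trusted workers only."""
    trusted = reliable_workers(votes)
    per_image = {}
    for (worker, image), answer in votes.items():
        if image not in GOLD and worker in trusted:
            per_image.setdefault(image, []).append(answer)
    labels = {}
    for image, answers in per_image.items():
        if len(answers) >= min_votes:  # require enough independent judgments
            winner, count = Counter(answers).most_common(1)[0]
            if count > len(answers) / 2:  # strict majority among trusted votes
                labels[image] = winner
    return labels
```

Here `votes` maps a `(worker, image)` pair to that worker's yes/no judgment of whether the image depicts the target synset; a worker who fails the interspersed gold-standard checks is simply excluded from the vote.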
The effort was enormous. Between July 2008 (when the project had zero labeled images) and December 2008, the team categorized three million images across more than 6,000 synsets. By the time of the formal launch in 2009, ImageNet contained 3.2 million labeled images. The dataset continued to grow in subsequent years, eventually reaching over 14 million images [6].
The budget for the project was modest by academic standards, with much of the annotation cost kept low through the Mechanical Turk platform. Li later described the decision to use crowdsourcing as born partly of necessity: traditional methods of labeling by graduate students would have taken decades.
Starting in 2010, Li organized the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), an annual competition that tasked researchers with building the best image classifiers. The challenge became a defining benchmark in AI. In 2012, a team led by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton entered AlexNet, a convolutional neural network that slashed the error rate nearly in half compared to previous methods [7]. This result is widely considered the moment that deep learning proved its superiority for visual tasks, triggering a wave of investment and research that transformed the entire AI field.
The ripple effects of ImageNet extended far beyond image classification. The dataset and its associated challenge demonstrated the power of combining large datasets with deep neural networks, a paradigm that now underlies advances in natural language processing, speech recognition, robotics, and many other domains. Li's insistence on scale and data quality proved prescient, and ImageNet remains one of the most cited datasets in AI research.
| Year | ImageNet Milestone |
|---|---|
| 2006 | Fei-Fei Li conceives the ImageNet project |
| 2009 | ImageNet formally launched with 3.2 million labeled images |
| 2010 | First ILSVRC competition held |
| 2012 | AlexNet wins ILSVRC, sparking the deep learning revolution |
| 2014 | GoogLeNet and VGGNet push top-5 error rates to roughly 7% |
| 2015 | ResNet achieves superhuman performance (3.57% top-5 error rate) |
| 2017 | Final ILSVRC competition held; task considered largely solved |
During a sabbatical from Stanford, Li served as Vice President at Google and Chief Scientist of AI/ML at Google Cloud from January 2017 to September 2018 [8]. In this role, she led efforts to democratize AI tools for enterprise customers, making machine learning capabilities more accessible to businesses without deep technical expertise. She oversaw the development of Google Cloud's AutoML products, which allow users to train custom machine learning models with minimal coding.
Her tenure at Google also highlighted the tensions that arise when academic ideals meet corporate imperatives. Li was involved in internal discussions about Google's AI ethics and public responsibility, experiences that reinforced her commitment to human-centered approaches to AI development [1].
Beyond ImageNet, Li's research portfolio spans several major areas of AI.
Li is one of the most cited computer scientists of her generation, with over 340,000 citations on Google Scholar and more than 400 published scientific articles. Her most influential publications include:
| Publication | Year | Key Contribution | Citations |
|---|---|---|---|
| "ImageNet: A Large-Scale Hierarchical Image Database" (with J. Deng et al.) | 2009 | Introduced the ImageNet dataset and benchmark | 40,000+ |
| "One-Shot Learning of Object Categories" (with R. Fergus, P. Perona) | 2006 | Pioneered learning from very few examples | 4,000+ |
| "ImageNet Large Scale Visual Recognition Challenge" (with O. Russakovsky et al.) | 2015 | Defined the ILSVRC benchmark and surveyed progress | 20,000+ |
| "Visualizing and Understanding Recurrent Networks" (with A. Karpathy) | 2015 | Analyzed how recurrent networks process sequential data | 3,000+ |
| "DenseCap: Fully Convolutional Localization Networks for Dense Captioning" (with J. Johnson et al.) | 2016 | Dense image captioning at region level | 1,500+ |
| "Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations" (with R. Krishna et al.) | 2017 | Structured visual knowledge base | 5,000+ |
Li's lab has produced foundational work on image classification, object detection, image captioning, and visual question answering. Her group contributed to the development of models that can describe the content of an image in natural language, bridging computer vision and natural language processing [5].
More recently, Li's research has expanded into embodied AI, the idea that intelligent systems must interact with the physical world to develop genuine understanding. Her Stanford lab developed the BEHAVIOR benchmark, which evaluates AI agents' ability to plan and execute 1,000 everyday household activities in simulated environments [9]. The People, AI & Robots (PAIR) group within her lab focuses on generalizable robot perception and control.
Li has also pursued applications of computer vision in healthcare through the Partnership in AI-Assisted Care (PAC), a collaboration between Stanford's School of Medicine and Computer Science department. This work uses computer vision and machine learning to monitor patients in clinical settings, detect falls, and support clinicians with real-time information. The system uses ambient sensors rather than wearable devices, preserving patient dignity while providing continuous monitoring [5].
Beginning in 2024, Li articulated a new research direction she calls "spatial intelligence," which concerns AI systems that can perceive, generate, reason within, and interact with three-dimensional environments. This concept underpins her startup World Labs and represents what she sees as the next major frontier after the breakthroughs in language and 2D vision [10].
Li has described spatial intelligence as the bridge between the 2D understanding that current AI systems excel at and the 3D physical world that humans navigate effortlessly. In her framing, language models understand words, vision models understand images, but spatial intelligence enables understanding of environments, volumes, physics, and embodied interaction.
In 2024, Li co-founded World Labs alongside Justin Johnson, Christoph Lassner, and Ben Mildenhall [11]. The company's mission is to build AI systems with spatial intelligence: the capacity to understand, generate, and interact with 3D worlds.
World Labs develops what it calls Large World Models (LWMs), a new class of AI systems that go beyond the text-based large language models (LLMs) that have dominated recent AI progress. While LLMs process and generate language, LWMs are designed to perceive and reason about three-dimensional space, enabling applications in architecture, urban planning, gaming, film production, and scientific simulation [11].
In November 2025, World Labs launched its first commercial product, Marble, which generates 3D virtual worlds from image or text prompts. Marble allows users to create explorable three-dimensional scenes from a single photograph or a text description, representing a practical application of the spatial intelligence research Li has championed [12].
The company attracted significant investor interest. In February 2026, World Labs raised $1 billion in a new funding round, signaling a major shift in venture capital attention from text-based AI to spatial and 3D AI systems. Reports from January 2026 placed the company's valuation at approximately $5 billion [12].
| Detail | World Labs |
|---|---|
| Founded | 2024 |
| Co-founders | Fei-Fei Li, Justin Johnson, Christoph Lassner, Ben Mildenhall |
| Focus | Spatial intelligence, Large World Models |
| First product | Marble (launched November 2025) |
| Funding (Feb 2026) | $1 billion raised |
| Estimated Valuation | ~$5 billion (Jan 2026) |
| Key investors | Andreessen Horowitz, Radical Ventures, and others |
In November 2023, Li published her memoir, The Worlds I See: Curiosity, Exploration, and Discovery at the Dawn of AI, with Flatiron Books, a Macmillan imprint [13]. The book traces her journey from a teenage immigrant struggling to learn English in New Jersey to one of the most influential figures in artificial intelligence. It recounts the creation of ImageNet, her time at Google, and her growing conviction that AI must be developed with humanity's broader interests in mind.
The memoir received praise for its candid portrayal of the personal sacrifices behind scientific achievement and for making complex technical concepts accessible to a general audience. NPR, Kirkus Reviews, and other outlets highlighted the book's thoughtful exploration of AI ethics and the immigrant experience in American science [14].
Throughout her career, Li has been a vocal proponent of diversity and inclusion in technology. In 2017, she co-founded AI4ALL, a nonprofit organization dedicated to increasing diversity in the AI field, particularly among women, underrepresented minorities, and students from low-income backgrounds [15]. AI4ALL runs summer programs, supports AI education in underserved communities, and works to ensure that the people building AI systems reflect the diversity of the people who will be affected by them.
Li has spoken publicly about how her own experience as an immigrant woman in a male-dominated field shaped her understanding of the biases that can creep into technology when its creators lack diverse perspectives. She has testified before the U.S. Congress on the importance of inclusive AI development and has engaged with policymakers around the world [1].
Li has received extensive recognition for her contributions to computer science and AI. The following table lists selected major awards:
| Year | Award or Honor | Awarding Organization |
|---|---|---|
| 2006 | Microsoft Research New Faculty Fellowship | Microsoft |
| 2009 | NSF CAREER Award | National Science Foundation |
| 2011 | Sloan Research Fellowship | Alfred P. Sloan Foundation |
| 2016 | IAPR J.K. Aggarwal Prize | International Association for Pattern Recognition |
| 2016 | IEEE PAMI Mark Everingham Award | IEEE |
| 2018 | ACM Fellow | Association for Computing Machinery |
| 2019 | IEEE PAMI Longuet-Higgins Prize | IEEE |
| 2019 | National Geographic Society Further Award | National Geographic |
| 2020 | Elected to National Academy of Engineering | NAE |
| 2020 | Elected to National Academy of Medicine | NAM |
| 2021 | Elected to American Academy of Arts and Sciences | AAAS |
| 2022 | IEEE PAMI Thomas Huang Memorial Prize | IEEE |
| 2023 | Intel Lifetime Achievement Award | Intel |
| 2024 | VinFuture Grand Prize | VinFuture Foundation |
| 2025 | Queen Elizabeth Prize for Engineering | QEPrize Foundation |
| 2025 | Time "Architects of AI" (Person of the Year) | Time Magazine |
| 2025 | Yale Honorary Degree (Doctor of Engineering & Technology) | Yale University |
The Queen Elizabeth Prize for Engineering, awarded in 2025, recognized Li alongside six other innovators for "groundbreaking contributions to modern machine learning." The seven laureates share the £500,000 prize. Li was presented the award by King Charles III at a ceremony where her contributions through ImageNet were specifically cited as instrumental to the deep learning revolution [16].
As of early 2026, Li continues to hold her position as Sequoia Professor at Stanford and co-director of HAI, though she has been on partial academic leave since January 2024 to focus on World Labs [1]. Her current work is centered on advancing spatial intelligence, both through World Labs' commercial products and through her academic research at Stanford.
At Stanford, her lab continues to work on embodied AI, healthcare applications, and the BEHAVIOR benchmark for evaluating AI agents in simulated environments. Through HAI, she remains engaged in AI policy discussions, including questions about AI governance, safety, and the societal impact of increasingly capable AI systems.
World Labs, following its $1 billion funding round in February 2026, is scaling its team and developing commercial applications of Large World Models. Li has described spatial intelligence as the necessary next step for AI to move from understanding text and flat images to comprehending and interacting with the physical world as humans do [10].
Li is married to Silvio Savarese, a computer scientist who also works on computer vision and AI. They have children together. Li has spoken about the challenges of balancing a demanding academic and entrepreneurial career with family life, a theme she explores in her memoir [13].