Fei-Fei Li
Last reviewed
May 18, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v3 · 4,397 words
Fei-Fei Li (born 1976) is a Chinese-American computer scientist, the inaugural Sequoia Capital Professor of Computer Science at Stanford University, and co-director of the Stanford Institute for Human-Centered Artificial Intelligence (HAI). She is best known for creating ImageNet, the large-scale visual database that catalyzed the deep learning revolution in computer vision during the 2010s. Often referred to as the "Godmother of AI," Li has shaped the trajectory of artificial intelligence through her research, institution-building, public advocacy, and entrepreneurial ventures. In 2024 she co-founded World Labs, a startup focused on spatial intelligence that raised an additional $1 billion in February 2026 at a reported valuation near $5 billion[^1]. In December 2025 Time magazine named her among the "Architects of AI," its 2025 Person of the Year[^2].
Fei-Fei Li was born in Beijing, China, in 1976 and spent much of her childhood in Chengdu, the capital of Sichuan province[^3]. When she was twelve, her father emigrated to Parsippany, New Jersey, to seek better opportunities for the family. Four years later, in 1992, sixteen-year-old Li and her mother followed him to the United States, arriving with limited English and very little money[^4].
The transition was difficult. While learning English and keeping up with her coursework at Parsippany High School, Li worked in restaurants and at her parents' dry-cleaning business after school to help the family stay afloat[^4]. She nonetheless excelled academically: her teachers recognized her aptitude in mathematics and science, and she graduated in 1995. Her immigrant experience later became a central theme in her memoir and in her advocacy for making AI more inclusive.
Li earned a Bachelor of Arts in physics from Princeton University in 1999[^5]. At Princeton, she developed an interest in the computational aspects of perception and cognition, which would guide the rest of her career. She received a Paul and Daisy Soros Fellowship for New Americans, a scholarship supporting immigrants and children of immigrants pursuing graduate education in the United States[^6], and her graduate studies at Caltech were additionally supported by a National Science Foundation Graduate Research Fellowship[^3].
In 2000, Li began doctoral studies at the California Institute of Technology (Caltech), working at the intersection of computer science, electrical engineering, and cognitive neuroscience. Her PhD research, supervised by Pietro Perona and Christof Koch, explored how the human visual system processes and categorizes scenes, bridging computational models with biological insights. She received her PhD in electrical engineering from Caltech in 2005[^3].
| Detail | Information |
|---|---|
| Full Name | Fei-Fei Li |
| Born | 1976, Beijing, China |
| Nationality | Chinese-American |
| Undergraduate | Princeton University (B.A. Physics, 1999) |
| Doctorate | California Institute of Technology (Ph.D. Electrical Engineering, 2005) |
| Institution | Stanford University |
| Known For | ImageNet, Stanford HAI, World Labs, spatial intelligence |
| Notable Role | Google Cloud Chief Scientist of AI/ML (2017-2018) |
| Google Scholar Citations | ~340,000+ |
| Publications | 400+ scientific articles |
After completing her PhD, Li held assistant professorships at the University of Illinois at Urbana-Champaign (2005-2006) and Princeton University (2007-2009) before joining Stanford University in 2009 as an assistant professor[^3]. She was promoted to associate professor with tenure in 2012 and to full professor in 2018[^3], and she now holds the inaugural Sequoia Capital Professorship of Computer Science.
At Stanford, Li directs the Stanford Vision and Learning Lab (SVL), which has produced influential research in image recognition, object detection, visual reasoning, and embodied AI. From 2013 to 2018 she also served as director of the Stanford Artificial Intelligence Lab (SAIL), one of the oldest and most respected AI research labs in the world[^7].
Under Li's direction, the Stanford Vision and Learning Lab has become one of the most productive computer vision research groups, contributing to a wide range of research areas:
| Research Area | Key Contributions | Notable Projects |
|---|---|---|
| Image classification | Foundational work on large-scale visual recognition | ImageNet, visual categorization models |
| Object detection | Real-time object localization and identification | Faster detection architectures |
| Image captioning | Generating natural language descriptions of images | DenseCap, Visual Genome |
| Visual question answering | Answering questions about image content | VQA models and benchmarks |
| Video understanding | Temporal reasoning in video sequences | Activity recognition, action detection |
| Embodied AI | Intelligent agents in simulated environments | BEHAVIOR benchmark |
| 3D scene understanding | Spatial reasoning and scene reconstruction | 3D scene graphs |
The lab's work on the Visual Genome dataset, published in 2017, created a structured knowledge base connecting objects, attributes, and relationships within images. This dataset has been widely used for training models that go beyond simple classification to reason about the relationships between objects in a scene[^7].
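The idea of a structured knowledge base connecting objects, attributes, and relationships can be illustrated with a small scene-graph data structure. This is a minimal sketch only: the class names (`SceneObject`, `Relationship`, `SceneGraph`) and the example annotations are hypothetical and do not reproduce the actual Visual Genome schema or file format.

```python
from dataclasses import dataclass, field


@dataclass
class SceneObject:
    """One annotated object; `attributes` holds descriptors such as 'brown'."""
    name: str
    attributes: list[str] = field(default_factory=list)


@dataclass
class Relationship:
    """A directed (subject, predicate, object) edge between two objects."""
    subject: int    # index into SceneGraph.objects
    predicate: str
    obj: int        # index into SceneGraph.objects


@dataclass
class SceneGraph:
    """Hypothetical container for the annotations of a single image."""
    objects: list[SceneObject] = field(default_factory=list)
    relationships: list[Relationship] = field(default_factory=list)

    def add_object(self, name, attributes=()):
        """Store an object and return its index for use in relationships."""
        self.objects.append(SceneObject(name, list(attributes)))
        return len(self.objects) - 1

    def relate(self, subject, predicate, obj):
        """Record that `subject` stands in relation `predicate` to `obj`."""
        self.relationships.append(Relationship(subject, predicate, obj))

    def triples(self):
        """Return relationships as readable (subject, predicate, object) names."""
        return [
            (self.objects[r.subject].name, r.predicate, self.objects[r.obj].name)
            for r in self.relationships
        ]

    def query(self, predicate):
        """Return only the triples whose predicate matches."""
        return [t for t in self.triples() if t[1] == predicate]


# Invented example annotations for one image.
graph = SceneGraph()
man = graph.add_object("man", ["smiling"])
horse = graph.add_object("horse", ["brown"])
meadow = graph.add_object("field", ["grassy"])
graph.relate(man, "riding", horse)
graph.relate(horse, "standing in", meadow)

print(graph.triples())
print(graph.query("riding"))
```

A model trained on annotations of this shape can be asked relational questions ("what is the man riding?") that a flat list of class labels cannot answer.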
In 2019, Li co-founded the Stanford Institute for Human-Centered Artificial Intelligence (HAI) alongside philosopher John Etchemendy. The institute promotes a vision of AI development that places human well-being at the center of technological progress. HAI conducts interdisciplinary research, publishes an annual AI Index report tracking global AI trends, and engages with policymakers on AI governance[^7]. Li serves as a founding co-director and has used the platform to advocate for thoughtful regulation and broad participation in shaping AI's future.
The HAI AI Index report has become one of the most widely cited sources of data on AI trends, tracking metrics such as research publication volume, AI investment, technical performance benchmarks, and policy developments across countries.
Li's most transformative contribution to the field is ImageNet, a massive visual database she conceived in 2006 and formally launched in 2009. The project arose from her conviction that the field of computer vision was being held back not by insufficient algorithms but by a lack of high-quality, large-scale training data[^8].
In the mid-2000s, the standard benchmark datasets in computer vision contained only a few thousand images spread across a handful of categories. Li believed this was fundamentally inadequate. She drew inspiration from cognitive science research showing that children learn to recognize objects by being exposed to millions of examples in varied contexts over the course of development. If human visual intelligence required massive exposure to visual data, she reasoned, machines would need the same[^8].
At the time, this was a contrarian view. Most computer vision researchers focused on developing better algorithms and evaluated them on small datasets like Caltech-101 (which contained about 9,000 images across 101 categories). The idea that simply providing more data would drive progress was met with skepticism from many in the field.
ImageNet organized over 14 million images into more than 20,000 categories based on the WordNet lexical hierarchy. The scale of the labeling task was staggering, and Li's team developed an innovative approach using Amazon Mechanical Turk (MTurk) to crowdsource annotations.
| ImageNet Construction Detail | Specification |
|---|---|
| Total images | 14+ million |
| Categories (synsets) | 20,000+ |
| Annotation method | Amazon Mechanical Turk crowdsourcing |
| Workers involved | ~49,000 from 167 countries |
| Candidate images filtered | 160+ million |
| Labels per image | 3 (for quality assurance) |
| Ontology source | WordNet lexical database |
| Time to initial launch | ~2.5 years (2006-2009) |
The crowdsourcing process was carefully designed to ensure quality. Workers ("Turkers") were shown candidate images and asked to decide whether each image represented a given word from the WordNet ontology. Multiple workers labeled each image, and statistical methods were used to resolve disagreements and filter out low-quality annotations. Li's team implemented several quality control measures, including gold-standard images with known correct labels that were interspersed among the labeling tasks to identify unreliable workers[^8].
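The combination of gold-standard checks and multi-worker voting described above can be sketched as follows. This is an illustrative simplification, not the ImageNet team's actual pipeline: the function names, the 0.7 accuracy cutoff, the three-vote minimum, and the toy data are all assumptions made for the example, and plain majority voting stands in for whatever statistical method the team used.

```python
from collections import Counter, defaultdict


def worker_accuracy(votes, gold):
    """Fraction of gold-standard images each worker answered correctly.

    votes: list of (worker_id, image_id, answer), answer is True or False
    gold:  dict mapping image_id -> known correct answer
    """
    correct = defaultdict(int)
    seen = defaultdict(int)
    for worker, image, answer in votes:
        if image in gold:
            seen[worker] += 1
            correct[worker] += int(answer == gold[image])
    return {worker: correct[worker] / seen[worker] for worker in seen}


def aggregate(votes, gold, min_accuracy=0.7, min_votes=3):
    """Majority-vote each non-gold image, ignoring workers who fail gold checks."""
    accuracy = worker_accuracy(votes, gold)
    # Assumed policy: workers who never saw a gold image are not excluded.
    untrusted = {w for w, acc in accuracy.items() if acc < min_accuracy}

    tallies = defaultdict(Counter)
    for worker, image, answer in votes:
        if image not in gold and worker not in untrusted:
            tallies[image][answer] += 1

    labels = {}
    for image, tally in tallies.items():
        if sum(tally.values()) < min_votes:
            continue  # too few trusted votes; leave the image unlabeled
        yes, no = tally[True], tally[False]
        if yes != no:  # ties stay unlabeled
            labels[image] = yes > no
    return labels


# Toy data: g1 and g2 are gold images; worker w4 gets both wrong.
gold = {"g1": True, "g2": False}
votes = [
    ("w1", "g1", True), ("w1", "g2", False),
    ("w2", "g1", True), ("w2", "g2", False),
    ("w3", "g1", True), ("w3", "g2", False),
    ("w4", "g1", False), ("w4", "g2", True),
    ("w1", "img_a", True), ("w2", "img_a", True),
    ("w3", "img_a", False), ("w4", "img_a", False),
    ("w1", "img_b", False), ("w2", "img_b", False),
    ("w3", "img_b", False), ("w4", "img_b", True),
]

labels = aggregate(votes, gold)
print(labels)  # {'img_a': True, 'img_b': False}
```

In this toy run, `img_a` would be a 2-2 tie if every vote counted; discarding the worker who failed the gold checks resolves it to a 2-1 majority.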
The effort was enormous. Between July 2008 (when the project had zero labeled images) and December 2008, the team categorized three million images across more than 6,000 synsets. By the time of the formal launch in 2009, ImageNet contained 3.2 million labeled images. The dataset continued to grow in subsequent years, eventually reaching over 14 million images[^8].
The budget for the project was modest by academic standards, with much of the annotation cost kept low through the Mechanical Turk platform. Li later described the decision to use crowdsourcing as born partly of necessity: traditional methods of labeling by graduate students would have taken decades.
Starting in 2010, Li organized the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), an annual competition that tasked researchers with building the best image classifiers. The challenge became a defining benchmark in AI. In 2012, a team of Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton entered AlexNet, a convolutional neural network that cut the top-5 error rate to 15.3%, against 26.2% for the second-place entry, a reduction of roughly 40%[^9]. This result, sometimes called "the ImageNet moment," is widely regarded as the point at which deep learning proved its superiority for visual tasks, triggering a wave of investment and research that transformed the entire AI field.
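ILSVRC classification results are usually quoted as top-5 error: a prediction counts as correct if the true label appears among the model's five highest-scoring classes. The sketch below shows how such a metric can be computed. The function name `top_k_error` and the three-class toy scores are invented for illustration (with k set to 1 and 2 because the toy problem has only three classes); this is not the official evaluation code.

```python
def top_k_error(scores, true_labels, k=5):
    """Fraction of examples whose true label is not among the k highest scores.

    scores:      list of dicts mapping class name -> model score
    true_labels: list holding the correct class name for each example
    """
    if not true_labels or len(scores) != len(true_labels):
        raise ValueError("scores and true_labels must be non-empty and equal in length")
    misses = 0
    for example_scores, truth in zip(scores, true_labels):
        # Class names ordered from highest to lowest score, truncated to k.
        top_k = sorted(example_scores, key=example_scores.get, reverse=True)[:k]
        if truth not in top_k:
            misses += 1
    return misses / len(true_labels)


# Invented scores for four images over three classes.
scores = [
    {"cat": 0.6, "dog": 0.3, "fox": 0.1},
    {"cat": 0.5, "dog": 0.1, "fox": 0.4},
    {"cat": 0.2, "dog": 0.7, "fox": 0.1},
    {"cat": 0.1, "dog": 0.2, "fox": 0.7},
]
truth = ["dog", "dog", "cat", "fox"]

print(top_k_error(scores, truth, k=1))  # 0.75
print(top_k_error(scores, truth, k=2))  # 0.25
```

Allowing several guesses makes the metric more forgiving for images that contain more than one plausible object, which is one reason a top-k measure suits a dataset with many fine-grained categories.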
The ripple effects of ImageNet extended far beyond image classification. The dataset and its associated challenge demonstrated the power of combining large datasets with deep neural networks, a paradigm that now underlies advances in natural language processing, speech recognition, robotics, and many other domains. Li's insistence on scale and data quality proved prescient, and ImageNet remains one of the most cited datasets in AI research.
| Year | ImageNet Milestone |
|---|---|
| 2006 | Fei-Fei Li conceives the ImageNet project |
| 2009 | ImageNet formally launched with 3.2 million labeled images |
| 2010 | First ILSVRC competition held |
| 2012 | AlexNet wins ILSVRC, sparking the deep learning revolution |
| 2014 | GoogLeNet (6.67%) and VGGNet (7.32%) push top-5 error rates to around 7% |
| 2015 | ResNet reaches a 3.57% top-5 error rate, below the roughly 5% estimated for human annotators |
| 2017 | Final ILSVRC competition held; task considered largely solved |
During a sabbatical from Stanford, Li served as Vice President at Google and Chief Scientist of AI/ML at Google Cloud from January 2017 to September 2018[^10]. In this role, she led efforts to democratize AI tools for enterprise customers, making machine learning capabilities more accessible to businesses without deep technical expertise. She oversaw the development of Google Cloud's AutoML products, which allow users to train custom machine learning models with minimal coding.
Her tenure at Google also highlighted the tensions that arise when academic ideals meet corporate imperatives. Li was involved in internal discussions about Google's AI ethics and public responsibility, particularly around Project Maven, the Pentagon's AI-assisted drone-imagery program that triggered employee protests. These experiences reinforced her commitment to human-centered approaches to AI development[^3].
Beyond ImageNet, Li's research portfolio spans several major areas of AI.
Li is one of the most cited computer scientists of her generation, with over 340,000 citations on Google Scholar and more than 400 published scientific articles. Her most influential publications include:
| Publication | Year | Key Contribution | Citations |
|---|---|---|---|
| "ImageNet: A Large-Scale Hierarchical Image Database" (with J. Deng et al.) | 2009 | Introduced the ImageNet dataset and benchmark | 40,000+ |
| "One-Shot Learning of Object Categories" (with R. Fergus, P. Perona) | 2006 | Pioneered learning from very few examples | 4,000+ |
| "ImageNet Large Scale Visual Recognition Challenge" (with O. Russakovsky et al.) | 2015 | Defined the ILSVRC benchmark and surveyed progress | 20,000+ |
| "Visualizing and Understanding Recurrent Networks" (with A. Karpathy) | 2015 | Analyzed how recurrent networks process sequential data | 3,000+ |
| "DenseCap: Fully Convolutional Localization Networks for Dense Captioning" (with J. Johnson et al.) | 2016 | Dense image captioning at region level | 1,500+ |
| "Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations" (with R. Krishna et al.) | 2017 | Structured visual knowledge base | 5,000+ |
Li's lab has produced foundational work on image classification, object detection, image captioning, and visual question answering. Her group contributed to the development of models that can describe the content of an image in natural language, bridging computer vision and natural language processing[^7].
More recently, Li's research has expanded into embodied AI, the idea that intelligent systems must interact with the physical world to develop genuine understanding. Her Stanford lab developed the BEHAVIOR benchmark, which evaluates AI agents' ability to plan and execute household tasks in simulated environments, covering up to 1,000 different activities[^11]. The People, AI & Robots (PAIR) group within her lab focuses on generalizable robot perception and control.
Li has also pursued applications of computer vision in healthcare through the Partnership in AI-Assisted Care (PAC), a collaboration between Stanford's School of Medicine and Computer Science department. This work uses computer vision and machine learning to monitor patients in clinical settings, detect falls, and support clinicians with real-time information. The system uses ambient sensors rather than wearable devices, preserving patient dignity while providing continuous monitoring[^7].
Beginning in 2024, Li articulated a new research direction she calls "spatial intelligence," which concerns AI systems that can perceive, generate, reason within, and interact with three-dimensional environments. This concept underpins her startup World Labs and represents what she sees as the next major frontier after the breakthroughs in language and 2D vision[^12].
Li has described spatial intelligence as the bridge between the 2D understanding that current AI systems excel at and the 3D physical world that humans navigate effortlessly. In her framing, language models understand words, vision models understand images, but spatial intelligence enables understanding of environments, volumes, physics, and embodied interaction. She elaborated on this thesis in a November 2025 manifesto, "From Words to Worlds," published on her Substack[^12], and presented the concept directly to the United Nations Security Council in December 2024[^13].
In 2024, Li co-founded World Labs alongside Justin Johnson (her former PhD student, now at the University of Michigan), Christoph Lassner (formerly Meta Reality Labs), and Ben Mildenhall (co-creator of Neural Radiance Fields, NeRF)[^14]. The company emerged from stealth on September 13, 2024, with $230 million in funding from Andreessen Horowitz, NEA, Radical Ventures, and NVIDIA's venture arm NVentures, valuing the company at roughly $1 billion[^14].
World Labs develops what it calls Large World Models (LWMs), a new class of AI systems that go beyond the text-based large language models (LLMs) that have dominated recent AI progress. While LLMs process and generate language, LWMs are designed to perceive and reason about three-dimensional space, enabling applications in architecture, urban planning, gaming, film production, robotics simulation, and scientific simulation[^14].
On November 12, 2025, World Labs launched its first commercial product, Marble, a multimodal world model that generates persistent, downloadable 3D environments from text prompts, photos, videos, panoramas, or coarse 3D layouts[^15]. Unlike streaming world models that synthesize scenes on the fly, Marble produces stable scenes that can be exported as Gaussian splats, polygon meshes, or videos and imported into engines such as Unity or Unreal[^15].
| Marble Feature | Description |
|---|---|
| Inputs | Text, single/multi-image, video, panorama, 3D layout |
| Outputs | Gaussian splats, meshes, video, Vision Pro and Quest 3 VR |
| Chisel editor | Hybrid 3D tool: users block out spatial structure, AI fills visual style |
| World expansion | Extend a generated world with additional detail |
| Composer mode | Combine multiple worlds into larger scenes |
| Pricing tiers | Free (4 generations), Standard $20/mo, Pro $35/mo, Max $95/mo |
Marble launched alongside Spark (an API for developers) and is positioned for use in game environments, visual effects, VR experiences, and robotics training simulation, competing with peers such as Google DeepMind's Genie 3, Odyssey, and Decart[^15].
On February 18, 2026, World Labs announced an additional $1 billion in funding led by new strategic investors, with Bloomberg reporting an implied valuation near $5 billion, a fivefold jump from its September 2024 mark[^1][^16]. The round included Autodesk ($200 million as both investor and strategic adviser), AMD, Emerson Collective, Fidelity Management & Research Company, NVIDIA, and Sea, with continued participation from existing backers including Andreessen Horowitz, Greylock, and Radical Ventures[^17][^16]. Cisco also invested in 2025 ahead of the larger round[^17]. World Labs stated the funds would accelerate development of world models for "storytelling, creativity, robotics, scientific discovery, and beyond"[^17].
| Detail | World Labs |
|---|---|
| Founded | 2024 (out of stealth September 13, 2024) |
| Co-founders | Fei-Fei Li, Justin Johnson, Christoph Lassner, Ben Mildenhall |
| Focus | Spatial intelligence, Large World Models |
| First product | Marble (launched November 12, 2025) |
| Developer products | Marble API, Spark |
| Stealth funding (Sep 2024) | $230 million (~$1B valuation) |
| Series round (Feb 2026) | $1 billion (~$5B reported valuation) |
| Cumulative funding | ~$1.23 billion |
| Key investors | a16z, NEA, Radical Ventures, NVIDIA, Autodesk, AMD, Cisco, Fidelity, Emerson Collective, Sea, Greylock |
In November 2023, Li published her memoir, The Worlds I See: Curiosity, Exploration and Discovery at the Dawn of AI, with Flatiron Books (Macmillan)[^18]. The book traces her journey from a teenage immigrant struggling to learn English in New Jersey to one of the most influential figures in artificial intelligence. It recounts the creation of ImageNet, her time at Google, and her growing conviction that AI must be developed with humanity's broader interests in mind.
The memoir received praise for its candid portrayal of the personal sacrifices behind scientific achievement and for making complex technical concepts accessible to a general audience. NPR, Kirkus Reviews, and other outlets highlighted the book's thoughtful exploration of AI ethics and the immigrant experience in American science[^19].
In 2017, Li co-founded AI4ALL, a nonprofit dedicated to increasing diversity in the AI field, particularly among women, underrepresented minorities, and students from low-income backgrounds, with Olga Russakovsky and Rick Sommer[^20]. AI4ALL grew out of an earlier Stanford program (SAILORS). It runs summer programs, supports AI education in underserved communities, and works to ensure that the people building AI systems reflect the diversity of the people who will be affected by them.
Li has spoken publicly about how her own experience as an immigrant woman in a male-dominated field shaped her understanding of the biases that can creep into technology when its creators lack diverse perspectives. She has testified before the U.S. Congress on the importance of inclusive AI development, including a 2018 appearance before the House Subcommittees on Research and Technology and Energy at a hearing titled "Artificial Intelligence: With Great Power Comes Great Responsibility"[^21].
In September 2024, following his veto of California's SB 1047 AI safety bill, Governor Gavin Newsom asked Li to co-lead a state policy review with Jennifer Tour Chayes (dean of UC Berkeley's College of Computing, Data Science, and Society) and Mariano-Florentino Cuéllar (president of the Carnegie Endowment for International Peace). The group released a 41-page interim report on March 19, 2025[^22] and its final report, "The California Report on Frontier AI Policy," on June 17, 2025[^23]. The report recommended public-facing transparency requirements, safe-harbor protections for third-party AI evaluators, expanded whistleblower protections for AI lab employees, and a "trust but verify" framework. It cautioned against using developer-level thresholds (such as employee headcount) and argued that training-compute thresholds remain "the most attractive option" for regulators[^23].
On December 19, 2024, at a high-level UN Security Council meeting chaired by U.S. Secretary of State Antony Blinken on "Maintenance of International Peace and Security and Artificial Intelligence," Li briefed members via videoconference alongside UN Secretary-General António Guterres and Meta Chief AI Scientist Yann LeCun[^13]. Li used the platform to introduce spatial intelligence to international policymakers and to call for public-sector leadership, global collaboration, and evidence-based policymaking. She is also a member of the UN Secretary-General's Scientific Advisory Board[^13].
Li has received extensive recognition for her contributions to computer science and AI. The following table lists selected major awards:
| Year | Award or Honor | Awarding Organization |
|---|---|---|
| 2006 | Microsoft Research New Faculty Fellowship | Microsoft |
| 2009 | NSF CAREER Award | National Science Foundation |
| 2011 | Alfred P. Sloan Research Fellowship | Sloan Foundation |
| 2016 | IAPR J.K. Aggarwal Prize | International Association for Pattern Recognition |
| 2016 | IEEE PAMI Mark Everingham Award | IEEE |
| 2018 | ACM Fellow | Association for Computing Machinery |
| 2019 | IEEE PAMI Longuet-Higgins Prize | IEEE |
| 2019 | National Geographic Society Further Award | National Geographic |
| 2020 | Elected to National Academy of Engineering | NAE |
| 2020 | Elected to National Academy of Medicine | NAM |
| 2021 | Elected to American Academy of Arts and Sciences | American Academy of Arts and Sciences |
| 2022 | IEEE PAMI Thomas Huang Memorial Prize | IEEE |
| 2022 | Schmidt Sciences AI2050 Senior Fellow | Schmidt Sciences |
| 2023 | Intel Lifetime Achievement Award | Intel |
| 2024 | VinFuture Grand Prize | VinFuture Foundation |
| 2025 | Queen Elizabeth Prize for Engineering | QEPrize Foundation |
| 2025 | Time "Architects of AI" (Person of the Year) | Time Magazine |
| 2025 | Yale Honorary Degree (Doctor of Engineering & Technology) | Yale University |
The Queen Elizabeth Prize for Engineering, announced in February 2025 and presented by King Charles III at St James's Palace in November 2025, recognized Li alongside six other pioneers (Yoshua Bengio, Bill Dally, Geoffrey Hinton, John Hopfield, Jensen Huang, and Yann LeCun) for "groundbreaking contributions to modern machine learning"[^24][^25]. The seven laureates share the GBP 500,000 prize. The QEPrize citation specifically credited Li with "establishing the importance of providing high quality datasets" through ImageNet[^25].
In December 2025, Time magazine named eight tech leaders as the "Architects of AI," its 2025 Person of the Year, recognizing Li alongside Sam Altman, Dario Amodei, Demis Hassabis, Jensen Huang, Elon Musk, Lisa Su, and Mark Zuckerberg for "the year when artificial intelligence's full potential roared into view"[^2]. Li was singled out as the cohort's leading academic and advocate for human-centered AI[^2].
As of mid-2026, Li continues to hold her position as Sequoia Capital Professor at Stanford and co-director of HAI, though she has been on partial academic leave since January 2024 to focus on World Labs[^3]. Her current work is centered on advancing spatial intelligence, both through World Labs' commercial products and through her academic research at Stanford.
At Stanford, her lab continues to work on embodied AI, healthcare applications, and the BEHAVIOR benchmark for evaluating AI agents in simulated environments. Through HAI, she remains engaged in AI policy discussions, including questions about AI governance, safety, and the societal impact of increasingly capable AI systems.
World Labs, following its $1 billion funding round in February 2026, is scaling its team, expanding Marble's capabilities, and developing its developer API (Spark). Li has described spatial intelligence as the necessary next step for AI to move from understanding text and flat images to comprehending and interacting with the physical world as humans do[^12].
Li is married to Silvio Savarese, a computer scientist who also works on computer vision and AI and serves as Chief Scientist at Salesforce AI Research. They have two children. Li has spoken about the challenges of balancing a demanding academic and entrepreneurial career with family life, a theme she explores in her memoir[^18].