Videos
Last reviewed
May 11, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v2 · 2,478 words
See also: Guides
This page collects influential video content about artificial intelligence, machine learning, and deep learning. The first section surveys the YouTube channels that have shaped how people learn AI online. The second section is a table of individual videos that have been widely cited or recommended within the field.
YouTube is where many people learn AI outside formal coursework. A self-taught practitioner in 2026 may well have watched Andrej Karpathy code a GPT from scratch before reading a textbook chapter on the subject. The channels below are the ones working researchers and instructors point newcomers toward.
Andrej Karpathy was a founding member of OpenAI and a former director of AI at Tesla. He started uploading lectures to YouTube in August 2022 under the title "Neural Networks: Zero to Hero," which has become one of the most cited self-study resources for building neural networks from the ground up. The playlist runs through micrograd (a tiny autograd engine), makemore (character-level language models), and culminates in nanoGPT, where he reimplements a small transformer.
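For readers who have not seen one, the idea behind a tiny autograd engine can be sketched in plain Python. The block below is an illustration written for this page, not code from micrograd or the lectures: the `Scalar` class and its methods are hypothetical names, and only addition and multiplication are supported. Each value records which values produced it, and `backward()` walks that graph in reverse, applying the chain rule.

```python
class Scalar:
    """A number that remembers how it was computed so gradients can flow back.

    Hypothetical illustration class, not the API of any real library.
    """

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Scalar(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Scalar(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Visit nodes in reverse topological order so each node's gradient
        # is complete before it is pushed on to its parents.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


x = Scalar(2.0)
w = Scalar(-3.0)
b = Scalar(1.0)
y = x * w + b
y.backward()
print(y.data, x.grad, w.grad, b.grad)  # -5.0 -3.0 2.0 1.0
```

Gradients are accumulated with `+=` rather than assigned, so a value that is used more than once in an expression receives the sum of its contributions.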
The single most famous entry is "Let's build GPT: from scratch, in code, spelled out," posted in January 2023. In under two hours, Karpathy implements a decoder-only transformer in PyTorch, following "Attention Is All You Need" and connecting it to the architecture behind GPT-3 and ChatGPT. ML educators often call it one of the clearest hands-on introductions to transformer mechanics. He expanded the series in 2024 with a tokenizer lecture and a long video reproducing the 124M-parameter GPT-2 model.
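What makes a transformer "decoder-only" in practice is causal self-attention: each position may attend only to itself and to earlier positions. A minimal single-head sketch in plain Python follows, assuming the query, key, and value vectors have already been computed. The function name `causal_attention` and the toy inputs are made up for illustration; the video works with PyTorch tensors and learned projections, which are omitted here.

```python
import math


def causal_attention(queries, keys, values):
    """Single-head scaled dot-product attention with a causal mask.

    Each argument is a list of T vectors (lists of floats). Position t
    may only attend to positions 0..t. Illustrative sketch only.
    """
    d_k = len(keys[0])
    dim = len(values[0])
    outputs = []
    for t, q in enumerate(queries):
        # Scores against the current and earlier positions only (the mask).
        scores = [
            sum(qi * ki for qi, ki in zip(q, keys[s])) / math.sqrt(d_k)
            for s in range(t + 1)
        ]
        # Numerically stable softmax over the visible positions.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted average of the visible value vectors.
        outputs.append(
            [sum(w * values[s][j] for s, w in enumerate(weights)) for j in range(dim)]
        )
    return outputs


x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = causal_attention(x, x, x)
print(out[0])  # the first position can only see itself: [1.0, 0.0]
```

Because of the mask, changing a later token never changes the output at an earlier position, which is what lets such a model be trained to predict the next token at every position at once.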
The accompanying GitHub repo (karpathy/nn-zero-to-hero) holds Jupyter notebooks for every lecture and is regularly used as the basis for university coursework.
Grant Sanderson runs 3Blue1Brown, a math education channel built on his own animation library, Manim. His four-part series on neural networks, which started with "But what is a Neural Network?" in October 2017, is the canonical visual explanation of how a feedforward network learns. It covers gradient descent, backpropagation, and the calculus underneath, with animations that have been borrowed by countless other educators.
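The core loop the series animates, gradient descent, fits in a few lines. The sketch below is an illustration for this page and is not taken from the videos: it fits a one-parameter model y = w * x to made-up data by repeatedly stepping the weight against the gradient of the mean squared error. The function name `fit_slope`, the learning rate, the step count, and the data are all arbitrary assumptions.

```python
def fit_slope(xs, ys, learning_rate=0.05, steps=200):
    """Fit y = w * x by gradient descent on mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Loss L(w) = (1/n) * sum((w*x - y)^2),
        # so dL/dw = (2/n) * sum((w*x - y) * x).
        grad = (2.0 / n) * sum((w * x - y) * x for x, y in zip(xs, ys))
        w -= learning_rate * grad  # step downhill
    return w


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated with a true slope of 2
print(round(fit_slope(xs, ys), 4))  # 2.0
```

A real network has many weights instead of one, and backpropagation is the bookkeeping that computes the gradient for all of them; the update step itself stays the same.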
In 2024 he extended the series with chapters on transformers, attention, and large language models. Together they form one of the most watched explanations of how GPT-style models work. The 3Blue1Brown style favors intuition before formulas, which is why beginners are often pointed at it before they touch any code.
Yannic Kilcher, a Swiss computer scientist who completed his PhD at ETH Zurich, runs one of the most prolific paper-explainer channels in machine learning. Each video walks through a recent research paper section by section, with annotations drawn live on the page, and ends with Kilcher's own opinion of the contribution. His take is often skeptical, which is part of the appeal: he is willing to argue that a popular paper overclaims, or that a result will not replicate.
He is best known for breakdowns of foundational papers like "Attention Is All You Need," the GPT-3 paper, and AlphaFold, alongside a steady stream of weekly news roundups ("ML News") covering releases, controversies, and industry moves. For researchers who want a fast read of what a new paper is actually claiming, Kilcher's channel functions as a kind of public peer review.
Two Minute Papers is hosted by Karoly Zsolnai-Feher, a computer graphics researcher associated with TU Wien in Austria. The channel launched in 2016 and now has over a thousand videos. Each one is a short summary (typically four to six minutes) of a single research result, usually a flashy graphics, vision, or generative AI paper. Viewers come for a quick look at what just got published, not for an implementation walkthrough.
The catchphrase "What a time to be alive" has become a meme. The channel is one of the few that covers both generative AI and traditional graphics research, including fluid simulation and neural rendering.
Josh Starmer holds a PhD in genetics and started StatQuest in 2016 while working as a bioinformatician at the University of North Carolina. The channel breaks statistics and ML techniques into very small steps, with hand-drawn slides, his signature jingle, and a focus on intuition over notation. Topics include linear regression, PCA, random forests, gradient boosting, neural networks, and transformers.
It is widely recommended for learners who find textbook math intimidating. Starmer has also published "The StatQuest Illustrated Guide to Machine Learning" and "The StatQuest Illustrated Guide to Neural Networks and AI," both expanding on the video material.
Lex Fridman is a research scientist affiliated with MIT, where he taught the deep learning course 6.S094 (Deep Learning for Self-Driving Cars) starting in 2017. He is now better known for the Lex Fridman Podcast, a long-form interview show that has hosted nearly every major figure in AI: Sam Altman, Demis Hassabis, Ilya Sutskever, Yann LeCun, Geoffrey Hinton, Andrej Karpathy, Yoshua Bengio, Mustafa Suleyman, and others.
Fridman's interview style is patient and earnest, which fans appreciate and critics dislike. The conversations regularly run three to four hours. The back catalog has become a primary record of how AI researchers talk about their own work outside of papers.
Computerphile is the computer science companion to Numberphile, both produced by Australian-British filmmaker Brady Haran. The channel launched in 2014. Sean Riley handles most of the production, and most interviews are filmed with academics at the University of Nottingham. Computerphile covers more than AI, but its ML videos, often featuring Mike Pound and Rob Miles, are some of the most polished short-form explanations on the platform.
Rob Miles in particular has become well known for talking about AI safety and alignment in ways non-specialists can follow. His videos on the orthogonality thesis, instrumental convergence, and specification gaming are frequently cited as starting points for the safety literature.
Stephen Welch's Welch Labs channel released its first "Imaginary Numbers Are Real" video in August 2015, working through complex numbers as an extension of the real number line. That series remained a commonly recommended math resource for years before Welch shifted toward AI topics. More recently he has produced detailed visual essays on neural network training dynamics, the manifold hypothesis, and the geometry of embeddings, which sit somewhere between the slow patience of 3Blue1Brown and the technical depth of Yannic Kilcher.
Sebastian Raschka is an ML researcher and former statistics professor at the University of Wisconsin-Madison, now a research engineer at Lightning AI. His YouTube channel hosts lecture recordings from STAT 451 (Introduction to Machine Learning) and STAT 453 (Introduction to Deep Learning and Generative Models), both available end-to-end with full slides and code on GitHub.
He wrote "Build a Large Language Model (From Scratch)" (Manning, 2024), and the companion repo LLMs-from-scratch is a popular reference for engineers learning the internals of a GPT-style model in PyTorch. His newsletter Ahead of AI complements the videos with paper roundups.
A few more channels show up repeatedly in researcher recommendations. DeepLearningAI, run by Andrew Ng, hosts companion material for the Coursera Deep Learning Specialization. MIT 6.S191, taught by Alexander Amini and Ava Amini, has been refreshed annually since 2018 with full lecture videos posted within days. AI Explained, run by a creator known as Philip, leans on benchmark deep dives and capability analysis. Sam Witteveen and 1littlecoder cover practical LLM tooling, agents, and API tricks. Machine Learning Street Talk, hosted by Tim Scarfe with Yannic Kilcher and Keith Duggar, runs longer technical conversations with researchers.
| Channel | Host | Started | Format | Typical length | Best for |
|---|---|---|---|---|---|
| Andrej Karpathy | Andrej Karpathy | 2022 | Hands-on code-along lectures | 1 to 4 hours | Building GPT-style models from scratch |
| 3Blue1Brown | Grant Sanderson | 2017 | Animated math explanations | 10 to 30 min | Visual intuition for neural networks and transformers |
| Yannic Kilcher | Yannic Kilcher | 2017 | Paper walkthroughs and ML news | 30 to 90 min | Keeping current with new research papers |
| Two Minute Papers | Karoly Zsolnai-Feher | 2016 | Short paper highlights | 4 to 7 min | Quick survey of flashy results |
| StatQuest | Josh Starmer | 2016 | Hand-drawn step-by-step lessons | 10 to 30 min | Beginners learning statistics and ML basics |
| Lex Fridman | Lex Fridman | 2018 | Long-form interviews | 2 to 4 hours | Hearing AI leaders speak in depth |
| Computerphile | Brady Haran (prod.) | 2014 | Academic interviews | 8 to 20 min | Concise CS explainers and AI safety |
| Welch Labs | Stephen Welch | 2014 | Visual essays | 15 to 60 min | Geometry of learning and embeddings |
| Sebastian Raschka | Sebastian Raschka | 2020 | Full university lectures | 20 to 90 min | Structured ML and LLM coursework |
| MIT 6.S191 | Alexander Amini, Ava Amini | 2018 | University lecture course | 1 to 1.5 hours | Annual deep learning overview |
Dates reflect when each channel started posting AI or ML content regularly, which is not always the same as when the YouTube account was created.
If you want to actually implement something, Karpathy's Zero to Hero and Raschka's LLMs-from-scratch are the closest things to a structured curriculum, and they assume you will pause the video to type code. If you want to follow the field as it moves, Yannic Kilcher and Two Minute Papers cover new papers within days, with Kilcher going deep and Two Minute Papers staying surface-level. If you have never trained a model and the math is the part that loses you, 3Blue1Brown and StatQuest are the standard recommendations; both can be watched on a phone without taking notes.
Most ML engineers do not stick to one channel. A common pattern: read the abstract of a new paper on arXiv, watch Kilcher's take to decide whether it is worth the time, then dig into the paper itself or watch Karpathy or Raschka if there is an implementation worth studying.
The following table lists notable individual videos cited in earlier versions of this article. Star ratings reflect community votes from the original wiki and are kept here for historical reference rather than as a current quality ranking.
| Creator | Date | Video Name | Rating |
|---|---|---|---|
| Eliezer Yudkowsky | 2007 | Introducing the Singularity: Three Major Schools of Thought | ★★★★ |
| Eliezer Yudkowsky | 2017 | Difficulties of Artificial General Intelligence Alignment | ★★★★ |
| Eric Elliott | 2020 | What it's like to be a Computer: An Interview with GPT-3 | ★★★★ |
| Ethan Caballero | 2022 | Broken Neural Scaling Laws | ★★★ |
| Yannic Kilcher | 2017 | Attention is all you need | ★★★ |
| OpenAI | 2019 | Multi Agent Hide and Seek | ★★★ |
| WSJ | 2019 | How China is Using Artificial Intelligence in Classrooms | ★★★ |
| Yannic Kilcher | 2020 | GPT-3: Language Models are Few Shot Learners paper explained | ★★★ |
| Eliezer Yudkowsky | 2012 | Intelligence Explosion | ★★ |
| OpenAI | 2017 | Dota 2 | ★★ |
| Fei-Fei Li | 2018 | How to make AI that's good for people | ★★ |
| George Hotz | 2019 | Jailbreaking the Simulation with George Hotz | ★★ |
| Lex Fridman | 2019 | Deep Learning Basics: Introduction and Overview | ★★ |
| Sam Altman | 2020 | Sam Altman talks about AI at Big Compute 20 Tech Conference | ★★ |
| Two Minute Papers | 2020 | OpenAI GPT-3, Good at Almost Everything | ★★ |
| Jack Soslow | 2020 | Two AIs talking to each other | ★★ |
| Lex Fridman | 2020 | GPT-3 vs Human Brain | ★★ |
| Harrison Kinsley and Daniel Kukiela | 2020 | Neural Networks from Scratch, p.3 The Dot Product | ★★ |
| Lex Fridman | 2020 | Deep Learning State of the Art | ★★ |
| Bycloud | 2020 | What Happens When AI Robots Design Themselves | ★★ |
| Two Minute Papers | 2020 | MuZero: DeepMind's New AI Mastered More Than 50 Games | ★★ |
| Mira Murati | 2020 | Ensuring AGI benefits all of humanity | ★★ |
| Two Minute Papers | 2020 | Can an AI Design Our Tax Policy? | ★★ |
| Neat AI | 2021 | Lenia, Conway's game of life arrives in the 21st century | ★★ |
| Sam Altman | 2021 | The Future of AI Research from DALL-E to GPT-3 | ★★ |
| Yannic Kilcher | 2021 | How far can we scale up? Deep Learning's Diminishing Returns (Article Review) | ★★ |
| Two Minute Papers | 2021 | Watch Tesla's Self-Driving Car Learning in a Simulation | ★★ |
| Ethan Caballero | 2022 | Scale is all you need | ★★ |
| Two Minute Papers | 2022 | DeepMind AlphaFold A Gift to Humanity | ★★ |
| OpenAI | 2022 | Aligning AI Systems with Human Intent | ★★ |
| Neat AI | 2022 | AI Learns NEAT Pacman Solution | ★★ |
| Yannic Kilcher | 2022 | Grokking: Generalization beyond Overfitting on small algorithmic datasets (Paper Explained) | ★★ |
| The Third Build | 2022 | How an AI is Becoming the World's Best Pokemon Player | ★★ |
| Andrej Karpathy | 2023 | Let's build GPT: from scratch, in code, spelled out | ★★★★★ |
| Andrej Karpathy | 2024 | Let's reproduce GPT-2 (124M) | ★★★★ |
| 3Blue1Brown | 2024 | But what is a GPT? Visual intro to transformers | ★★★★ |