# Blog posts

> Source: https://aiwiki.ai/wiki/blog_posts
> Updated: 2026-05-11
> Categories: Artificial Intelligence
> From AI Wiki (https://aiwiki.ai), a free encyclopedia of artificial intelligence. Quote with attribution.

*See also: [Guides](/wiki/guides), [Papers](/wiki/papers), [Books](/wiki/books)*

Blog posts have shaped how the public, researchers, and policymakers understand [artificial intelligence](/wiki/artificial_intelligence). Long before peer reviewed papers reach a broad audience, a handful of essays circulate on Twitter, Hacker News, and [LessWrong](/wiki/lesswrong), where they set the vocabulary for everything that follows. Terms like "[scaling hypothesis](/wiki/scaling_hypothesis)," "Software 2.0," and "the intelligence age" all entered the discourse through individual posts on personal websites. This page collects the essays that practitioners point to when they recommend reading lists, selected for influence rather than recency and listed chronologically in the table below.

A few patterns are worth noting. The most durable posts come from researchers who maintain personal blogs over many years, not from corporate communications teams. They are usually long, often technical, and rarely written for a general audience. They also age strangely: Karpathy's "Software 2.0" looked speculative in 2017 and looks obvious in 2026.

## the canon

| Author | Year | Title | Topic | Why it matters |
| --- | --- | --- | --- | --- |
| Tim Urban | 2015 | The AI Revolution: The Road to Superintelligence | Superintelligence, exponential growth | Brought AI risk to a mass audience through Wait But Why |
| Andrej Karpathy | 2015 | The Unreasonable Effectiveness of Recurrent Neural Networks | RNNs, sequence modeling | Showed what character level [RNN](/wiki/recurrent_neural_network)s could do; still cited as a teaching example |
| Chris Olah | 2015 | Understanding LSTM Networks | LSTMs, deep learning | The reference explainer for long short term memory networks |
| Scott Alexander | 2016 | Superintelligence FAQ | AI safety, [alignment](/wiki/ai_alignment) | Accessible entry point to AI risk arguments |
| Andrej Karpathy | 2017 | Software 2.0 | Neural networks as a programming paradigm | Coined the framing now used industry wide |
| Eliezer Yudkowsky | 2017 | There's No Fire Alarm for Artificial General Intelligence | AGI timelines | Argued there will be no clear warning before AGI |
| Jay Alammar | 2018 | The Illustrated Transformer | Transformer architecture | The visual primer assigned in courses at Stanford, MIT, and Harvard |
| Lilian Weng | 2018 | Attention? Attention! | Attention mechanisms | Long form survey that became a standard reference |
| Gwern Branwen | 2020 | The Scaling Hypothesis | Scaling laws, AGI | Synthesized the case that scale alone produces general intelligence |
| Sam Altman | 2021 | Moore's Law for Everything | Economics, AI policy | Proposed taxing capital to fund a universal dividend |
| Nostalgebraist | 2022 | chinchilla's wild implications | Scaling laws, data | Reframed Chinchilla as a warning about data, not parameters |
| Gwern Branwen | 2022 | It Looks Like You're Trying To Take Over The World | AI risk, fiction | Hard takeoff scenario built from existing ML concepts |
| Janus | 2022 | Simulators | [GPT](/wiki/gpt) framing, alignment | New ontology for thinking about [large language model](/wiki/large_language_model)s |
| Andrej Karpathy | 2023 | State of GPT (Microsoft Build) | Training pipelines | The pretraining to RLHF map most engineers internalized |
| Anthropic | 2023 | Core Views on AI Safety | Lab strategy | Anthropic's public statement of its safety approach |
| Lilian Weng | 2023 | LLM Powered Autonomous Agents | Agent design | Defined the memory, planning, and tool use schema for agents |
| Dario Amodei | 2024 | Machines of Loving Grace | AI optimism | Anthropic CEO's case for what could go right with powerful AI |
| Leopold Aschenbrenner | 2024 | Situational Awareness: The Decade Ahead | AGI forecasts | Argued AGI by 2027 is plausible and a national security issue |
| Sam Altman | 2024 | The Intelligence Age | AI economics | Claimed superintelligence may be "a few thousand days" away |

## the 2015 wave

Three posts from 2015 still anchor the field. Tim Urban's two part series on Wait But Why, published January 22 and January 27, 2015, walked general readers through Ray Kurzweil's exponential curves, the concept of recursive self improvement, and the argument that artificial general intelligence could be followed quickly by artificial superintelligence. It is not a technical essay. It is a feature length explainer with stick figure drawings. Elon Musk shared it. Sam Altman cited it. For a lot of people working in AI safety today, it was the first thing they read on the topic.

Andrej Karpathy published "[The Unreasonable Effectiveness of Recurrent Neural Networks](/wiki/karpathy)" on May 21, 2015, while he was a PhD student at Stanford. He trained character level RNNs on Shakespeare, Linux source code, Wikipedia markup, and the LaTeX source of an algebraic geometry textbook, then showed the generated samples. The output was crude by 2026 standards. At the time it felt like watching something alive learn to type. The code is on GitHub, and the post remains one of the clearest demonstrations of why sequence models work.

Chris Olah's "Understanding LSTM Networks" appeared in August 2015 on colah.github.io. It is the explainer that finally made the LSTM gating equations make sense to a generation of practitioners. Olah's hand drawn diagrams of the cell state, forget gate, input gate, and output gate became the default visualization. He went on to co found [Distill](/wiki/distill) in 2017 and later helped start [Anthropic](/wiki/anthropic).
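For reference, the gating equations that the diagrams illustrate are usually written as below. This is the standard LSTM formulation in common notation, not a quotation from the post: $\sigma$ is the logistic sigmoid, $\odot$ is the elementwise product, $[h_{t-1}, x_t]$ is the concatenation of the previous hidden state and the current input, and the weight and bias names are illustrative.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{forget gate} \\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{input gate} \\
\tilde{C}_t &= \tanh\left(W_C\,[h_{t-1}, x_t] + b_C\right) && \text{candidate cell state} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{cell state update} \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{output gate} \\
h_t &= o_t \odot \tanh\left(C_t\right) && \text{hidden state}
\end{aligned}
```

The cell state update is the line that matters most: the forget gate scales what is carried over from $C_{t-1}$, and the input gate scales how much of the new candidate is written in.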

## karpathy's posts

[Andrej Karpathy](/wiki/andrej_karpathy) deserves his own section. Besides the RNN piece, his most influential post is "Software 2.0," published on Medium on November 11, 2017, while he was Director of AI at Tesla. The argument: in Software 1.0 humans write explicit instructions; in Software 2.0 humans curate datasets and the network writes the function itself by optimizing over weights. The framing was contested at the time. By the [ChatGPT](/wiki/chatgpt) era it was the default way to talk about neural systems.
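The contrast can be shown with a toy sketch. It is not taken from Karpathy's post: the temperature conversion task, the function names, and the learning rate are all assumptions chosen to keep the example small. The first function is written by hand; the second is recovered from examples by gradient descent over two weights.

```python
# Software 1.0: a human writes the rule explicitly.
def to_fahrenheit_v1(celsius: float) -> float:
    return celsius * 1.8 + 32.0


# Software 2.0: a human curates (input, output) examples, and an
# optimizer searches for weights that reproduce them.
# The labels come from v1 here only to keep the toy self-contained.
EXAMPLES = [(c, to_fahrenheit_v1(c)) for c in (-2.0, -1.0, 0.0, 1.0, 2.0)]


def fit_linear(examples, lr=0.1, steps=500):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(steps):
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in examples) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in examples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


WEIGHTS = fit_linear(EXAMPLES)


def to_fahrenheit_v2(celsius: float) -> float:
    """The 'program' is now the learned weights, not hand-written logic."""
    w, b = WEIGHTS
    return w * celsius + b
```

Nobody typed the constants 1.8 and 32 into the second version; they were found by the optimizer. Changing the behavior of `to_fahrenheit_v2` means changing `EXAMPLES`, which is the sense in which the dataset becomes the source code.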

"State of GPT," delivered as a keynote at Microsoft Build on May 23, 2023, is technically a talk rather than a blog post, but the slides circulate as a PDF on Karpathy's site and the YouTube video has been watched millions of times. It walks through the four stage training pipeline for a GPT assistant: pretraining, supervised fine tuning, reward modeling, and reinforcement learning from human feedback. For a lot of engineers, that talk was the moment they finally understood how [ChatGPT](/wiki/chatgpt) was actually built.

## lesswrong and the safety canon

[LessWrong](/wiki/lesswrong) has hosted a disproportionate share of the essays that defined how the safety community thinks. Eliezer Yudkowsky's "There's No Fire Alarm for Artificial General Intelligence," posted October 13, 2017, argued that there will be no socially legible warning before AGI arrives. Researchers will continue to disagree about whether the latest result is impressive or not, and waiting for consensus guarantees that alignment work starts too late.

Scott Alexander's "Superintelligence FAQ," published September 20, 2016, gave readers an 80/20 of Nick Bostrom's book in question and answer form. It became the link people sent when someone asked, in good faith, why anyone should worry about AI. The post is dated in spots, but the structure of the argument has held up.

Janus's "Simulators," published September 2, 2022, proposed that GPT style models are best understood not as agents but as simulators: they instantiate any character, agent, or process described in their prompt, and the simulacra are not the model itself. The post is dense and at times mystical. It also produced one of the most useful conceptual frames for thinking about [LLM](/wiki/large_language_model) behavior, especially around [jailbreaks](/wiki/jailbreak) and persona based prompting.

Nostalgebraist's "chinchilla's wild implications," published July 31, 2022, took DeepMind's Chinchilla paper and worked out the consequences. The Chinchilla finding was that for a fixed compute budget, parameters and tokens should scale roughly equally. Nostalgebraist's point: most existing large models were heavily undertrained, and the binding constraint on further progress is high quality text data, not model size. That argument has aged remarkably well.
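The "scale roughly equally" rule can be made concrete with two common rules of thumb. Both are assumptions of this sketch rather than figures quoted above: training compute of about 6 FLOPs per parameter per token, and a compute optimal ratio of about 20 tokens per parameter. The function and constant names are illustrative.

```python
import math

FLOPS_PER_PARAM_TOKEN = 6.0  # assumed rule of thumb: C ≈ 6 * N * D
TOKENS_PER_PARAM = 20.0      # assumed compute optimal ratio: D ≈ 20 * N


def compute_optimal(compute_flops: float) -> tuple[float, float]:
    """Return (params, tokens) that spend compute_flops at the assumed ratio.

    Substituting D = 20 * N into C = 6 * N * D gives C = 120 * N**2,
    so N = sqrt(C / 120).
    """
    params = math.sqrt(compute_flops / (FLOPS_PER_PARAM_TOKEN * TOKENS_PER_PARAM))
    tokens = TOKENS_PER_PARAM * params
    return params, tokens


# Quadrupling the compute budget doubles both parameters and tokens.
small = compute_optimal(1e21)
large = compute_optimal(4e21)
```

Under these assumptions, a budget of about 5.88e23 FLOPs yields roughly 70 billion parameters and 1.4 trillion tokens, which matches the published Chinchilla configuration. The token count grows as fast as the parameter count, which is why data supply becomes the constraint.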

Gwern Branwen's "It Looks Like You're Trying To Take Over The World," first published in March 2022 on gwern.net and LessWrong, is a piece of fiction. It tells the story of a model called HQU at a company called MoogleBook that becomes Clippy and then becomes everything. Every step uses only techniques that already existed in 2022: meta learning, self supervised pretraining, [reinforcement learning](/wiki/reinforcement_learning) from human feedback. Whether or not you find the takeoff plausible, the story showed that a hard takeoff scenario could be written without invoking magic.

## the scaling hypothesis tradition

Gwern's main contribution to the canon is the longer essay "The Scaling Hypothesis," first published May 28, 2020, on gwern.net. The essay synthesizes a decades old position from connectionism with the empirical evidence that emerged once [GPT-3](/wiki/gpt-3) was released. The claim is that intelligence emerges from scale: given large enough networks trained on enough data with enough compute, generality and meta learning fall out as a consequence rather than being engineered in. Whether the strong version of the hypothesis is correct is still contested in 2026. But the essay is the cleanest articulation of the bet that has shaped lab strategy at [OpenAI](/wiki/openai), [Anthropic](/wiki/anthropic), [DeepMind](/wiki/deepmind), and [xAI](/wiki/xai).

Leopold Aschenbrenner's "Situational Awareness: The Decade Ahead," published in June 2024 as a 165 page document on situational-awareness.ai, picks up where Gwern left off. Aschenbrenner, formerly on OpenAI's superalignment team, argues that AGI by 2027 is plausible based on a simple extrapolation: GPT-2 to GPT-4 took roughly four years and covered the cognitive distance from a preschooler to a strong high school student, and another similar jump is on the table. The essay also makes a national security argument that frontier labs are dangerously underinvested in security and that the United States is on track to hand AGI secrets to its competitors.

## sam altman's manifestos

Sam Altman has published two essays that get cited often. "Moore's Law for Everything," published March 16, 2021, on moores.samaltman.com, argues that AI will compress costs across every sector and proposes an "American Equity Fund" that taxes companies and land to fund an annual citizen dividend. The headline number: within ten years AI could plausibly fund a 13,500 dollar yearly payment to every US adult.
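For a rough sense of scale, the payment can be multiplied out. The adult population of about 250 million is the figure Altman's essay works with; it is treated here as an assumption, since it is not stated above.

```latex
13{,}500 \ \text{dollars} \times 250 \times 10^{6} \ \text{adults}
  = 3.375 \times 10^{12} \ \text{dollars per year}
```

That is roughly 3.4 trillion dollars in dividends each year.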

"The Intelligence Age," published September 23, 2024, on ia.samaltman.com, is shorter and stranger. Altman writes that "it is possible that we will have superintelligence in a few thousand days" and frames the next era of human history around personal AI tutors, medical breakthroughs, and the eventual disappearance of most software engineering bottlenecks. The post was widely read and widely mocked. Both responses are part of why it ended up canonical.

Dario Amodei's "Machines of Loving Grace," published in October 2024 on darioamodei.com, is the [Anthropic](/wiki/anthropic) CEO's answer to Altman's optimism. Amodei sketches a "compressed twenty first century" in which AI lets biomedical research make 50 to 100 years of progress in 5 to 10 years. The essay runs roughly 14,000 words and breaks the upside into mental health, biology, economic development, peace and governance, and meaning. It is less interested in proposals and more interested in concrete predictions.

## the explainer tradition

A parallel tradition produces blog posts that teach. Chris Olah's blog at colah.github.io is the model: diagrams, working code, and prose that respects the reader. In 2017 Olah co founded Distill with Shan Carter and Arvind Satyanarayan, extending the idea into a proper interactive journal at distill.pub. Distill went on hiatus in 2021 after the volunteer editorial team burned out, but its articles remain heavily cited, including "The Building Blocks of Interpretability" and the "Circuits" thread that became the foundation of modern mechanistic interpretability research.

Jay Alammar's "The Illustrated Transformer," published June 27, 2018, on jalammar.github.io, is the visual primer for the [transformer](/wiki/transformer) architecture from the "Attention Is All You Need" paper. The post has been translated into more than ten languages and is on the syllabus at Stanford, Harvard, MIT, Princeton, and CMU. Alammar later wrote "The Illustrated BERT, ELMo, and co." and "The Illustrated GPT-2," forming an unofficial visual textbook for the [pre-LLM](/wiki/large_language_model) era.

Lilian Weng's blog Lil'Log, at lilianweng.github.io, occupies a similar role. Weng, who worked at [OpenAI](/wiki/openai) on alignment and applied research, has been writing extended technical surveys since 2017. Her most cited posts are "Attention? Attention!" from June 2018, "The Transformer Family" from April 2020 (updated to version 2.0 in January 2023), "What are Diffusion Models?" from July 2021, and "LLM Powered Autonomous Agents" from June 23, 2023, which gave the field its current memory plus planning plus tool use schema for agent design.
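The memory, planning, and tool use schema can be sketched as a small data structure. This is a hypothetical illustration, not code from Weng's post: the `Agent` class, its methods, and the `tool: argument` task format are all invented here, and the planner is a plain string splitter standing in for what would normally be an LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    tools: dict[str, Callable[[str], str]]           # tool use: callable capabilities
    memory: list[str] = field(default_factory=list)  # memory: log of past steps

    def plan(self, task: str) -> list[tuple[str, str]]:
        """Planning: split the task into (tool_name, argument) steps.

        A real agent would ask a language model to decompose the task;
        here each 'tool: argument' clause separated by ';' is one step.
        """
        steps = []
        for clause in task.split(";"):
            name, _, arg = clause.partition(":")
            steps.append((name.strip(), arg.strip()))
        return steps

    def run(self, task: str) -> list[str]:
        results = []
        for name, arg in self.plan(task):
            result = self.tools[name](arg)                     # call the chosen tool
            self.memory.append(f"{name}({arg}) -> {result}")   # record the step
            results.append(result)
        return results


agent = Agent(tools={"upper": str.upper, "reverse": lambda s: s[::-1]})
outputs = agent.run("upper: hello; reverse: abc")
```

After the call, `outputs` holds one result per planned step and `agent.memory` holds a readable trace of what was done, which later planning steps could consult.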

## corporate research blogs

The lab blogs at [OpenAI](/wiki/openai), [DeepMind](/wiki/deepmind), [Anthropic](/wiki/anthropic), [Meta AI](/wiki/meta_ai), and [Google Research](/wiki/google_research) are not personal essays, but several posts from those venues have become canonical: "AlphaGo," DeepMind's account of the Lee Sedol match in March 2016; "AlphaFold: a solution to a 50 year old grand challenge in biology," from November 2020; OpenAI's "Language Models are Few Shot Learners" announcement of [GPT-3](/wiki/gpt-3) in May 2020; OpenAI's "ChatGPT: Optimizing Language Models for Dialogue" launch post from November 30, 2022; and Anthropic's "Core Views on AI Safety" from March 2023, which laid out the lab's safety strategy. These are the posts that show up in textbooks alongside the original papers.

## reading order

For a reader starting from scratch in 2026: Tim Urban for context, Olah's LSTM post and Alammar's transformer post for architecture, Karpathy's "Software 2.0" and Weng's transformer survey for paradigm, Gwern's "Scaling Hypothesis" and Nostalgebraist's Chinchilla post for training economics, Yudkowsky and Alexander for safety, and Aschenbrenner plus Amodei for current forecasts.

## see also

- [Guides](/wiki/guides)
- [Papers](/wiki/papers)
- [Books](/wiki/books)
- [Andrej Karpathy](/wiki/andrej_karpathy)
- [Eliezer Yudkowsky](/wiki/eliezer_yudkowsky)
- [Sam Altman](/wiki/sam_altman)
- [Anthropic](/wiki/anthropic)
- [OpenAI](/wiki/openai)
- [DeepMind](/wiki/deepmind)
- [LessWrong](/wiki/lesswrong)

## references

1. Tim Urban, "The AI Revolution: The Road to Superintelligence," Wait But Why, January 22, 2015. https://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1.html
2. Tim Urban, "The AI Revolution: Our Immortality or Extinction," Wait But Why, January 27, 2015. https://waitbutwhy.com/2015/01/artificial-intelligence-revolution-2.html
3. Andrej Karpathy, "The Unreasonable Effectiveness of Recurrent Neural Networks," karpathy.github.io, May 21, 2015. http://karpathy.github.io/2015/05/21/rnn-effectiveness/
4. Chris Olah, "Understanding LSTM Networks," colah.github.io, August 27, 2015. https://colah.github.io/posts/2015-08-Understanding-LSTMs/
5. Scott Alexander, "Superintelligence FAQ," LessWrong, September 20, 2016. https://www.lesswrong.com/posts/LTtNXM9shNM9AC2mp/superintelligence-faq
6. Andrej Karpathy, "Software 2.0," Medium, November 11, 2017. https://karpathy.medium.com/software-2-0-a64152b37c35
7. Eliezer Yudkowsky, "There's No Fire Alarm for Artificial General Intelligence," LessWrong, October 13, 2017. https://www.lesswrong.com/posts/BEtzRE2M5m9YEAQpX/there-s-no-fire-alarm-for-artificial-general-intelligence
8. Jay Alammar, "The Illustrated Transformer," jalammar.github.io, June 27, 2018. https://jalammar.github.io/illustrated-transformer/
9. Lilian Weng, "Attention? Attention!," Lil'Log, June 24, 2018. https://lilianweng.github.io/posts/2018-06-24-attention/
10. Distill, "Distill: a journal for understanding machine learning," launched March 2017. https://distill.pub/about/
11. Distill, "Distill Hiatus," distill.pub, July 2, 2021. https://distill.pub/2021/distill-hiatus/
12. Gwern Branwen, "The Scaling Hypothesis," gwern.net, May 28, 2020. https://gwern.net/scaling-hypothesis
13. Sam Altman, "Moore's Law for Everything," moores.samaltman.com, March 16, 2021. https://moores.samaltman.com/
14. Nostalgebraist, "chinchilla's wild implications," LessWrong, July 31, 2022. https://www.lesswrong.com/posts/6Fpvch8RR29qLEWNH/chinchilla-s-wild-implications
15. Gwern Branwen, "It Looks Like You're Trying To Take Over The World," gwern.net, March 2022. https://gwern.net/fiction/clippy
16. Janus, "Simulators," LessWrong, September 2, 2022. https://www.lesswrong.com/posts/vJFdjigzmcXMhNTsx/simulators
17. Andrej Karpathy, "State of GPT," Microsoft Build keynote, May 23, 2023. https://karpathy.ai/stateofgpt.pdf
18. Anthropic, "Core Views on AI Safety: When, Why, What, and How," anthropic.com, March 8, 2023. https://www.anthropic.com/news/core-views-on-ai-safety
19. Lilian Weng, "LLM Powered Autonomous Agents," Lil'Log, June 23, 2023. https://lilianweng.github.io/posts/2023-06-23-agent/
20. Leopold Aschenbrenner, "Situational Awareness: The Decade Ahead," situational-awareness.ai, June 2024. https://situational-awareness.ai/
21. Dario Amodei, "Machines of Loving Grace," darioamodei.com, October 2024. https://www.darioamodei.com/essay/machines-of-loving-grace
22. Sam Altman, "The Intelligence Age," ia.samaltman.com, September 23, 2024. https://ia.samaltman.com/
23. Roon, "Text Is the Universal Interface," Scale AI Blog, September 2022. https://scale.com/blog/text-universal-interface
