
Reads

A hand-curated reading list — the long-form writing on AI, AGI, and the broader tech moment that has actually shaped how the field thinks. Not a course, and not in any particular order. Pick a section, pick an entry, read for an evening.

Last sweep · 2026-04

Pioneers

The arguments that started everything. Most contemporary AI discourse is downstream of these — knowing the original framing is worth a few evenings.

  • Paper · Computing Machinery and Intelligence · Alan Turing · 1950. The paper that named the imitation game. Skip the section on ESP — read for the structure of his replies to objections, which is still the template every "can machines think" debate uses today.
  • Paper · Steps Toward Artificial Intelligence · Marvin Minsky · 1961. A field map drawn before the field existed: search, pattern recognition, learning, planning, induction. Most of the categories survived; the methods didn’t.
  • Paper · Some Philosophical Problems from the Standpoint of AI · McCarthy & Hayes · 1969. Where the frame problem and situation calculus come from. The technical machinery is dated; the taxonomy of "what does an agent need to know about the world" is not.
  • Book · Gödel, Escher, Bach: An Eternal Golden Braid · Douglas Hofstadter · 1979. Seven hundred pages on self-reference, levels of description, and what symbols mean — disguised as dialogues with a tortoise. Either you bounce off it in fifty pages or it permanently rewires how you think about cognition.

The bitter lesson + scaling canon

The empirical case that compute, data, and the right loss function are the dominant story of the last decade. Read these in order.

  • Blog · The Bitter Lesson · Rich Sutton · 2019. ~1200 words that named the pattern: every time you bake human-domain knowledge into an AI system, more compute eventually wins anyway. Read once a year.
  • Paper · Scaling Laws for Neural Language Models · Kaplan et al. (OpenAI) · 2020. The paper that turned "bigger is better" from vibes into a power-law fit. Superseded in detail by Chinchilla, but historically this is where the scaling-pilled era starts.
  • Paper · Training Compute-Optimal Large Language Models (Chinchilla) · Hoffmann et al. (DeepMind) · 2022. Corrected Kaplan: for a given compute budget you want roughly equal scaling of params and tokens, not param-heavy. Reshaped every frontier training run after 2022.
  • Paper · Emergent Abilities of Large Language Models · Wei et al. (Google) · 2022. Famously argued that some capabilities appear sharply at scale. Pair with Schaeffer et al. 2023 ("Are Emergent Abilities a Mirage?"), which shows much of it is metric choice — together they’re the cleanest version of this debate.
  • Video · Intro to Large Language Models (1-hour talk) · Andrej Karpathy · 2023. The clearest one-hour overview of how LLMs actually work — pretraining, finetuning, RLHF, the OS analogy. If you have one resource to give a smart non-specialist, this is it.
  • Paper · GPT-4 Technical Report · OpenAI · 2023. Read for the eval table on page 5 and the "predictable scaling" section. The capability claims have been overtaken by 4o and beyond; the framing of model cards as primary literature has not.
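Chinchilla's correction above can be made concrete. A minimal sketch, assuming two common rules of thumb distilled from Hoffmann et al.'s fits — training FLOPs C ≈ 6·N·D (parameters times tokens) and an optimal ratio of roughly 20 tokens per parameter — neither of which is an exact constant from the paper:

```python
import math

def chinchilla_optimal(compute_flops, tokens_per_param=20):
    """Rough compute-optimal split of a FLOP budget into params and tokens.

    Assumes C ~= 6 * N * D and D/N ~= tokens_per_param at the optimum;
    both are rule-of-thumb approximations, not exact fitted values.
    """
    # Substituting D = tokens_per_param * N into C = 6 * N * D gives
    # C = 6 * tokens_per_param * N**2, so solve for N.
    n_params = math.sqrt(compute_flops / (6 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# A hypothetical 1e23-FLOP budget lands near ~29B params, ~580B tokens —
# roughly the "equal scaling of params and tokens" regime, not param-heavy.
n, d = chinchilla_optimal(1e23)
print(f"params ~ {n:.2e}, tokens ~ {d:.2e}")
```

Plugging in Chinchilla's own reported budget (~5.8e23 FLOPs) recovers roughly its 70B-parameter, 1.4T-token configuration, which is a decent sanity check on the heuristic.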

AGI, alignment, safety

What could go wrong, and why people who have thought about it for twenty years are worried. Disagree with the conclusions if you like; the arguments are sharp.

  • Book · Superintelligence: Paths, Dangers, Strategies · Nick Bostrom · 2014. The argument is twelve years old now and some scenarios date badly, but chapters 6–8 are still the cleanest framing of instrumental convergence and the orthogonality thesis.
  • Book · Human Compatible: AI and the Problem of Control · Stuart Russell · 2019. The most measured book in the safety canon. Russell’s reframing — build systems that are *uncertain* about human preferences — has aged better than the "specify the right utility function" framing it replaced.
  • Blog · What failure looks like · Paul Christiano · 2019. The boring-dystopia framing: alignment failure doesn’t look like Skynet, it looks like proxy metrics drifting and humans losing the thread. The most-cited single post on the Alignment Forum for a reason.
  • Blog · Core Views on AI Safety · Anthropic · 2023. A frontier lab’s actual stated position, in plain prose. Read alongside OpenAI’s and DeepMind’s safety pages to triangulate where the field disagrees — usually on timelines, rarely on the shape of the problem.
  • Paper · An Overview of Catastrophic AI Risks · Hendrycks, Mazeika, Woodside · 2023. The most readable taxonomy of failure modes — malicious use, AI race, organizational risks, rogue AIs. If you only read one safety paper, this is the one with the highest information-per-page.
  • Blog · The Sequences (curated) · Eliezer Yudkowsky · 2006–. The body of writing that built the rationalist-adjacent AI-risk position. Read selectively — start with "A Human’s Guide to Words" and "Mysterious Answers to Mysterious Questions"; the full corpus is a thousand pages and you don’t need all of it.

The skeptical cases

The dissents that hold up. If you read only the optimist canon you will be blindsided when something breaks.

  • Blog · Deep Learning Is Hitting a Wall · Gary Marcus · 2022. The skeptic’s case worth taking seriously. Marcus has lost on timelines and on "scaling won’t work," but his point about brittleness on truly out-of-distribution inputs holds in 2026.
  • Paper · On the Measure of Intelligence · François Chollet · 2019. The most rigorous attempt to define what we should be measuring. ARC falls out of this paper as the natural test — and frontier models still struggle on the held-out set, which is the point.
  • Docs · ARC-AGI · ARC Prize (Chollet, Knoop) · 2024–. The benchmark that won’t die. Frontier-model scores went from ~5% (2020) to ~85% (late 2024) — but only after explicit per-task adaptation, which is the part the dashboard hides. Read the leaderboard *and* the rules.
  • Book · Artificial Intelligence: A Guide for Thinking Humans · Melanie Mitchell · 2019. The clearest non-technical history of why each previous AI boom over-promised, and what specifically is different (and what isn’t) this time. Better written than most of the optimist canon.
  • Blog · The Seven Deadly Sins of AI Predictions · Rodney Brooks · 2017. Read for sin #1 (Amara’s Law — overestimating the short term, underestimating the long term) and sin #4 (suitcase words). Useful priors before you read any AI forecast — including this list.

Field essays — what’s happening now

The frontier moves quarterly; these are the writers who track it well.

  • Blog · Lil’Log · Lilian Weng · 2017–. The best technical-survey blog in the field. Each post is the explainer you wish someone had written before you read the papers. Start with her posts on attention, RLHF, and agentic LLMs.
  • Blog · Simon Willison’s Weblog (LLMs tag) · Simon Willison · 2002–. The single best running journal of "what new LLM capability shipped this week and what it actually does." Light on theory, heavy on hands-on. The LLMs tag filters out the rest of his (still excellent) Django-and-Datasette content.
  • Blog · Ahead of AI · Sebastian Raschka · 2022–. The monthly "what mattered in research this month" digest, with code-level depth. Pairs well with Weng — Raschka is broader and more frequent, Weng is deeper and slower.
  • Blog · Hazy Research · Hazy Research (Stanford) · 2020–. Tri Dao and the FlashAttention crowd. If you want to understand why a kernel is fast, this is where the explanations are written by the people who wrote the kernel.
  • Blog · Distill (archive) · Distill · 2016–2021. Dormant since 2021, but the archive is the gold standard for visual ML explanations — "Building Blocks of Interpretability," "Feature Visualization," "A Visual Exploration of Gaussian Processes." Read these instead of the equivalent papers.
  • Blog · Dwarkesh Podcast (transcripts) · Dwarkesh Patel · 2022–. The interview show that got Karpathy, Sutskever, Dario Amodei, and Demis Hassabis to talk for three hours each. Read the transcripts — they’re primary sources for what frontier-lab leaders actually believe.

Adjacent canon

Not about AI specifically, but the engineering and research wisdom the AI canon is built on top of.

  • Book · The Mythical Man-Month · Fred Brooks · 1975. Fifty years old and still the most-cited book on why software projects miss schedules. The chapter on the second-system effect predicts most ML-platform rewrites you will witness.
  • Paper · Computer Science as Empirical Inquiry: Symbols and Search · Newell & Simon · 1976. The Turing Award lecture that argued AI is an empirical science with falsifiable hypotheses — the Physical Symbol System and Heuristic Search. The symbol-system camp lost the deep-learning era; the methodological argument they made still stands.
  • Blog · You and Your Research · Richard Hamming · 1986. A talk by a Bell Labs mathematician on what separates people who do important work from people who don’t. The "what are the important problems in your field, and why aren’t you working on them?" question is the highest-leverage thing in this whole list.