We're ecstatic to bring you "How Transformer LLMs Work" -- a free course with ~90 minutes of video, code, and crisp visuals and animations that explain the modern Transformer architecture, tokenizers, embeddings, and mixture-of-expert models. @MaartenGr and I have developed a
How GPT3 works. A visual thread. A trained language model generates text. We can optionally pass it some text as input, which influences its output. The output is generated from what the model "learned" during its training period where it scanned vast amounts of text. 1/n
The Illustrated Stable Diffusion jalammar.github.io/illustrated-st… New post! Over 30 visuals explaining how Stable Diffusion works (diffusion, latent diffusion, CLIP, and a lot more).
The Illustrated DeepSeek-R1 Spent the weekend reading the paper and sorting through the intuitions. Here's a visual guide and the main intuitions to understand the model and the process that created it. Link in the first reply. All feedback welcome.
pip install scikit-learn It's easy to take for granted, but this single command gives you functionality I'd value at hundreds of thousands of dollars, if not more. Not to mention amazing documentation that beautifully weaves guides and references. Hats off to @scikit_learn
The Illustrated Guide to AI Agents New book announcement! Thrilled that together with @MaartenGr, we're writing a new book titled “An Illustrated Guide to AI Agents” and published by @OReillyMedia. This will be our most visually-rich project yet. We will drill into the main
The wait is over! Our book, Hands-On Large Language Models, is now available! You can access the digital and Kindle versions today! The print version is in the presses as we speak and can be on its way to you in a couple of weeks. In this visual introduction to the modern LLM,
The Illustrated NeurIPS 2025: A Visual Map of the AI Frontier New blog post! NeurIPS 2025 papers are out—and it’s a lot to take in. This visualization lets you explore the entire research landscape interactively, with clusters, summaries, and @cohere LLM-generated explanations
Truly excited to see @piesauce's book go out into the world. I've had the pleasure of speaking with Suhas for hours and hours over the last couple of years and debating where LLMs are and where they would go. Certainly check it out! It's full of hard-won knowledge.
A 🧵 looking at DeepMind's Retro Transformer, which at 7.5B parameters is on par with GPT3 and models 25X its size in knowledge-intensive tasks. A big moment for Large Language Models (LLMs) for reasons I'll mention in this thread. deepmind.com/research/publi…
The Illustrated GPT-OSS New post! A visual tour of the architecture, message formatting, and reasoning of the latest GPT. Link in 🧵