1 min read
Exploring Karpathy's microgpt, Interactively

Andrej Karpathy just released microgpt, a complete GPT implementation in ~200 lines of pure Python. Training and inference, no PyTorch, no TensorFlow, just math, random, and os. I built this interactive explainer to walk through each piece.


microgpt.py

Complete GPT in ~200 lines · 27 tokens · 4,192 params


Overview

microgpt.py by Andrej Karpathy is a complete GPT — training and inference — in ~200 lines of pure Python. No PyTorch, no TensorFlow, no dependencies.

“This file is the complete algorithm. Everything else is just efficiency.”

FILE MAP

Data Loading · lines 1-21
Tokenizer · lines 23-27
Autograd Engine · lines 29-72
Parameters · lines 74-90
Model (GPT) · lines 92-144
Training Loop · lines 146-184
Inference · lines 186-200
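The autograd engine is the heart of the file: a scalar value type that records each operation as it happens, then replays the chain rule backwards through the recorded graph. As a rough sketch of that idea (my own minimal version, not the actual code at lines 29-72 of microgpt.py):

```python
import math

class Value:
    """Minimal scalar autograd node: wraps a float, remembers how it was made."""
    def __init__(self, data, children=(), grad_fn=None):
        self.data = data
        self.grad = 0.0
        self._children = children
        self._grad_fn = grad_fn  # closure that pushes self.grad to children

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def grad_fn():
            # d(out)/d(self) = 1, d(out)/d(other) = 1
            self.grad += out.grad
            other.grad += out.grad
        out._grad_fn = grad_fn
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def grad_fn():
            # d(out)/d(self) = other, d(out)/d(other) = self
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._grad_fn = grad_fn
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))
        def grad_fn():
            self.grad += (1 - t * t) * out.grad  # d(tanh x)/dx = 1 - tanh^2 x
        out._grad_fn = grad_fn
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule from the output back.
        topo, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for c in v._children:
                    build(c)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            if v._grad_fn:
                v._grad_fn()

a, b = Value(2.0), Value(-3.0)
loss = (a * b + a).tanh()
loss.backward()
# a.grad and b.grad now hold d(loss)/da and d(loss)/db
```

Every operation in the network ultimately bottoms out in a handful of primitives like these, which is why a full transformer's backward pass can fit in so few lines.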
WHY THIS MATTERS

Zero dependencies · only import math, random, os.
4,192 parameters · same architecture as GPT-4's 1.8T, just smaller.
Real training · actually learns to generate plausible new names.
Full backprop · hand-rolled autograd, every gradient from first principles.
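The 27-token figure suggests a character-level vocabulary for the names dataset, presumably the 26 lowercase letters plus one special end-of-name token. That split is my assumption (the real mapping is in lines 23-27 of the file), but the idea can be sketched in a few lines:

```python
# Hypothetical character-level tokenizer: 26 letters + 1 end-of-name
# token = 27-token vocabulary (an assumption consistent with the figures above).
chars = ['\n'] + [chr(ord('a') + i) for i in range(26)]
stoi = {ch: i for i, ch in enumerate(chars)}  # char -> token id
itos = {i: ch for ch, i in stoi.items()}      # token id -> char

def encode(name):
    return [stoi[ch] for ch in name]

def decode(ids):
    return ''.join(itos[i] for i in ids)

print(encode("emma"))          # → [5, 13, 13, 1]
print(decode(encode("emma")))  # round-trips back to "emma"
```

With a vocabulary this small, the model's job reduces to predicting the next letter of a name, which is exactly the task a 4,192-parameter GPT can learn.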