1 min read
Exploring Karpathy's microgpt, Interactively

Andrej Karpathy just released microgpt, a complete GPT implementation in ~200 lines of pure Python. Training and inference, no PyTorch, no TensorFlow, just math, random, and os. I built this interactive explainer to walk through each piece.


microgpt.py

Complete GPT in ~200 lines · 27 tokens · 4,192 params


Overview

microgpt.py by Andrej Karpathy is a complete GPT — training and inference — in ~200 lines of pure Python. No PyTorch, no TensorFlow, no dependencies.

“This file is the complete algorithm. Everything else is just efficiency.”

FILE MAP

Data Loading · lines 1-21
Tokenizer · lines 23-27
Autograd Engine · lines 29-72
Parameters · lines 74-90
Model (GPT) · lines 92-144
Training Loop · lines 146-184
Inference · lines 186-200
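The autograd engine is the heart of the file: a scalar value type that records each operation as it happens, then replays the chain rule backwards through the recorded graph. As a rough sketch of that idea (my own minimal version, not the actual code at lines 29-72 of microgpt.py):

```python
import math

class Value:
    """Minimal scalar autograd node: wraps a float, remembers how it was made."""
    def __init__(self, data, children=(), grad_fn=None):
        self.data = data
        self.grad = 0.0
        self._children = children
        self._grad_fn = grad_fn  # closure that pushes self.grad to children

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def grad_fn():
            # d(out)/d(self) = 1, d(out)/d(other) = 1
            self.grad += out.grad
            other.grad += out.grad
        out._grad_fn = grad_fn
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def grad_fn():
            # d(out)/d(self) = other, d(out)/d(other) = self
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._grad_fn = grad_fn
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))
        def grad_fn():
            self.grad += (1 - t * t) * out.grad  # d(tanh x)/dx = 1 - tanh^2 x
        out._grad_fn = grad_fn
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule from the output back.
        topo, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for c in v._children:
                    build(c)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            if v._grad_fn:
                v._grad_fn()

a, b = Value(2.0), Value(-3.0)
loss = (a * b + a).tanh()
loss.backward()
# a.grad and b.grad now hold d(loss)/da and d(loss)/db
```

Every operation in the network ultimately bottoms out in a handful of primitives like these, which is why a full transformer's backward pass can fit in so few lines.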
WHY THIS MATTERS

Zero dependencies · only import math, random, os.
4,192 parameters · same architecture as GPT-4's 1.8T, just smaller.
Real training · actually learns to generate plausible new names.
Full backprop · hand-rolled autograd, every gradient from first principles.
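The 27-token figure suggests a character-level vocabulary for the names dataset, presumably the 26 lowercase letters plus one special end-of-name token. That split is my assumption (the real mapping is in lines 23-27 of the file), but the idea can be sketched in a few lines:

```python
# Hypothetical character-level tokenizer: 26 letters + 1 end-of-name
# token = 27-token vocabulary (an assumption consistent with the figures above).
chars = ['\n'] + [chr(ord('a') + i) for i in range(26)]
stoi = {ch: i for i, ch in enumerate(chars)}  # char -> token id
itos = {i: ch for ch, i in stoi.items()}      # token id -> char

def encode(name):
    return [stoi[ch] for ch in name]

def decode(ids):
    return ''.join(itos[i] for i in ids)

print(encode("emma"))          # → [5, 13, 13, 1]
print(decode(encode("emma")))  # round-trips back to "emma"
```

With a vocabulary this small, the model's job reduces to predicting the next letter of a name, which is exactly the task a 4,192-parameter GPT can learn.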