Mechanistic Interpretability
How GPT learns layer by layer