Skip to content

Post-Training & RLHF

A pretrained model is raw clay. Post-training is what turns it into something you’d use — instruction-following, preference-aligned, and (in the o1/R1 era) actually reasoning. Every lesson here has a runnable Colab.