Lecture 3 – Optimization Foundations & Ablation Methodology

3.1: Coding in PyTorch — Regularized Linear Regression

Before we dive into deep multilayer perceptrons (which, with ReLU activations, amount to piecewise-linear regression), we’ll add two essentials while staying in the linear regression setting:

  • Regularization in neural nets: we’ll write a custom loss that explicitly takes the network parameters, adding a weight-decay penalty to the usual data-fit (MSE) term.
  • Working with data splits: we’ll hold out a validation set to monitor training and tune hyperparameters without bias.

Concretely, we’ll minimize MSE(y, ŷ) + λ‖θ‖² on the training split and use the validation split to evaluate generalization in an unbiased way.
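A minimal sketch of what this looks like in PyTorch (the synthetic data, the value of λ, and the optimizer settings below are illustrative assumptions, not the lecture's code):

```python
import torch

# --- Synthetic data (illustrative; any (x, y) regression pairs would do) ---
torch.manual_seed(0)
X = torch.randn(200, 3)
true_w = torch.tensor([[2.0], [-1.0], [0.5]])
y = X @ true_w + 0.1 * torch.randn(200, 1)

# --- Train/validation split: hold out 25% for unbiased evaluation ---
n_train = 150
X_train, y_train = X[:n_train], y[:n_train]
X_val, y_val = X[n_train:], y[n_train:]

# --- Linear model: a single affine layer, no activation ---
model = torch.nn.Linear(3, 1)

def regularized_mse(y_pred, y_true, params, lam):
    """MSE(y, y_hat) + lam * ||theta||^2, taking the parameters explicitly."""
    mse = torch.mean((y_pred - y_true) ** 2)
    l2 = sum(torch.sum(p ** 2) for p in params)
    return mse + lam * l2

lam = 1e-3  # hypothetical value; in practice tuned on the validation split
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

for epoch in range(200):
    optimizer.zero_grad()
    loss = regularized_mse(model(X_train), y_train, model.parameters(), lam)
    loss.backward()
    optimizer.step()

    if epoch % 50 == 0:
        with torch.no_grad():  # validation: plain MSE, no penalty term
            val_mse = torch.mean((model(X_val) - y_val) ** 2)
        print(f"epoch {epoch:3d}  train loss {loss.item():.4f}  val MSE {val_mse.item():.4f}")
```

Note that the validation metric is the plain MSE, without the λ‖θ‖² term: the penalty shapes training, but generalization is judged on the data fit alone.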

Watch on YouTube: https://youtu.be/V0l6b5R6Vkw

3.2: Training MLP I
3.3: Training MLP II
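As a rough sketch of the setting in these two lectures (not the lecture's exact code): an MLP replaces the single linear layer above with a stack of linear layers and ReLU nonlinearities, trained with the same loop. The layer widths here are illustrative assumptions.

```python
import torch

# A small MLP: with ReLU activations it computes a piecewise-linear function,
# which is why the intro above describes MLPs as piecewise-linear regression.
mlp = torch.nn.Sequential(
    torch.nn.Linear(3, 32),
    torch.nn.ReLU(),
    torch.nn.Linear(32, 32),
    torch.nn.ReLU(),
    torch.nn.Linear(32, 1),
)

# Weight decay can also be handed to the optimizer rather than written into
# the loss: for plain SGD, weight_decay=wd adds wd * theta to each gradient,
# which matches the explicit lam * ||theta||^2 penalty when wd = 2 * lam.
optimizer = torch.optim.SGD(mlp.parameters(), lr=0.05, weight_decay=2e-3)
```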
📚 Resources & Lecture Code

Recommended reading: Dive into Deep Learning (D2L), Sections 3.6–3.7, Chapter 12, and Chapter 19.

The Colab notebook contains the lecture code for Module 3 (optimization and training MLPs). Run the cells sequentially as demonstrated in the lecture.