Before we dive into deep multilayer perceptrons (piecewise linear regression), we’ll add two essentials while staying in the linear regression setting: regularization and a train/validation split.
Concretely, we’ll minimize MSE(y, ŷ) + λ‖θ‖² on the training split and use the validation split to evaluate generalization in an unbiased way.
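The objective above can be sketched in a few lines of NumPy. This is a minimal illustration, not the lecture's code: the synthetic data, the 80/20 split, the value of λ, and the helper names `ridge_fit` and `mse` are all assumptions made for the example. It also solves the problem in closed form, whereas the lecture may use a gradient-based optimizer, and it omits a bias term for brevity.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Minimize mean((X @ theta - y)**2) + lam * ||theta||^2 in closed form."""
    n, d = X.shape
    # Setting the gradient to zero gives (X^T X / n + lam * I) theta = X^T y / n.
    A = X.T @ X / n + lam * np.eye(d)
    b = X.T @ y / n
    return np.linalg.solve(A, b)

def mse(X, y, theta):
    """Mean squared error of the linear prediction X @ theta."""
    return float(np.mean((X @ theta - y) ** 2))

# Synthetic data (assumed for illustration): linear targets plus small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_theta = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ true_theta + 0.1 * rng.normal(size=200)

# Hold out the last 20% of the samples as the validation split.
n_train = 160
X_train, y_train = X[:n_train], y[:n_train]
X_val, y_val = X[n_train:], y[n_train:]

lam = 0.1  # regularization strength λ (hypothetical value)
theta = ridge_fit(X_train, y_train, lam)  # fit on the training split only

train_mse = mse(X_train, y_train, theta)
val_mse = mse(X_val, y_val, theta)  # never used during fitting
print(f"train MSE: {train_mse:.4f}, validation MSE: {val_mse:.4f}")
```

The penalty enters only the fit; the validation number is plain MSE, so it measures prediction error rather than the regularized objective.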
Watch on YouTube: https://youtu.be/V0l6b5R6Vkw
Recommended reading: Dive into Deep Learning — D2L: 3.6–3.7; 12; 19.
The Colab notebook contains the lecture code for Module 3 (optimization and training MLPs). Run the cells sequentially as demonstrated in the lecture.