Before we dive into deep multilayer perceptrons (piecewise linear regression), we’ll add two essentials while staying in the linear regression setting: L2 regularization and a train/validation split.
Concretely, we’ll minimize MSE(y, ŷ) + λ‖θ‖² on the training split, where λ ≥ 0 sets the strength of the penalty, and use the validation split, which plays no part in fitting θ, to evaluate generalization in an unbiased way.
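As a rough illustration of this setup, here is a minimal NumPy sketch, not the course's reference code. The synthetic data, the 80/20 split, the value of `lam`, and the helper names `mse` and `fit_ridge` are all assumptions made for the example. It uses the closed-form minimizer of MSE(y, ŷ) + λ‖θ‖², which with MSE averaged over n training rows is θ = (XᵀX + nλI)⁻¹Xᵀy, and omits a bias term for brevity.

```python
import numpy as np

def mse(y, y_hat):
    """Mean squared error between targets and predictions."""
    return float(np.mean((y - y_hat) ** 2))

def fit_ridge(X, y, lam):
    """Minimize mean((X @ theta - y)**2) + lam * ||theta||^2 in closed form.

    Setting the gradient (2/n) X^T (X theta - y) + 2 lam theta to zero
    gives the linear system (X^T X + n * lam * I) theta = X^T y.
    """
    n, d = X.shape
    A = X.T @ X + n * lam * np.eye(d)
    return np.linalg.solve(A, X.T @ y)

# Synthetic linear data (illustrative only).
rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
theta_true = rng.normal(size=d)
y = X @ theta_true + 0.1 * rng.normal(size=n)

# Shuffle once, then hold out 20% of the rows as the validation split.
perm = rng.permutation(n)
train_idx, val_idx = perm[:160], perm[160:]
X_train, y_train = X[train_idx], y[train_idx]
X_val, y_val = X[val_idx], y[val_idx]

# Fit on the training split only; the validation rows are never seen here.
lam = 0.1  # assumed regularization strength
theta = fit_ridge(X_train, y_train, lam)

train_mse = mse(y_train, X_train @ theta)
val_mse = mse(y_val, X_val @ theta)
print(f"train MSE: {train_mse:.4f}  validation MSE: {val_mse:.4f}")
```

One caveat: the validation estimate stays unbiased only while the validation split is not used to make modeling choices. Once you pick λ by comparing validation scores, that score becomes optimistic, and a separate test split is the usual remedy.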
Watch on YouTube: https://youtu.be/V0l6b5R6Vkw
Recommended reading: Dive into Deep Learning — through the end of Chapter 6.
Homework 3 explores optimization algorithms, ablation studies, and systematic hyperparameter tuning. Complete the exercises in the provided notebook.