Deep Learning
Shakeri Lab
School of Data Science • UVA
10.1 Vision Transformer (ViT)
10.2 Pretrained Transformer Models
10.3 Scaling of Decoder Transformer Models
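The decoder models in 10.3 (and the encoders in 10.1–10.2) are built on scaled dot-product attention, with a causal mask distinguishing the decoder case. A minimal NumPy sketch, not taken from the course notebooks (function and variable names are illustrative):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V, causal=False):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (T, T) similarity scores
    if causal:
        # Decoder-style mask: position t may only attend to positions <= t.
        upper = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores = np.where(upper, -np.inf, scores)
    # Numerically stable row-wise softmax.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
T, d = 4, 8  # sequence length, head dimension
Q, K, V = (rng.standard_normal((T, d)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V, causal=True)
print(out.shape)  # (4, 8)
```

With the causal mask, the first output row can only attend to position 0, so it equals `V[0]` exactly; D2L Chapter 11 (linked below) derives the same mechanism with batched multi-head shapes.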
📚 Resources, Readings & Colab

Recommended reading: Dive into Deep Learning (D2L), Chapter 11.

Optional papers (arXiv):
- Vision Transformer (ViT)
- DeiT
- BERT
- T5
- GPT-3
- DeepSeek-V2
- Kaplan et al. (Scaling Laws)
- Chinchilla: Training Compute-Optimal LLMs
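The scaling-law papers above motivate a back-of-envelope sizing rule. A hedged sketch using two common approximations from that literature, training FLOPs C ≈ 6·N·D and the Chinchilla rule of thumb D ≈ 20·N tokens per parameter (both approximations, not exact results; the function name is illustrative):

```python
def compute_optimal(flops_budget):
    """Roughly compute-optimal (params N, tokens D) for a FLOPs budget C.

    Assumes C = 6 * N * D and D = 20 * N, so N = sqrt(C / 120).
    Both relations are rules of thumb from the scaling-law papers.
    """
    n = (flops_budget / 120) ** 0.5
    d = 20 * n
    return n, d

# Roughly Chinchilla's training budget of ~5.76e23 FLOPs:
n, d = compute_optimal(5.76e23)
print(f"params = {n:.2e}, tokens = {d:.2e}")
```

Under these assumptions the budget lands near 7e10 parameters and 1.4e12 tokens, in the same ballpark as the 70B-parameter, 1.4T-token Chinchilla configuration.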
Colab notebooks:
- DeiT Playground (Colab)
- Hugging Face Transformer Tutorial (Colab)
- Homework 5: Transformer 2.0 (Colab)