Recommended reading: Dive into Deep Learning (D2L), Chapter 11 through Section 11.5.
Homework note: Submit with Module 9. Add cross-attention to your prior GRU-based seq2seq model so that the decoder attends over all encoder hidden states at each decoding step (same translation task). Report BLEU (and accuracy) and compare against your previous best; a sketch of the attention step follows below.
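As a starting point, here is a minimal sketch of one way the cross-attention step could be wired into a GRU decoder, assuming PyTorch and batch-first tensors. The names `CrossAttention` and `AttnGRUDecoder` are illustrative, not from the course starter code; scaled dot-product scoring is used here, but additive (Bahdanau) scoring would also satisfy the assignment.

```python
import torch
import torch.nn as nn

class CrossAttention(nn.Module):
    """Scaled dot-product attention: the decoder state attends over encoder states."""
    def __init__(self, hidden_size):
        super().__init__()
        self.scale = hidden_size ** -0.5

    def forward(self, dec_hidden, enc_outputs, enc_mask=None):
        # dec_hidden: (batch, hidden); enc_outputs: (batch, src_len, hidden)
        scores = torch.bmm(enc_outputs, dec_hidden.unsqueeze(2)).squeeze(2) * self.scale
        if enc_mask is not None:                       # mask out source padding positions
            scores = scores.masked_fill(~enc_mask, float("-inf"))
        weights = torch.softmax(scores, dim=1)         # (batch, src_len)
        context = torch.bmm(weights.unsqueeze(1), enc_outputs).squeeze(1)
        return context, weights                        # context: (batch, hidden)

class AttnGRUDecoder(nn.Module):
    def __init__(self, vocab_size, embed_size, hidden_size):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_size)
        self.attention = CrossAttention(hidden_size)
        # GRU input at each step = token embedding concatenated with attention context
        self.gru = nn.GRU(embed_size + hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, tgt_tokens, enc_outputs, hidden, enc_mask=None):
        # tgt_tokens: (batch, tgt_len). Decode one step at a time so the current
        # decoder state can attend over all encoder hidden states at every step.
        logits = []
        for t in range(tgt_tokens.size(1)):
            emb = self.embedding(tgt_tokens[:, t])                         # (batch, embed)
            context, _ = self.attention(hidden[-1], enc_outputs, enc_mask)  # hidden[-1]: top layer
            step_in = torch.cat([emb, context], dim=1).unsqueeze(1)        # (batch, 1, embed+hidden)
            output, hidden = self.gru(step_in, hidden)
            logits.append(self.out(output.squeeze(1)))
        return torch.stack(logits, dim=1), hidden                          # (batch, tgt_len, vocab)
```

If your existing decoder already loops over time steps for teacher forcing, only the context computation and the widened GRU input size need to change; keeping the encoder untouched makes the before/after BLEU comparison clean.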
Cross-Attention diagram preview: