12.1 Multimodal Learning

Vision-language alignment, contrastive objectives (e.g., CLIP), and fusion strategies for building multimodal systems.
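
As a rough illustration of the contrastive objective mentioned above, the sketch below computes a CLIP-style symmetric loss over a small batch of paired image and text embeddings. The function names, the temperature of 0.07, and the plain-list representation are assumptions made for this example; it is not the CLIP reference implementation, and it assumes every embedding is nonzero.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length (assumes the vector is nonzero).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def log_softmax(row):
    # Numerically stable log-softmax over one row of logits.
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def clip_style_loss(image_embs, text_embs, temperature=0.07):
    # image_embs[k] and text_embs[k] are assumed to describe the same item.
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # Cosine similarities scaled by the temperature.
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
              for i in imgs]
    # Image-to-text: each row should put its mass on the diagonal entry.
    loss_i2t = -sum(log_softmax(logits[k])[k] for k in range(n)) / n
    # Text-to-image: same idea over the columns.
    cols = [[logits[r][c] for r in range(n)] for c in range(n)]
    loss_t2i = -sum(log_softmax(cols[k])[k] for k in range(n)) / n
    return 0.5 * (loss_i2t + loss_t2i)

aligned = clip_style_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = clip_style_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

Correctly paired embeddings give a loss near zero, while swapping the captions makes the loss large, which is the signal that pulls matching pairs together during training.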

12.2 Diffusion Models

Noise schedules, forward/reverse processes, and sampling recipes that power modern generative diffusion pipelines.
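
To make the noise schedule and forward process concrete, here is a minimal sketch of a linear beta schedule and the closed-form forward sample x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps. The endpoints 1e-4 and 0.02 over 1000 steps are one commonly used setting, taken here as an assumption; the function names are hypothetical and the reverse (denoising) process is not shown.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    # Per-step noise variances, increasing linearly from beta_start to beta_end.
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    # abar_t = product over s <= t of (1 - beta_s).
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out

def q_sample(x0, t, alpha_bars, rng):
    # Draw x_t directly from x_0 without simulating the intermediate steps.
    a = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, noise)]
    return xt, noise

betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
xt, eps = q_sample([1.0, -2.0, 0.5], 500, alpha_bars, random.Random(0))
```

Because abar_t shrinks toward zero as t grows, late timesteps are almost pure Gaussian noise, which is what lets sampling start from random noise.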

12.3 Variational Autoencoders (VAE)

Latent variable modeling with encoder/decoder pairs, ELBO optimization, and practical VAE architectures.
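
The ELBO terms can be sketched in a few lines. This example assumes a diagonal Gaussian encoder output (mean `mu` and log-variance `logvar`), a standard normal prior, and a squared-error reconstruction term, which corresponds to a unit-variance Gaussian decoder up to an additive constant. The function names are hypothetical, and the encoder and decoder networks themselves are omitted.

```python
import math
import random

def gaussian_kl(mu, logvar):
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions.
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, logvar))

def reparameterize(mu, logvar, rng):
    # z = mu + sigma * eps keeps the sample differentiable w.r.t. mu and logvar
    # when the same expression is written in an autodiff framework.
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]

def negative_elbo(x, x_recon, mu, logvar):
    # Reconstruction error plus KL regularizer; minimizing this maximizes the ELBO.
    recon = 0.5 * sum((a - b) ** 2 for a, b in zip(x, x_recon))
    return recon + gaussian_kl(mu, logvar)

z = reparameterize([0.0, 1.0], [0.0, -1.0], random.Random(0))
loss = negative_elbo([1.0, 2.0], [0.9, 2.2], [0.0, 1.0], [0.0, -1.0])
```

The KL term is zero only when the encoder output matches the prior exactly, so it acts as a penalty that trades off against reconstruction quality.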

12.4 Generative Adversarial Networks (GAN)

Adversarial training, loss variants, and practical tips for stabilizing GANs for image generation.
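
As one example of the loss variants mentioned above, the sketch below writes the standard discriminator loss and the non-saturating generator loss in terms of raw discriminator logits. The function names are hypothetical, the networks and training loop are omitted, and other variants (for instance Wasserstein or hinge losses) would replace these two functions.

```python
import math

def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def discriminator_loss(real_logits, fake_logits):
    # -log sigmoid(D(x)) on real samples, -log(1 - sigmoid(D(G(z)))) on fakes.
    real = sum(softplus(-l) for l in real_logits) / len(real_logits)
    fake = sum(softplus(l) for l in fake_logits) / len(fake_logits)
    return real + fake

def generator_loss_non_saturating(fake_logits):
    # The generator maximizes log sigmoid(D(G(z))) instead of minimizing
    # log(1 - sigmoid(D(G(z)))), which gives stronger gradients early on.
    return sum(softplus(-l) for l in fake_logits) / len(fake_logits)

undecided = discriminator_loss([0.0], [0.0])    # discriminator outputs 0.5 everywhere
confident = discriminator_loss([10.0], [-10.0])  # discriminator separates real from fake
```

Working from logits rather than probabilities avoids taking the log of values that round to zero, which is one simple stabilization measure.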

End of course