24: Attention & transformers

In this lesson, we wrap up our exploration of the unconditional stable diffusion model: we implement it, train it on Fashion-MNIST, and discuss why the time embedding matters. We also dive into sine and cosine embeddings, attention mechanisms, self-attention, and multi-headed attention in the context of stable diffusion. We discuss the rearrange function, transformers, and their potential use in vision tasks. Lastly, we create a conditional model by adding a label to the input of the UNet model, allowing it to generate images of a specific class.
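To make the sine/cosine time embedding concrete, here is a minimal, dependency-free sketch. The function name `timestep_embedding`, the `max_period` default of 10000, and the "sines first, then cosines" layout are assumptions chosen for illustration; they are not necessarily what the lesson's notebook uses.

```python
import math

def timestep_embedding(t, emb_dim, max_period=10000.0):
    """Map a scalar timestep t to a list of emb_dim floats (hypothetical helper).

    The first half of the output holds sines, the second half cosines,
    each evaluated at a geometrically spaced set of frequencies.
    """
    if emb_dim % 2 != 0:
        raise ValueError("emb_dim must be even")
    half = emb_dim // 2
    # Frequencies start at 1 and decay geometrically toward 1/max_period.
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    args = [t * f for f in freqs]
    return [math.sin(a) for a in args] + [math.cos(a) for a in args]

emb = timestep_embedding(10, 8)  # 8 numbers describing timestep 10
```

Because nearby timesteps produce similar vectors and distant ones produce different vectors, a model can read the noise level from this embedding more easily than from a raw integer.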

Concepts discussed

Implementing an unconditional stable diffusion model
Time embedding and sine/cosine embeddings
Self-attention and multi-headed attention
Rearrange function
Transformers
Creating a conditional stable diffusion model
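As a rough illustration of self-attention and multi-headed attention, the sketch below uses plain Python lists instead of tensors. All names (`matmul`, `softmax`, `attention`, `multi_head_self_attention`) and the weight arguments are hypothetical, and the final output projection found in most real implementations is left out for brevity. In a tensor library, the per-head channel split done here with slices is the kind of reshaping a rearrange function would handle.

```python
import math

def matmul(a, b):
    # Plain matrix product of two lists of rows.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    # Subtract the max for numerical stability.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    # Scaled dot-product attention: softmax(q k^T / sqrt(d)) v.
    scale = 1.0 / math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = [[s * scale for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v)

def multi_head_self_attention(x, wq, wk, wv, n_heads):
    # x: n_tokens rows of d features; wq, wk, wv: d x d projection matrices.
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    d = len(q[0])
    if d % n_heads != 0:
        raise ValueError("feature size must be divisible by n_heads")
    head_dim = d // n_heads
    heads = []
    for h in range(n_heads):
        cols = slice(h * head_dim, (h + 1) * head_dim)
        # Each head attends using only its own slice of the channels.
        heads.append(attention([r[cols] for r in q],
                               [r[cols] for r in k],
                               [r[cols] for r in v]))
    # Concatenate the heads back together, token by token.
    return [sum((head[i] for head in heads), []) for i in range(len(x))]

identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
tokens = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
out = multi_head_self_attention(tokens, identity, identity, identity, n_heads=2)
```

For an image, each pixel position would play the role of a token and its channels the role of the features, so every position can draw on information from every other position.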
