10: Diving Deeper

This lesson builds a complete Diffusers pipeline from its underlying components: the VAE, U-Net, scheduler, and tokeniser. Putting them together manually gives you the flexibility to fully customise every aspect of the inference process.

We also discuss three important new papers that have been released in the last week, which improve inference performance by over 10x, and allow any photo to be “edited” by just describing what the new picture should show.

In the second half of the lesson Jeremy begins the “from scratch” implementation of Stable Diffusion. He introduces the “miniai” library, which students will create during the course, and discusses organising and simplifying code. The lesson covers the Python data model, tensors, and random number generation. Jeremy introduces the Wichmann-Hill random number generation algorithm and compares the performance of a custom generator with PyTorch’s built-in one. The lesson concludes with creating a linear classifier using a tensor.
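To make the random number generation discussion concrete, here is a minimal pure-Python sketch of the Wichmann-Hill generator, which combines three small linear congruential generators. The class name `WichmannHill` and the way the seed is split into three state values are illustrative assumptions, not necessarily the code used in the lesson.

```python
class WichmannHill:
    """Sketch of the Wichmann-Hill pseudo-random number generator."""

    def __init__(self, seed):
        # Assumed seeding scheme: split one integer into three state values,
        # adding 1 so that no component starts at zero.
        seed, x = divmod(seed, 30268)
        seed, y = divmod(seed, 30306)
        seed, z = divmod(seed, 30322)
        self.state = (x + 1, y + 1, z + 1)

    def rand(self):
        """Return a float in [0, 1) and advance the internal state."""
        x, y, z = self.state
        # Three independent linear congruential generators with prime moduli.
        x = (171 * x) % 30269
        y = (172 * y) % 30307
        z = (170 * z) % 30323
        self.state = (x, y, z)
        # Sum the three fractions and keep only the fractional part.
        return (x / 30269 + y / 30307 + z / 30323) % 1.0


rng = WichmannHill(42)
samples = [rng.rand() for _ in range(5)]
```

Because the output depends only on the stored state, two generators created with the same seed produce identical sequences. This is why random state needs care in deep learning: a process that copies the state without reseeding will repeat the same "random" numbers.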

Concepts discussed

  • Papers:
    • Progressive Distillation for Fast Sampling of Diffusion Models
    • On Distillation of Guided Diffusion Models
    • Imagic
  • Tokenizing input text
  • CLIP encoder for embeddings
  • Scheduler for noise determination
  • Organizing and simplifying code
  • Negative prompts and callbacks
  • Iterators and generators in Python
  • Custom class for matrices
  • Dunder methods
  • Python data model
  • Tensors
  • Pseudo-random number generation
    • Wichmann-Hill algorithm
    • Random state in deep learning
  • Linear classifier using a tensor
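
Several of the concepts listed above (generators, a custom class for matrices, and dunder methods from the Python data model) can be sketched together in a few lines. The names `Matrix` and `chunks` are hypothetical and chosen for illustration; the lesson's own code may differ.

```python
def chunks(values, size):
    """Generator yielding consecutive slices of `values`, each up to `size` long."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


class Matrix:
    """Hypothetical wrapper that lets a list of lists be indexed as m[row, col]."""

    def __init__(self, rows):
        self.rows = rows

    def __getitem__(self, idx):
        # Python passes m[r, c] to __getitem__ as the tuple (r, c).
        r, c = idx
        return self.rows[r][c]

    def __len__(self):
        # len(m) reports the number of rows.
        return len(self.rows)


flat = list(range(12))
m = Matrix(list(chunks(flat, 4)))  # 3 rows of 4 values each
value = m[1, 2]                    # row 1, column 2
```

Defining `__getitem__` is what makes the square-bracket syntax work on the custom class, which is the same mechanism tensor libraries use to support `t[i, j]` indexing.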

Video

Lesson resources