add README.md
metamath1 committed Jan 10, 2025
1 parent f2ab62b commit 09024b0
Showing 2 changed files with 109 additions and 0 deletions.
102 changes: 102 additions & 0 deletions README.md
@@ -0,0 +1,102 @@
# Stable Diffusion nano

## Overview
Stable Diffusion nano is a simplified implementation of latent diffusion models, inspired by the short course [How Diffusion Models Work](https://learn.deeplearning.ai/courses/diffusion-models/) by deeplearning.ai. The repository aims to make the concepts and structure of diffusion models accessible to beginners, in particular the architecture shown in Figure 3 of the paper [High-Resolution Image Synthesis with Latent Diffusion Models](https://arxiv.org/abs/2112.10752).

![Figure 3 from the LDM Paper](https://raw.githubusercontent.com/metamath1/stable-diffusion-nano/main/assets/ldm-figure3.png)

The repository implements models proposed in two key papers:
- [Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2006.11239)
- [High-Resolution Image Synthesis with Latent Diffusion Models](https://arxiv.org/abs/2112.10752)

### Why This Repository?
The official implementations of the above papers are often too complex for beginners. Stable Diffusion nano simplifies these concepts and presents them in an intuitive, beginner-friendly manner using Jupyter notebooks. Our goal is to provide a hands-on learning experience by focusing on essential components while avoiding unnecessary complexity.

For those interested in deeper theoretical insights, refer to [A Gentle Introduction to Diffusion Model: Part 1 - DDPM](https://metamath1.github.io/blog/posts/diffusion/ddpm_part1.html).

---

## Notebooks

This repository includes the following notebooks:

### 1. `01.ddpm.ipynb`
- **Description**: Implements the basic Denoising Diffusion Probabilistic Model (DDPM).
- **Goal**: Understand the fundamental process of adding and removing noise to generate images from random noise.
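
The "adding noise" half of this process has a closed form, so a noisy sample at any step can be drawn directly from the clean image. The following is a minimal, dependency-free sketch of that forward step, not the notebook's code: the function names are hypothetical, and the linear schedule (1000 steps, betas from 1e-4 to 0.02) is the setting reported in the DDPM paper, which the notebook may or may not use.

```python
import math
import random


def linear_beta_schedule(timesteps, beta_start=1e-4, beta_end=0.02):
    """Return betas rising linearly from beta_start to beta_end (assumed schedule)."""
    if timesteps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (timesteps - 1)
    return [beta_start + i * step for i in range(timesteps)]


def cumulative_alpha_bar(betas):
    """alpha_bar[t] is the running product of (1 - beta_s) for all s <= t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Draw x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, I)."""
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal * x + sigma * e for x, e in zip(x0, eps)]
    return xt, eps


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alpha_bar(betas)
rng = random.Random(0)
x0 = [0.5, -0.25, 1.0, 0.0]  # hypothetical stand-in for flattened pixel values
xt, eps = add_noise(x0, 999, alpha_bars, rng)  # almost pure noise at the last step
```

In training, a network is shown `xt` and the step index and asked to predict `eps`; generation then runs the process in reverse, starting from random noise.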

### 2. `02.vae_latent_2d.ipynb`
- **Description**: Implements an image encoder-decoder (VAE) for converting pixel-space images into latent-space representations, a crucial step for Latent Diffusion Models.
- **Goal**: Learn how to compress and reconstruct images using a Variational Autoencoder (VAE).
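
Two ingredients appear in most VAE implementations: the reparameterization trick for sampling a latent, and the KL term that pulls the latent distribution toward a standard normal. The sketch below illustrates both in plain Python under the usual diagonal-Gaussian assumption. The function names and the two-element example values are hypothetical, and the notebook's actual loss may differ.

```python
import math
import random


def reparameterize(mu, logvar, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I) and sigma = exp(0.5 * logvar)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, diag(sigma^2)) || N(0, I)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


rng = random.Random(0)
mu, logvar = [0.3, -1.2], [-0.5, 0.1]  # hypothetical encoder outputs for one image
z = reparameterize(mu, logvar, rng)    # latent that would be passed to the decoder
kl = kl_to_standard_normal(mu, logvar)
```

Sampling through `mu` and `logvar` rather than from the distribution directly keeps the path from the encoder to the loss differentiable, which is what allows the encoder to be trained by backpropagation.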

### 3. `03.ldm_nano.ipynb`
- **Description**: Simplifies the structure in Figure 3 of the LDM paper to create a basic latent diffusion model.
- **Goal**: Implement a complete but simplified Latent Diffusion Model while maintaining the essential architecture and principles.

---

## Visual Representations

| **Concatenate**                                           | **Multi-Head Attention**                                 |
|-----------------------------------------------------------|----------------------------------------------------------|
| ![CONCAT VAE](assets/LDM_CONCAT_VAE_4/ani_CONCAT_w5.gif) | ![QKV VAE](assets/LDM_QKV_VAE_4/ani_QKV_w5.gif) |
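
Assuming the two columns correspond to the two conditioning routes drawn in Figure 3 of the LDM paper (concatenation and cross-attention), the difference can be sketched in a few lines of plain Python. This is an illustration only: the function names are hypothetical, the attention shown is a single head without the learned Q/K/V projections, and a multi-head layer would run several such heads in parallel over split channels.

```python
import math


def concat_condition(features, context):
    """Concatenation: append the conditioning vector to the feature vector."""
    return features + context


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(keys[0])
    out = []
    for q in queries:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in keys])
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


# Queries stand in for latent feature positions; keys and values stand in for
# conditioning tokens (all numbers are made up for illustration).
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
attended = cross_attention(queries, keys, values)
joined = concat_condition([0.2, 0.4], [1.0, 0.0, 0.0])
```

Concatenation hands the same conditioning to every position, while attention lets each position weight the conditioning tokens differently.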

---

## Dataset
We use a custom dataset of 16x16 image sprites prepared from:
- [FrootsnVeggies](https://zrghr.itch.io/froots-and-veggies-culinary-pixels)
- [kyrise](https://kyrise.itch.io/)

This dataset was used in the course [How Diffusion Models Work](https://learn.deeplearning.ai/courses/diffusion-models/). The low resolution keeps training and inference fast, which makes it well suited to educational use.

---

## Getting Started
### Prerequisites
- Python 3.8 or higher
- JupyterLab (or Jupyter Notebook)
- PyTorch
- torchvision
- numpy
- matplotlib
- pandas
- plotly

### Installation
1. Clone the repository:
```bash
git clone https://github.com/metamath1/stable-diffusion-nano.git
cd stable-diffusion-nano
```
2. Install dependencies:
```bash
pip install -r requirements.txt
```
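
After installing, it can be useful to confirm that the environment matches the pinned versions. The helper below is a hypothetical sketch that is not part of the repository. It relies only on the standard-library `importlib.metadata` module (available from Python 3.8) and assumes every requirement line has the simple `name==version` form.

```python
from importlib import metadata


def parse_pins(lines):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {package: (pinned, installed_or_None)} for every pin that is not met."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Note: builds with a local suffix (for example "+cu121") compare as different.
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


# Inline lines are used here so the sketch runs anywhere; in the repository you
# would pass the open requirements.txt file object to parse_pins instead.
pins = parse_pins(["torch==2.4.1", "", "# a comment", "numpy==1.26.4"])
report = find_mismatches(pins)
```

An empty `report` means every pinned package is installed at exactly the pinned version.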

### Running the Notebooks
Open any notebook in Jupyter and run the cells sequentially. Start with `01.ddpm.ipynb` for the basics and progress to `03.ldm_nano.ipynb` for the complete model. You can also run the notebooks for free on Google Colab: use the badges below or upload the desired notebook, then make sure the necessary dependencies are installed.

#### Hands-On Notebooks

| Chapter | Colab |
|----------------------------------|----------------------------------------------------------------------|
| DDPM Notebook | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/metamath1/stable-diffusion-nano/blob/main/01.ddpm.ipynb) |
| VAE Latent 2D Notebook | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/metamath1/stable-diffusion-nano/blob/main/02.vae_latent_2d.ipynb) |
| LDM Nano Notebook | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/metamath1/stable-diffusion-nano/blob/main/03.ldm_nano.ipynb) |

---

## Contributions
Contributions are welcome! If you find any issues or want to add new features, feel free to open an issue or submit a pull request.

---

## License
This project is licensed under the MIT License. See the `LICENSE` file for details.

---

## Acknowledgements
- [How Diffusion Models Work](https://learn.deeplearning.ai/courses/diffusion-models/) by deeplearning.ai for the inspiration.
- Authors of [Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2006.11239) and [High-Resolution Image Synthesis with Latent Diffusion Models](https://arxiv.org/abs/2112.10752) for their groundbreaking work.
- [FrootsnVeggies](https://zrghr.itch.io/froots-and-veggies-culinary-pixels) and [kyrise](https://kyrise.itch.io/) for the dataset.

7 changes: 7 additions & 0 deletions requirements.txt
@@ -0,0 +1,7 @@
torch==2.4.1
torchvision==0.19.1
numpy==1.26.4
matplotlib==3.9.2
jupyterlab==4.2.5
pandas==2.1.4
plotly==5.24.1
