Vintage Diffusion Finetune

This repository contains the submission code for the Tsinghua Machine Learning course assignment "Finetuning a Stable Diffusion" for style generation.

Getting Started

In this homework, I finetuned Stable Diffusion on the Vintage Artwork Styles dataset. Following common practice in Stable Diffusion research, I used v1.4 as the base model, which can be downloaded from HuggingFace. Place the weights in the model folder so that the final structure is model/stable-diffusion-v1-4/remaining-model-files.

All of the code needed to run the notebooks is provided together with the report. To reproduce everything, however, some datasets and model weights need to be downloaded. Download the 4 folders from this link and place them all in the same directory as the code to ensure smooth execution. The folders contain the training split (sampled_dataset_with_resized_and_random_cropped_images/), the validation split (validation_sampled_dataset_with_resized_and_random_cropped_images/), the diffusers source code (diffusers/), and the LoRA weight checkpoints for Stable Diffusion (lora_result_vintage/).
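Before opening the notebooks, it can help to confirm that the folders described above are in place. The following is a minimal sketch, not part of the repository: the helper name missing_dirs is hypothetical, and it only assumes the folder names listed in this section.

```python
from pathlib import Path

# Folder names taken from the Getting Started section of this README.
EXPECTED_DIRS = [
    "model/stable-diffusion-v1-4",
    "sampled_dataset_with_resized_and_random_cropped_images",
    "validation_sampled_dataset_with_resized_and_random_cropped_images",
    "diffusers",
    "lora_result_vintage",
]


def missing_dirs(root="."):
    """Return the expected folders that are not present under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


# Example: run from the directory that holds the notebooks.
# An empty list means every expected folder was found.
print(missing_dirs("."))
```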

Prerequisite

Install diffusers from source with pip install git+https://github.com/huggingface/diffusers

Dataset Explanation

The Vintage Artwork Style dataset contains 60k captioned images for text-to-image generation, consisting of vintage pulp, sci-fi, and pinup artworks from the 20th century. It consists mostly of magazine covers, book covers, and old cartoons. The dataset has a short and a long caption for each image. The long captions were made with florence-2-large-ft and then shortened with llama3-8b. One important thing to note is that the image data is given as URLs rather than as PIL Images, unlike some HuggingFace datasets, so the images must be downloaded manually from their online sources. Image resolution is also not guaranteed to be 512 x 512, the resolution of the SD v1.4 training images.
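Because the images are referenced by URL, each one has to be fetched before training. The sketch below shows one way to do this with the Python standard library only. It is an illustration, not the code in downloaddataset.ipynb: the function name download_image, the timeout, and the choice to skip failed downloads are all assumptions.

```python
import urllib.request
from pathlib import Path


def download_image(url, dest, timeout=10.0):
    """Fetch one URL and write its bytes to dest.

    Returns True on success and False on any fetch failure, so that a
    dead link in the dataset does not stop the whole download loop.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            data = resp.read()
    except (OSError, ValueError):
        # URLError/HTTPError are OSError subclasses; ValueError covers
        # malformed URLs.
        return False
    if not data:
        return False
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(data)
    return True
```

Returning a boolean instead of raising makes it easy to count or log broken links while iterating over the dataset rows.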

Dataset Splitting Phase:

Because the file format and image size of the original dataset differ from what SD v1.4 expects, the notebook downloaddataset.ipynb handles dataset formatting. It splits the dataset, downloads the images from their online sources, and reformats them (random cropping and file format conversion). For a detailed explanation, please refer to the notebook's markdown cells.
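The splitting and reformatting steps can be sketched as follows. This is a hedged illustration rather than the notebook's actual code: the validation fraction, the seed, the resize-shorter-side-then-crop policy, and all function names are assumptions. The last helper expects a PIL Image and therefore assumes Pillow is installed.

```python
import random


def split_indices(n, val_fraction=0.1, seed=0):
    """Shuffle range(n) reproducibly and return (train, validation) indices."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_val = int(n * val_fraction)
    return idx[n_val:], idx[:n_val]


def resize_and_crop_box(width, height, size=512, rng=None):
    """Compute the resized dimensions and a random size x size crop box.

    The shorter side is scaled to `size`, then a square window is
    placed at a random offset along the longer side.
    """
    rng = rng or random.Random()
    scale = size / min(width, height)
    new_w = max(size, round(width * scale))
    new_h = max(size, round(height * scale))
    left = rng.randint(0, new_w - size)
    top = rng.randint(0, new_h - size)
    return (new_w, new_h), (left, top, left + size, top + size)


def resize_and_random_crop(image, size=512, rng=None):
    """Apply the computed resize and crop to a PIL Image and return RGB output."""
    new_size, box = resize_and_crop_box(image.width, image.height, size, rng)
    return image.convert("RGB").resize(new_size).crop(box)
```

Keeping the geometry in a separate function makes the crop logic easy to check without loading any images.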

SD Finetuning LoRA

The finetuning run, inference, and train/validation evaluation results are all included in diffusion_ft.ipynb. For detailed explanations and results, please refer to the notebook's markdown cells.
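For readers who only want to run inference with the finetuned weights, a minimal sketch using the diffusers pipeline API is shown below. It is not taken from diffusion_ft.ipynb. It assumes that torch and diffusers are installed, that the base model sits in model/stable-diffusion-v1-4, and that lora_result_vintage/ directly contains LoRA weights in a layout that load_lora_weights can read; if the checkpoints live in a subfolder, that subfolder's path would be needed instead. The function name load_vintage_pipeline is hypothetical.

```python
def load_vintage_pipeline(
    base_dir="model/stable-diffusion-v1-4",  # path from Getting Started
    lora_dir="lora_result_vintage",          # assumed to hold the LoRA weights
    device="cuda",
):
    """Load the SD v1.4 base model and attach the LoRA weights (sketch)."""
    # Imported lazily so this file can be read or imported without a GPU stack.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        base_dir, torch_dtype=torch.float16
    )
    pipe.load_lora_weights(lora_dir)
    return pipe.to(device)
```

A typical call would then be pipe = load_vintage_pipeline() followed by pipe("a vintage sci-fi magazine cover").images[0], where the prompt is only an example.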

bconstantine/Vintage-Diffusion-Finetune
