This repository contains the code for my course project on improving HiFi-GAN. The project introduces a modified model that is more efficient than the original work from this paper and also delivers better quality.
- Python >= 3.6
- Clone this repository.
- Install the Python requirements; please refer to requirements.txt.
- Download and extract the LJ Speech dataset.
Then move all wav files to LJSpeech-1.1/wavs
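The move step can also be scripted. The following is a minimal sketch, not part of this repository: the helper name `collect_wavs` and the way the extracted archive is laid out are assumptions, and only the target folder LJSpeech-1.1/wavs comes from the instructions above.

```python
import shutil
from pathlib import Path


def collect_wavs(extracted_root, target_dir="LJSpeech-1.1/wavs"):
    """Move every .wav file found under extracted_root into target_dir.

    Hypothetical helper: extracted_root is wherever the dataset archive
    was unpacked; target_dir defaults to the folder named in this README.
    """
    source = Path(extracted_root)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    moved = 0
    # sorted() materialises the file list before anything is moved,
    # so moving files does not disturb the directory walk.
    for wav in sorted(source.rglob("*.wav")):
        destination = target / wav.name
        if wav.resolve() == destination.resolve():
            continue  # file is already in the target folder
        shutil.move(str(wav), str(destination))
        moved += 1
    return moved
```

Calling `collect_wavs("path/to/extracted/archive")` from the repository root returns the number of files moved. Files that share a name are not deduplicated, so the sketch assumes wav file names are unique across the extracted folders.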
python train.py --config config_4m.json --checkpoint_path --num_disc <amount of discriminators> --factor <factor to divide disc's channels amount>
To train a different model version, provide the corresponding config.json from the repo.
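When several versions are trained in a row, it can help to assemble the command programmatically. This is a rough sketch: the helper `build_train_command` is hypothetical, and the checkpoint directory, number of discriminators and factor used in the example are placeholder values, not recommended settings. Only the flag names and config_4m.json are taken from the command above.

```python
def build_train_command(config, checkpoint_path, num_disc, factor):
    """Return the train.py invocation as an argument list.

    Hypothetical helper; it only arranges the flags shown in this README.
    """
    return [
        "python", "train.py",
        "--config", str(config),
        "--checkpoint_path", str(checkpoint_path),
        "--num_disc", str(num_disc),
        "--factor", str(factor),
    ]


# Placeholder values for illustration only.
cmd = build_train_command("config_4m.json", "checkpoints_4m", 2, 4)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to launch training, swapping in a different config file for each model version.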
python inference.py --checkpoint_file [generator checkpoint file path]
Generated audio files are saved in generated_files by default.
A custom save directory can be set with --output_dir.