
Fine tuning losses #89

Closed
snakch opened this issue Apr 9, 2021 · 5 comments

snakch commented Apr 9, 2021

Hello! Thank you for making this codebase open-source, it's great!

I'm having the following issue: I'm fine-tuning the ffhq model on my own dataset. Since I'm training on Colab, I have to do this piecewise, so I end up training for as long as possible and then restarting from the latest snapshot.

The problem is that when I look at the losses, they seem to start from scratch every time. I've included a screenshot of the losses for two subsequent runs. I call train.py with the following arguments (other than the snapshot and data paths):

--augpipe=bg --gamma=10 --cfg=paper256 --mirror=1 --snap=10 --metrics=none

Would you say this is normal? If so, what's the best way to get a sense of progress (other than manually inspecting outputs)? Thanks!

[Screenshot from 2021-04-09 09-25-05: loss curves for two subsequent runs]
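(For reference, one way to track progress across piecewise runs is to stitch together the stats.jsonl files that train.py writes into each results directory. A minimal sketch, assuming each line is a JSON dict where entries such as "Loss/G/loss" and "Progress/kimg" carry a "mean" field; the run directory names below are hypothetical:)

```python
# Sketch: concatenate loss curves from several resumed runs.
# Assumes stats.jsonl lines look like
# {"Loss/G/loss": {"mean": ...}, "Progress/kimg": {"mean": ...}, ...}
import json
import matplotlib.pyplot as plt

run_dirs = ['results/00000-run', 'results/00001-run']  # hypothetical paths

kimg, g_loss = [], []
offset = 0.0  # cumulative kimg, since each resumed run restarts the counter at 0
for run_dir in run_dirs:
    last = 0.0
    with open(f'{run_dir}/stats.jsonl') as f:
        for line in f:
            rec = json.loads(line)
            last = rec['Progress/kimg']['mean']
            kimg.append(offset + last)
            g_loss.append(rec['Loss/G/loss']['mean'])
    offset += last  # shift the next run so the curve is continuous

plt.plot(kimg, g_loss)
plt.xlabel('cumulative kimg')
plt.ylabel('Loss/G/loss (mean per tick)')
plt.show()
```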


ink1 commented Apr 9, 2021

I don't think it is normal (I'm in the same boat). What I noticed is that if I improve FID from, say, 100 to 50, it jumps back to 90 or so upon resume. To be clear, FID will be 50 immediately on resume, but it then veers off to 90 over several ticks and only then starts to gradually subside. So I strongly suspect this behaviour is due to the training schedule, Adam, or both. In stylegan2-ada, you could start with kimg 10000 or whatever, thus entering a fine-tuning regime. If anyone knows how to tweak the training schedule, please share! Thanks
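For reference, the resume path in the PyTorch training_loop.py only seems to copy the network weights from the pickle; roughly like this (paraphrased from memory, so details may differ):

```python
# Paraphrase of the resume logic in stylegan2-ada-pytorch training/training_loop.py
# (from memory; check the repo for the exact code). Only the module weights are
# copied -- cur_nimg, the ADA strength augment_pipe.p, and the Adam optimizer
# states all start from scratch, which would explain the jump after resuming.
import dnnlib
import legacy
from torch_utils import misc

def load_resume_weights(resume_pkl, G, D, G_ema):
    with dnnlib.util.open_url(resume_pkl) as f:
        resume_data = legacy.load_network_pkl(f)
    for name, module in [('G', G), ('D', D), ('G_ema', G_ema)]:
        misc.copy_params_and_buffers(resume_data[name], module, require_all=False)
    return resume_data
```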


ink1 commented Apr 10, 2021

Sorry, I meant to say in StyleGAN2 rather than StyleGAN2-ADA:
https://github.com/NVlabs/stylegan2/blob/master/training/training_loop.py#L132
which then goes into training_schedule:
https://github.com/NVlabs/stylegan2/blob/master/training/training_loop.py#L47
But StyleGAN2-ADA, including the PyTorch version, has no such training schedule.
#3 is trying to address some resume issues, but I am not convinced it addresses the main one: immediate divergence on resume.
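Roughly how that works in the TF code linked above (paraphrased from memory; the sched field names are assumptions, see the links for the real thing):

```python
# Paraphrase (from memory) of the TF StyleGAN2 training loop linked above.
# The image counter is seeded from resume_kimg, so training_schedule() resumes
# the late-stage schedule instead of restarting at zero.
def run_training_schedule(resume_kimg, total_kimg, training_set, sched_args, training_schedule):
    cur_nimg = int(resume_kimg * 1000)
    while cur_nimg < total_kimg * 1000:
        sched = training_schedule(cur_nimg=cur_nimg, training_set=training_set, **sched_args)
        # ... one training iteration using sched (learning rates, minibatch size, ...) ...
        cur_nimg += sched.minibatch_size  # assumed field name
```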


snakch commented Apr 10, 2021

OK, nice find! I actually think the above has a good chance of fixing the issue.

I suspect that the immediate divergence you're seeing is due to the augment strength being reset to 0 when resuming. The PR above seems to fix that. I'm going to give it a go and see where it takes me.
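For what it's worth, the snapshot pickles also store the augmentation pipeline, so a fix along these lines could carry the ADA strength over on resume. A rough sketch of the idea (not the actual code in #3; load_resume_weights above returns the unpickled resume_data):

```python
import torch

# Sketch: restore the ADA strength from the snapshot so it doesn't restart at 0
# and ramp up again. Assumes the pickle stores 'augment_pipe' and that
# augment_pipe.p is the torch buffer holding the strength.
def restore_augment_strength(augment_pipe, resume_data):
    resumed_pipe = resume_data.get('augment_pipe', None)
    if augment_pipe is not None and resumed_pipe is not None:
        with torch.no_grad():
            augment_pipe.p.copy_(resumed_pipe.p)
```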


ink1 commented Apr 10, 2021

I doubt that changing the augmentation strength is going to help in this case, but it is easy to test.
Perhaps EMA rampup could be more useful.
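For context, the G_ema update in the PyTorch training loop looks roughly like this (paraphrased from memory; the defaults here are placeholders). With ema_rampup enabled, G_ema tracks G much more tightly while cur_nimg is small, which is exactly the situation right after a resume since the counter restarts at 0:

```python
import torch

# Paraphrase (from memory) of the G_ema update in stylegan2-ada-pytorch
# training/training_loop.py; check the repo for the exact code.
def update_g_ema(G, G_ema, cur_nimg, batch_size, ema_kimg=20, ema_rampup=None):
    ema_nimg = ema_kimg * 1000
    if ema_rampup is not None:
        ema_nimg = min(ema_nimg, cur_nimg * ema_rampup)  # short EMA horizon early on
    ema_beta = 0.5 ** (batch_size / max(ema_nimg, 1e-8))
    with torch.no_grad():
        for p_ema, p in zip(G_ema.parameters(), G.parameters()):
            p_ema.copy_(p.lerp(p_ema, ema_beta))
        for b_ema, b in zip(G_ema.buffers(), G.buffers()):
            b_ema.copy_(b)
```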


snakch commented Apr 15, 2021

Just as a heads-up: playing around with #3, I seem to be getting much better results, whether it's because of the augmentation strength or one of the other changes.

snakch closed this as completed Apr 15, 2021