I am using 4x Nvidia V100 GPUs and cannot fit a batch size larger than 32 with this paper's hyperparameters when training the prior on the top codes. I have also changed the loss to a discretized mixture of logistics, similar to the original PixelCNN++ and PixelSnail implementations. The authors report a batch size of 1024, which seems unrealistic to reach on this hardware. Does this implementation of PixelSnail use more layers than the one reported in the VQ-VAE-2 paper?
I am unable to map this implementation onto the one described in the appendix of VQ-VAE-2 in order to configure it to replicate their results. Any help is appreciated.
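For reference, here is a minimal single-channel sketch of a discretized mixture-of-logistics negative log-likelihood in the style of PixelCNN++ (not the exact code I used; it assumes targets rescaled to [-1, 1] and a network head that emits 3 * nr_mix channels per position, and it uses a simple clamp instead of the mid-bin fallback from the original implementation):

```python
import torch
import torch.nn.functional as F

def discretized_mix_logistic_loss(x, params, num_classes=256):
    """Single-channel discretized mixture of logistics NLL.

    x:      targets rescaled to [-1, 1], shape (B, 1, H, W)
    params: network output, shape (B, 3 * nr_mix, H, W), split into
            mixture logits, means, and log scales
    """
    nr_mix = params.shape[1] // 3
    logit_probs = params[:, :nr_mix]                    # (B, nr_mix, H, W)
    means = params[:, nr_mix:2 * nr_mix]
    log_scales = torch.clamp(params[:, 2 * nr_mix:], min=-7.0)

    x = x.expand_as(means)                              # broadcast target over mixture components
    centered = x - means
    inv_std = torch.exp(-log_scales)
    half_bin = 1.0 / (num_classes - 1)

    plus_in = inv_std * (centered + half_bin)
    min_in = inv_std * (centered - half_bin)
    cdf_plus = torch.sigmoid(plus_in)
    cdf_min = torch.sigmoid(min_in)

    log_cdf_plus = plus_in - F.softplus(plus_in)        # log CDF, used at the x = -1 edge
    log_one_minus_cdf_min = -F.softplus(min_in)         # log(1 - CDF), used at the x = 1 edge
    cdf_delta = cdf_plus - cdf_min                      # probability mass of the bin

    log_probs = torch.where(
        x < -0.999,
        log_cdf_plus,
        torch.where(
            x > 0.999,
            log_one_minus_cdf_min,
            torch.log(torch.clamp(cdf_delta, min=1e-12)),
        ),
    )

    # mix over components, then average the negative log-likelihood
    log_probs = log_probs + F.log_softmax(logit_probs, dim=1)
    return -torch.logsumexp(log_probs, dim=1).mean()
```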
Yes, I would have initially thought so. I can only imagine training such a large model on a TPU. Do you have any insight into how it could have been done?
Maybe they used TPUs or a large number of GPUs. In any case, replicating the training from the paper will be very hard (practically impossible) with only a few GPUs.
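One partial workaround with only a few GPUs is gradient accumulation, which at least matches the paper's effective batch size of 1024, though not its wall-clock speed. A rough sketch, assuming a standard PyTorch training loop; `model`, `loader`, and `optimizer` are hypothetical stand-ins for the top prior, its DataLoader, and its optimizer, and the same pattern works with whatever loss you use:

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-ins: `model` is the top prior, `loader` yields batches of
# top codes (LongTensor of code indices), `optimizer` is its optimizer.
accum_steps = 32  # 32 micro-batches of size 32 -> effective batch size of 1024

optimizer.zero_grad()
for step, top_codes in enumerate(loader):
    top_codes = top_codes.cuda()
    logits = model(top_codes)                  # assumes per-position logits over the codebook
    loss = F.cross_entropy(logits, top_codes)  # categorical NLL over code indices
    (loss / accum_steps).backward()            # scale so gradients average over the effective batch

    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```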