[Feature request] DiVAE #151

jordiae · 2023-01-11T18:34:27Z

DiVAE [1] uses a VQ encoder and a diffusion decoder. Unfortunately, there's no public implementation. It would also be nice to combine that with diffusion Transformers [2].

Anyway, many thanks for all your work!

[1] https://arxiv.org/abs/2206.00386
[2] https://arxiv.org/abs/2212.09748

lucidrains · 2023-01-11T18:49:05Z

@jordiae i think SOTA for diffusion transformers would be Muse

i'll take a look at DiVAE this weekend, thanks!

jordiae · 2023-01-11T18:51:10Z

> @jordiae i think SOTA for diffusion transformers would be Muse
>
> i'll take a look at DiVAE this weekend, thanks!

The main difference is that in DiVAE the decoder of the image "tokenizer" is a diffusion model. Thanks!

Edit: This should be better than VQGAN (see Table 1 in https://arxiv.org/pdf/2206.00386.pdf)

lucidrains · 2023-01-11T21:32:29Z

> > @jordiae i think SOTA for diffusion transformers would be Muse
> >
> > i'll take a look at DiVAE this weekend, thanks!
>
> The main difference is that in DiVAE the decoder of the image "tokenizer" is a diffusion model. Thanks!
>
> Edit: This should be better than VQGAN (see Table 1 in https://arxiv.org/pdf/2206.00386.pdf)

oh my, it is like a frankenstein haha
