Are there any other DIRE image preprocessing methods? #12

Open · RichardSunnyMeng opened this issue Sep 5, 2023 · 17 comments

Comments
@RichardSunnyMeng

I ran ./DIRE/guided-diffusion/compute_dire.py using 256x256_diffusion_uncond.pt on my dataset and got the following images (source, recon, and DIRE):

(images: source, reconstruction, and DIRE)

All my other results are similar. As you can see, the DIRE is not as significant as DF. I would like to know whether there are any other preprocessing methods for DIRE images.
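For context, here is a minimal sketch of what compute_dire.py computes, per the DIRE paper: DIRE is the absolute error between an image and its diffusion reconstruction. The `invert` and `reconstruct` callables below are hypothetical stand-ins for the DDIM inversion and sampling loops in guided-diffusion, not the repo's actual function names:

```python
import torch

def compute_dire(x: torch.Tensor, invert, reconstruct) -> torch.Tensor:
    """DIRE(x) = |x - R(I(x))|.

    x: image batch in [-1, 1], shape (B, C, H, W).
    invert / reconstruct: stand-ins for the deterministic DDIM inversion
    and sampling loops (hypothetical names, for illustration only).
    """
    x_T = invert(x)             # DDIM inversion: image -> latent noise
    x_recon = reconstruct(x_T)  # DDIM sampling: latent noise -> image
    return (x - x_recon).abs()  # per-pixel reconstruction error = the DIRE map
```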

@eecoder-dyf

Are you sure these images are correct?
My results are closer to the author's, though.

(images: source, reconstruction, and DIRE for 00019)

@RichardSunnyMeng
Author

> Are you sure these images are correct? My results are closer to the author's, though. (images)

Do you also use 256x256_diffusion_uncond.pt?

@eecoder-dyf

> > Are you sure these images are correct? My results are closer to the author's, though. (images)
>
> Do you also use 256x256_diffusion_uncond.pt?

Yes.

@RichardSunnyMeng
Author

> > Are you sure these images are correct? My results are closer to the author's, though. (images)
> >
> > Do you also use 256x256_diffusion_uncond.pt?
>
> Yes.

But the guided-diffusion model card says: "These models sometimes produce highly unrealistic outputs, particularly when generating images containing human faces. This may stem from ImageNet's emphasis on non-human objects."

@eecoder-dyf

I also tried compressing the original images in ImageNet, and I get these results:

(two images of ILSVRC2012_val_00045880)

@Rapisurazurite

I ran ./DIRE/guided-diffusion/compute_dire.py using lsun_bedroom.pt on image 0022fa605ffee31d4147bcbec6d42066d7bce8b7.jpg from the lsun_bedroom test dataset:

(images: source, reconstruction, and DIRE)

And the author provides these images:

(the author's reconstruction and DIRE)

@RichardSunnyMeng
Author

I found that with real_step != 0 and use_ddim=True I can get my DIRE images. Under the original config I can also get images similar to yours, but it is too slow...

@RichardSunnyMeng
Author

> I found that with real_step != 0 and use_ddim=True I can get my DIRE images. Under the original config I can also get images similar to yours, but it is too slow...

As Table 4 shows, real_step = 0 is not necessary; 20 or 50 steps are enough.

(screenshot of Table 4)
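A back-of-envelope calculation of why DDIM respacing helps so much, assuming one UNet call per step for each of inversion and reconstruction (in guided-diffusion, 20 DDIM steps are selected via timestep_respacing="ddim20"):

```python
# Rough cost comparison: inversion + reconstruction, one UNet call per step.
full_schedule = 2 * 1000  # diffusion_steps=1000 each way -> 2000 UNet calls
ddim_20 = 2 * 20          # timestep_respacing="ddim20"  ->   40 UNet calls
print(f"speedup ~{full_schedule // ddim_20}x")  # ~50x, hence "too slow" without DDIM
```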

@jS5t3r

jS5t3r commented Oct 26, 2023

> I ran ./DIRE/guided-diffusion/compute_dire.py using 256x256_diffusion_uncond.pt on my dataset and got the following images (source, recon, and DIRE): (images)
>
> All my other results are similar. As you can see, the DIRE is not as significant as DF. I would like to know whether there are any other preprocessing methods for DIRE images.

Your setup is not clear. Did you use a .png as the input image? Your output seems to be perfect.

@ciodar

ciodar commented Nov 15, 2023

> As Table 4 shows, real_step = 0 is not necessary; 20 or 50 steps are enough. (screenshot of Table 4)

The table is varying the number of DDIM steps, not real_step. real_step truncates the diffusion process at a certain timestep while keeping the original noise schedule. If you are not using DDIM and set a low real_step, you are basically not performing inversion: the image is not properly inverted into the model's latent space, since you are adding very little predicted noise (especially if T = 1000) and then removing it. The result may be perfect or almost perfect, since the network just has to remove a small amount of noise from the image.
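To make that last point concrete, here is a toy computation, assuming guided-diffusion's default linear beta schedule (1e-4 to 0.02 over T=1000 steps): at a small real_step, the signal coefficient sqrt(alpha_bar_t) is still nearly 1, so x_t is almost the original image.

```python
import math

def alpha_bar(t: int, T: int = 1000, beta_start: float = 1e-4, beta_end: float = 0.02) -> float:
    """Cumulative product of (1 - beta_s) for s < t under a linear schedule."""
    prod = 1.0
    for s in range(t):
        beta = beta_start + (beta_end - beta_start) * s / (T - 1)
        prod *= 1.0 - beta
    return prod

for t in (20, 50, 1000):
    a = alpha_bar(t)
    print(t, round(math.sqrt(a), 3), round(math.sqrt(1 - a), 3))
# At t=20 the signal coefficient sqrt(alpha_bar) is ~0.997 and the noise
# coefficient is only ~0.08, so x_t is nearly x_0 and the denoiser
# reconstructs it almost perfectly whether or not the image was ever in
# the model's training distribution.
```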

@RichardSunnyMeng
Author

> > As Table 4 shows, real_step = 0 is not necessary; 20 or 50 steps are enough. (screenshot of Table 4)
>
> The table is varying the number of DDIM steps, not real_step. real_step truncates the diffusion process at a certain timestep while keeping the original noise schedule. If you are not using DDIM and set a low real_step, you are basically not performing inversion: the image is not properly inverted into the model's latent space, since you are adding very little predicted noise (especially if T = 1000) and then removing it. The result may be perfect or almost perfect, since the network just has to remove a small amount of noise from the image.

Oh, got it! Very helpful, thank you!

@Foglia-m

Foglia-m commented Jan 8, 2024

Hi there!
For those of you who ran this on low-VRAM GPUs (I'm using a 2080 Ti): do you have any knowledge to share about the minimum parameters needed to get usable reconstructed images? I see that the default number of samples is 1000, but it takes me forever to create the recons with that setting. With a lower value (50, for instance), the reconstruction is a full black image, and the classifier somehow outputs "synthetic" for the recons of real images (probability of being synthetic = 1), which I guess is a bug.
Are there any parameters I should tweak?
Thank you!

@happy-Moer

Have you ever encountered this problem: ModuleNotFoundError: No module named 'mpi4py'?
How can I solve this?
Thanks in advance.

@mariannakubsik

> Have you ever encountered this problem: ModuleNotFoundError: No module named 'mpi4py'? How can I solve this? Thanks in advance.

In your conda environment, you need to install the module with: conda install -c anaconda mpi4py
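After installing, a quick way to confirm that the module imports and MPI initializes, using mpi4py's standard API:

```python
# Minimal sanity check that mpi4py is importable and MPI works.
from mpi4py import MPI

comm = MPI.COMM_WORLD
print(f"rank {comm.Get_rank()} of {comm.Get_size()}")
```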

@jungkoak

jungkoak commented Jul 17, 2024

Hello, may I ask whether the SAMPLE_FLAGS and MODEL_FLAGS in your compute_dire.sh are modified to match the 256x256 (unconditional) model provided in the readme:
MODEL_FLAGS="--attention_resolutions 32,16,8 --class_cond False --diffusion_steps 1000 --image_size 256 --learn_sigma True --noise_schedule linear --num_channels 256 --num_head_channels 64 --num_res_blocks 2 --resblock_updown True --use_fp16 True --use_scale_shift_norm True"
or kept the same as in compute_dire.sh:
MODEL_FLAGS="--attention_resolutions 32,16,8 --class_cond False --diffusion_steps 1000 --dropout 0.1 --image_size 256 --learn_sigma True --noise_schedule linear --num_channels 256 --num_head_channels 64 --num_res_blocks 2 --resblock_updown True --use_fp16 True --use_scale_shift_norm True"?
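For what it's worth, a quick diff of the two strings above shows they differ only in --dropout 0.1; since dropout layers are inactive in eval mode, this should not matter for inference with a pretrained checkpoint. A sketch of the comparison in plain Python:

```python
readme = """--attention_resolutions 32,16,8 --class_cond False --diffusion_steps 1000
--image_size 256 --learn_sigma True --noise_schedule linear --num_channels 256
--num_head_channels 64 --num_res_blocks 2 --resblock_updown True --use_fp16 True
--use_scale_shift_norm True"""
script = """--attention_resolutions 32,16,8 --class_cond False --diffusion_steps 1000
--dropout 0.1 --image_size 256 --learn_sigma True --noise_schedule linear
--num_channels 256 --num_head_channels 64 --num_res_blocks 2 --resblock_updown True
--use_fp16 True --use_scale_shift_norm True"""
print(set(script.split()) - set(readme.split()))  # {'--dropout', '0.1'}
```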

@XCQ001204

> Are you sure these images are correct? My results are closer to the author's, though. (images)
>
> Do you also use 256x256_diffusion_uncond.pt?

May I ask if you are using distributed training or training on a single GPU? Will there be an error indicating mismatched model parameters when using 256x256_diffusion_uncond.pt? Mine reports such an error. Thanks very much!

@Silent-glass-of-leaves

> > Are you sure these images are correct? My results are closer to the author's, though. (images)
> >
> > Do you also use 256x256_diffusion_uncond.pt?
>
> May I ask if you are using distributed training or training on a single GPU? Will there be an error indicating mismatched model parameters when using 256x256_diffusion_uncond.pt? Mine reports such an error. Thanks very much!

You can substitute the settings with the original ones, like this:

(screenshots of the modified settings)
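For reference, a sketch of how the "mismatched model parameters" error typically arises and is fixed, assuming guided-diffusion's script_util API: the MODEL_FLAGS used to build the UNet must match the architecture baked into the checkpoint, or load_state_dict will report size mismatches.

```python
import torch
from guided_diffusion.script_util import (
    create_model_and_diffusion,
    model_and_diffusion_defaults,
)

# Start from the library defaults, then apply the flags for the
# 256x256 unconditional ImageNet checkpoint (as listed in the readme).
opts = model_and_diffusion_defaults()
opts.update(
    attention_resolutions="32,16,8", class_cond=False, diffusion_steps=1000,
    image_size=256, learn_sigma=True, noise_schedule="linear",
    num_channels=256, num_head_channels=64, num_res_blocks=2,
    resblock_updown=True, use_fp16=True, use_scale_shift_norm=True,
)
model, diffusion = create_model_and_diffusion(**opts)

state = torch.load("256x256_diffusion_uncond.pt", map_location="cpu")
model.load_state_dict(state)  # raises a size-mismatch error if the flags are wrong
```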
