
Sharing answer from the question "How many iterations do I need?" #121

Open
SeungyounShin opened this issue Nov 9, 2022 · 4 comments

@SeungyounShin

[Training Code]

from denoising_diffusion_pytorch import Unet, GaussianDiffusion, Trainer

model = Unet(
    dim = 64,
    dim_mults = (1, 2, 4, 8)
).cuda()

diffusion = GaussianDiffusion(
    model,
    image_size = 128,
    timesteps = 1000,           # number of diffusion steps
    sampling_timesteps = 250,   # number of sampling timesteps
    loss_type = 'l1'            # L1 or L2
).cuda()

trainer = Trainer(
    diffusion,
    '/mnt/prj/seungyoun/dataset/flowers',
    train_batch_size = 128,
    train_lr = 1e-4,
    train_num_steps = 700000,         # total training steps
    gradient_accumulate_every = 2,    # gradient accumulation steps
    ema_decay = 0.995,                # exponential moving average decay
    amp = False                       # mixed precision disabled (see note below)
)

trainer.train()

Note: in my runs, setting amp = True hindered training, so mixed precision is left off.

[Training progress]
(image attached in the original issue)

@SeungyounShin SeungyounShin changed the title Sharing of question "How many iterations do I need?" Sharing answer from the question "How many iterations do I need?" Nov 9, 2022
@pengzhangzhi (Contributor)

Thank you so much for the information. Have you checked the quantitative metrics during training? I am wondering whether this is possible: the quantitative results converge while the sampling quality actually keeps improving. If so, we cannot deduce the required number of training steps from those quantitative results alone. Just run the training loop for as long as you can.

@dannalily

Hello, amazing! I just want to check: does "iteration" here refer to "train_num_steps" or "timesteps"?

@pengzhangzhi (Contributor)

I think it is “train_num_steps”.

@lwtgithublwt

> I think it is “train_num_steps”.

Hello, I would like to know what train_num_steps means and how it relates to epochs. Thank you for your answer.
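To sketch the relationship (my own reasoning, not something stated by the library): train_num_steps counts optimizer updates, and each update consumes train_batch_size × gradient_accumulate_every images, so the equivalent number of epochs depends on the dataset size. The helper name steps_to_epochs and the dataset size of 8189 images (the Oxford 102 Flowers count, assumed here for illustration) are my assumptions:

```python
def steps_to_epochs(train_num_steps, batch_size, grad_accum, dataset_size):
    """Convert optimizer steps to equivalent passes over the dataset.

    Each training step consumes batch_size * grad_accum images,
    so epochs = total images consumed / dataset size.
    """
    images_seen = train_num_steps * batch_size * grad_accum
    return images_seen / dataset_size

# With the settings from this issue and a hypothetical dataset
# size of 8189 images:
print(steps_to_epochs(700_000, 128, 2, 8189))  # roughly 21883 epochs
```

So one "iteration"/step is much smaller than one epoch for small datasets; with a large batch size and gradient accumulation, 700k steps can correspond to tens of thousands of passes over a dataset of this size.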
