get same loss on different GPU device #14194
Asked by dichencd in DDP / multi-GPU / multi-node · Answered by rohitgr7
Hi, I tried to set up my model training with 4 GPUs using Lightning's DDPPlugin. While debugging, I found that the data loaded on each device is different, but a variable I printed out during the loss calculation appears to have the same value on all GPUs (identical to four decimal places). Is this normal? If so, what is the reason? If not, what might be wrong with my network or dataloader? I really appreciate your comments and help!
Answered by rohitgr7 on Aug 14, 2022
You can manually check the targets and outputs used to compute the loss.
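One minimal way to act on this suggestion is to print a per-rank summary of the tensors just before the loss is computed, then compare what each of the four GPUs prints. The helper below is an illustrative sketch, not part of Lightning's API; the `training_step` usage shown in the comment follows the standard LightningModule hook, and the model and loss names there are placeholders. If the printed means differ across ranks but the logged loss does not, check whether you are reducing the loss across ranks (for example, logging it with `sync_dist=True`) before inspecting it.

```python
def describe_batch(rank, outputs, targets):
    """Return a short per-rank summary of the tensors fed into the loss.

    Works on anything iterable of scalars (1-D tensors, lists), so it
    stays framework-agnostic for illustration purposes.
    """
    o_mean = sum(float(v) for v in outputs) / len(outputs)
    t_mean = sum(float(v) for v in targets) / len(targets)
    return (f"[rank {rank}] targets mean={t_mean:.6f} "
            f"outputs mean={o_mean:.6f}")

# Inside a LightningModule you might call it like this (sketch):
#
#     def training_step(self, batch, batch_idx):
#         x, y = batch
#         out = self(x)
#         loss = self.loss_fn(out, y)          # placeholder loss
#         print(describe_batch(self.global_rank, out, y))
#         return loss
```

With DDP, each rank runs `training_step` on its own shard of the data, so with a correctly sharded dataloader you should normally see different means on different ranks.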
Answer selected by dichencd