validation_epoch_end with DDP #5808
Replies: 2 comments
Hey @adarsh-kr, are you using the PyTorch Lightning Metric API or implementing your own metric? The Metric API should take care of syncing across GPUs. We use this function to sync values: https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/utilities/distributed.py#L121 Would you mind creating a simple script with the BoringModel to reproduce your use case, so we can assist you better? Best regards,
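For a metric that is not built on the Metric API, one option is to gather each process's epoch outputs by hand before computing the metric. The snippet below is only a minimal sketch of that idea, not the Lightning utility linked above. `gather_all_ranks` is a hypothetical helper name, and it assumes the tensor has at least one dimension and the same shape on every rank, which `torch.distributed.all_gather` requires.

```python
import torch
import torch.distributed as dist


def gather_all_ranks(local: torch.Tensor) -> torch.Tensor:
    """Concatenate `local` from every process along dim 0.

    Hypothetical helper (not part of Lightning). Assumes `local` has at
    least one dimension and the same shape on every rank.
    """
    # In a single-process run there is nothing to gather.
    if not (dist.is_available() and dist.is_initialized()):
        return local

    local = local.contiguous()
    world_size = dist.get_world_size()
    # One receive buffer per rank; all_gather fills them in rank order.
    buffers = [torch.empty_like(local) for _ in range(world_size)]
    dist.all_gather(buffers, local)
    return torch.cat(buffers, dim=0)
```

Inside `validation_epoch_end`, this could be called on the concatenated per-step outputs, for example `gather_all_ranks(torch.cat([o["preds"] for o in outputs]))`, where the `"preds"` key is an assumed name for whatever `validation_step` returns. Note that `DistributedSampler` pads the dataset with repeated samples when its length is not divisible by the number of processes, so the gathered tensor may contain a few duplicates.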
Is the outputs in If this property is guaranteed in |
What is your question?
I am trying to implement a metric which needs access to the whole dataset. So instead of updating the metric in the `*_step()` methods, I am trying to collect the outputs in the `*_epoch_end()` methods. However, the outputs contain only the output of the partition of the data each device gets. Basically, if there are `n` devices, then each device is getting `1/n` of the total outputs.
Stackoverflow Post
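The `1/n` split described here is consistent with how `torch.utils.data.DistributedSampler` assigns samples when shuffling is off: the index list is padded to a multiple of the number of processes, and rank `r` takes every `n`-th index starting at `r`. The following pure-Python sketch imitates that behaviour for illustration only; `shard_indices` is a hypothetical name, and it assumes the dataset has at least as many samples as there are processes.

```python
def shard_indices(dataset_len: int, world_size: int, rank: int) -> list:
    """Indices one rank would see, mimicking DistributedSampler without shuffling.

    Hypothetical illustration; assumes dataset_len >= world_size.
    """
    per_rank = -(-dataset_len // world_size)  # ceiling division
    total = per_rank * world_size
    indices = list(range(dataset_len))
    # Pad with samples from the start so every rank gets `per_rank` items.
    indices += indices[: total - dataset_len]
    # Rank r takes positions r, r + world_size, r + 2 * world_size, ...
    return indices[rank:total:world_size]


# With 10 samples on 4 processes, each rank sees only 3 of them.
for rank in range(4):
    print(rank, shard_indices(10, 4, rank))
```

Each process therefore only ever evaluates its own shard, so the list passed to `*_epoch_end()` on that process covers roughly `1/n` of the data unless the outputs are gathered across processes first.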
What's your environment?