How to sync LightningModule.log values so that the EarlyStopping and ModelCheckpoint callbacks work correctly? #14806
Unanswered
peiliu0408
asked this question in
DDP / multi-GPU / multi-node
Replies: 0 comments
Setup: pytorch_lightning 1.7.6, 2 GPUs. The validation metric (val_map) is computed in validation_epoch_end on the CPU.
`def validation_epoch_end(self, results):`
Error:
ModelCheckpoint(monitor='val_ap') could not find the monitored key in the returned metrics: ['step', 'epoch'].
How can I make the model log only on rank 0 after each validation epoch and still stay compatible with the other callbacks?
Thanks.
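One possible approach, offered as a sketch rather than a confirmed fix: compute the metric on rank 0 only, broadcast the result to every rank, and then call `log` on all ranks so that the callbacks running on each process can find the key. The helper name `log_rank_zero_metric` and the `compute_fn` argument are hypothetical. The sketch assumes a Lightning version where `trainer.strategy.broadcast` and `trainer.is_global_zero` are available (1.6 or later). Note also that the key passed to `monitor` has to match the logged name exactly; the post mentions both `val_map` and `val_ap`.

```python
from typing import Any, Callable, Optional


def log_rank_zero_metric(module: Any, name: str, compute_fn: Callable[[], float]) -> float:
    """Compute a metric on global rank 0 and log the same value on every rank.

    `module` is assumed to be a LightningModule attached to a Trainer.
    `compute_fn` is a hypothetical zero-argument callable returning the metric.
    """
    trainer = module.trainer

    # Only rank 0 runs the (possibly expensive, CPU-side) metric computation.
    value: Optional[float] = float(compute_fn()) if trainer.is_global_zero else None

    # Every rank must reach this collective call, otherwise the processes hang.
    value = trainer.strategy.broadcast(value, src=0)

    # Log on all ranks so callbacks on each process see the monitored key.
    module.log(name, value)
    return value


# Hypothetical usage inside the LightningModule from the question:
#
#     def validation_epoch_end(self, results):
#         log_rank_zero_metric(self, "val_map", lambda: compute_map(results))
#
# `compute_map` stands in for whatever produces the metric; it is not a real API.
```

With this pattern the callbacks would be configured with the same key, for example `ModelCheckpoint(monitor="val_map")`. Keep in mind that under DDP `results` on rank 0 only contains the outputs of the batches that rank 0 processed, so the outputs may need to be gathered first if the metric should cover the whole validation set.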