WandB Logger in Single Machine + Multi-GPU DDP setting #17225
I'm switching from MLFlowLogger to WandbLogger, and one problem I've run into is that when using more than one GPU, logging with the WandbLogger breaks after the first GPU process is initialized. Reading the WandB documentation, it looks like the WandB Run object is not available on any rank > 0. So have the logging commands from the PL docs been tested in a multi-GPU setting? Because I keep getting errors before Trainer.fit(...) is called.

This is happening with PyTorch Lightning 2.0.0 and PyTorch 1.13.1 on 2 V100 GPUs.
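For reference, here is a minimal sketch of the kind of setup I mean (the model, project name, and metric names are just placeholders). My understanding from the WandbLogger docs is that self.log() is safe to call on every rank, while any direct use of logger.experiment (the underlying wandb Run) has to be restricted to global rank 0, e.g. with the rank_zero_only decorator:

```python
import torch
import torch.nn.functional as F
import pytorch_lightning as pl
from pytorch_lightning.loggers import WandbLogger
from pytorch_lightning.utilities import rank_zero_only


class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = F.cross_entropy(self.layer(x), y)
        # self.log goes through Lightning's logging machinery and is safe
        # to call on every DDP rank.
        self.log("train_loss", loss, sync_dist=True)
        return loss

    @rank_zero_only
    def log_extra(self):
        # Direct access to the wandb Run only works on global rank 0; on
        # other ranks logger.experiment is a dummy placeholder, so this
        # method is guarded with @rank_zero_only.
        self.logger.experiment.log({"custom_value": 1.0})

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)


wandb_logger = WandbLogger(project="my-project")  # placeholder project name
trainer = pl.Trainer(accelerator="gpu", devices=2, strategy="ddp", logger=wandb_logger)
# trainer.fit(LitModel(), train_dataloaders=...)  # fails before this point
```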
Replies: 1 comment

@carmocca Can you help with this please?