Memory use differs between GPUs when using DDP #11692
Unanswered
surya-narayanan asked this question in DDP / multi-GPU / multi-node
Replies: 0 comments
[Screenshots of per-GPU memory allocated and memory in use did not load.]
I have noticed that there is often a discrepancy between the memory allocated and the memory actually in use during multi-GPU training (allocated shown in the first screenshot, usage in the second). Why is that? When the gradients are synced across GPUs in step four of this checklist (https://pytorch-lightning.readthedocs.io/en/stable/advanced/multi_gpu.html#distributed-data-parallel), does that step use one GPU more than the others?
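For reference, here is a rough sketch of how I compare the two numbers on each rank (the `log_cuda_memory` helper below is just something I put together for illustration, not a Lightning utility): `torch.cuda.memory_allocated` reports what live tensors occupy, while `torch.cuda.memory_reserved` is what the caching allocator has claimed from the driver, which is closer to what nvidia-smi shows.

```python
import torch
import torch.distributed as dist


def log_cuda_memory(tag: str = "") -> None:
    """Print allocated vs. reserved CUDA memory for the current rank.

    `memory_allocated` is the memory occupied by live tensors;
    `memory_reserved` is what the caching allocator has claimed from
    the driver, which is closer to the number nvidia-smi reports.
    """
    rank = dist.get_rank() if dist.is_initialized() else 0
    device = torch.cuda.current_device()
    allocated_mib = torch.cuda.memory_allocated(device) / 2**20
    reserved_mib = torch.cuda.memory_reserved(device) / 2**20
    print(f"[rank {rank}] {tag} "
          f"allocated={allocated_mib:.0f} MiB reserved={reserved_mib:.0f} MiB")
```

Calling this from each rank (e.g. in a callback's `on_train_batch_end`) makes it easier to see whether rank 0 really holds more memory than the others after the gradient sync, or whether the gap is just cached allocator memory.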
Thanks for your time.