gpus=1, but the code still runs on 2 GPUs. #16289
Unanswered
WindSmileValley
asked this question in
DDP / multi-GPU / multi-node
Replies: 0 comments
Hi, I have run into a strange problem.
Case 1:
If I create the trainer with
sei_trainer = Trainer(
    gpus=2,
    strategy='dp')
and, inside a LightningModule class, wrap the model with
self.model = torch.nn.DataParallel(self.model, device_ids=[0, 1])
training fails with this error:
RuntimeError: Expected tensor for argument #1 'input' to have the same device as tensor for argument #2 'weight'; but device 1 does not equal 0 (while checking arguments for cudnn_convolution)
But in case 2:
If I create the trainer with
sei_trainer = Trainer(
    gpus=1,
    strategy='dp')
and keep the same wrapper inside the LightningModule class
self.model = torch.nn.DataParallel(self.model, device_ids=[0, 1])
the code runs on 2 GPUs.
I cannot understand why this happens or how to fix it.
Thank you all.
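One possible explanation, offered as an assumption and not a confirmed diagnosis: with strategy='dp' the Trainer is understood to wrap the whole LightningModule in torch.nn.DataParallel itself, so the manual wrapper inside the module creates a second, nested DataParallel. With gpus=2 the outer wrapper places a replica on device 1 while the inner wrapper still scatters inputs across [0, 1], which could produce the device mismatch above. With gpus=1 the outer wrapper has a single device, so only the inner wrapper spreads work over 2 GPUs. The sketch below uses plain PyTorch only; Net and count_data_parallel are hypothetical names, and the outer nn.DataParallel call merely imitates what the Trainer is assumed to do.

```python
import torch
from torch import nn


def count_data_parallel(module: nn.Module) -> int:
    """Count nn.DataParallel wrappers anywhere in the module tree."""
    return sum(isinstance(m, nn.DataParallel) for m in module.modules())


class Net(nn.Module):
    """Hypothetical stand-in for the LightningModule in the question."""

    def __init__(self, wrap_manually: bool):
        super().__init__()
        model = nn.Conv2d(3, 8, kernel_size=3)
        # The question wraps the inner model by hand; here that is optional.
        self.model = nn.DataParallel(model) if wrap_manually else model

    def forward(self, x):
        return self.model(x)


# Assumption: strategy='dp' adds an outer DataParallel around the module.
# It is imitated by hand here so the nesting can be inspected.
nested = nn.DataParallel(Net(wrap_manually=True))
single = nn.DataParallel(Net(wrap_manually=False))

print("wrappers with manual wrap:", count_data_parallel(nested))
print("wrappers without manual wrap:", count_data_parallel(single))
```

If this assumption holds, removing the manual torch.nn.DataParallel line and keeping Trainer(gpus=2, strategy='dp') would leave a single wrapper, which may be enough to avoid the error.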