RuntimeError: CUDA error: device-side assert triggered #9466
Unanswered
rentainhe asked this question in DDP / multi-GPU / multi-node
I've tried some other settings
Hello! I'm having a problem using data-parallel multi-GPU training with PyTorch Lightning.
Here's my trainer code:
I have 8 GPUs:
CUDA_VISIBLE_DEVICES: [0,1,2,3,4,5,6,7]
I just want to train my model on specific GPUs (1, 2, 3, 4):
- with gpus='1,2,3,4', it turns out: RuntimeError: CUDA error: device-side assert triggered
- with gpus='0,1,2,3', there's no problem
- with gpus='1,2', there's no problem

I'm very confused, I need some help, thanks a lot!
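A common cause of this kind of device-side assert is that CUDA renumbers whatever GPUs are visible as logical indices starting from 0, so code that addresses physical GPU ids directly can end up touching a device that doesn't exist in the current process. A minimal sketch of that remapping (the helper name `visible_device_map` is hypothetical, just for illustration):

```python
def visible_device_map(cuda_visible_devices: str) -> dict:
    """Map the logical device index a framework sees (cuda:0, cuda:1, ...)
    to the physical GPU id listed in CUDA_VISIBLE_DEVICES."""
    physical_ids = [int(x) for x in cuda_visible_devices.split(",") if x.strip()]
    return {logical: physical for logical, physical in enumerate(physical_ids)}

# With CUDA_VISIBLE_DEVICES=1,2,3,4 the process's cuda:0 is physical GPU 1,
# cuda:1 is physical GPU 2, and so on; there is no cuda:4 in that process.
print(visible_device_map("1,2,3,4"))  # {0: 1, 1: 2, 2: 3, 3: 4}
```

A frequently suggested workaround (assuming a training script `train.py`, not the author's actual code) is to restrict visibility via the environment variable and then ask Lightning for a GPU count rather than explicit ids, e.g. `CUDA_VISIBLE_DEVICES=1,2,3,4 python train.py` with `gpus=4`, so every index the framework uses is guaranteed to exist.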