RuntimeError: Expected to mark a variable ready only once - with .backward() in validation_step #13195
Unanswered
kampelmuehler asked this question in DDP / multi-GPU / multi-node
Replies: 0 comments
Hi all,
I'm getting
RuntimeError: Expected to mark a variable ready only once
when running a model on more than one GPU (this is my first time using Lightning with DDP). In my validation_step I need to run the model several times in a loop and call .backward() on each pass to get gradients, and this is what causes the error. The code looks something like this:
for this purpose I also set
It's exactly at
pred.sum().backward()
where the error is triggered. I've tried
find_unused_parameters=False
and a static graph is impossible, so the options the error message suggests are exhausted. As mentioned earlier, it works fine on a single GPU. Any hints on what causes this behavior and how it can be fixed?
I would also be fine with running validation on a single GPU, but I found that to be impossible within Lightning.
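For reference, here is a minimal sketch of the pattern described above, written in plain PyTorch on a single device. It is not the code from the post. `TinyNet`, `validation_like_step` and `n_passes` are hypothetical names. It also assumes that the gradients of interest are taken with respect to the inputs and that autograd is re-enabled with `torch.enable_grad()`; the post states neither.

```python
import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the real model."""

    def __init__(self) -> None:
        super().__init__()
        self.layer = nn.Linear(4, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.layer(x)


def validation_like_step(model: nn.Module, batch: torch.Tensor, n_passes: int = 3) -> list:
    """Run several forward/backward passes and collect the input gradients."""
    grads = []
    # Assumption: gradients are re-enabled explicitly, because evaluation
    # loops normally run with autograd switched off.
    with torch.enable_grad():
        for _ in range(n_passes):
            x = batch.clone().requires_grad_(True)  # fresh leaf tensor per pass
            pred = model(x)
            pred.sum().backward()  # the call reported to raise under DDP
            grads.append(x.grad.detach().clone())
    # Discard the parameter gradients accumulated as a side effect.
    model.zero_grad(set_to_none=True)
    return grads


torch.manual_seed(0)
model = TinyNet()
batch = torch.randn(8, 4)
with torch.no_grad():  # mimic an evaluation loop
    grads = validation_like_step(model, batch)
```

On one device this runs without error, which matches the single-GPU behavior reported above. Each `.backward()` call also accumulates gradients into the model parameters, and the sketch clears them at the end of the step.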