all_gather produces garbage result #17068
Unanswered · gfx73 asked this question in DDP / multi-GPU / multi-node
Replies: 1 comment
- A negative value in a torch tensor may be due to integer overflow; you can use a LongTensor (int64) instead of the default IntTensor (int32).
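The reply's diagnosis, int32 wraparound, can be illustrated without any GPU: a value just past the int32 maximum reinterprets as a large negative number, which is exactly the kind of garbage described in the title. A minimal sketch (`wrap_int32` is a helper name of my own, not a torch API):

```python
import struct

def wrap_int32(x: int) -> int:
    """Reinterpret an unbounded Python int as a 32-bit signed integer,
    the way an IntTensor would store it."""
    return struct.unpack("<i", struct.pack("<I", x & 0xFFFFFFFF))[0]

# A value that fits in int32 survives; one past the maximum wraps negative.
print(wrap_int32(2**31 - 1))  # 2147483647
print(wrap_int32(2**31))      # -2147483648

# In torch, the equivalent fix is to build the tensor as int64 up front:
#   labels = torch.tensor(raw_labels, dtype=torch.long)  # LongTensor
# instead of letting it default to / be cast to torch.int (IntTensor).
```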
- Can you please help me with the all_gather function? It somehow produces negative values.
Here is the code I have in the on_train_epoch_end function:
I explicitly check that self.query_labels doesn't contain negative values, but the prints are as follows:
What are the possible reasons for such behavior?