Dear author, I recently tried to run your code on my server, but I get 'NAN or INF found in tensor' as soon as training starts.
I used torch==1.4.0, cudatoolkit==10.1, and torchvision==0.5.0; everything else matches requirements.txt.
I changed GPUS: '0,1' to GPUS: '0' and NUM_DATA: 1000 to NUM_DATA: 500.
I trained the model for 30 epochs, but it still shows 'NAN or INF found in tensor'.
In other people's issues, I saw that you seem to have already solved this problem. Is the current code the new version that fixes it?
Thanks for your interest in our work. We've modified the code, and you can pull the latest release. The NaN problem sometimes occurs when no valid proposal is detected in HDN, so the loss ends up being computed between None tensors. We have added a conditional expression to solve this problem.
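For anyone hitting the same error, here is a minimal, framework-free sketch of the kind of guard described above. It is not the repository's actual code: the names `proposal_loss` and `total_loss` are hypothetical, and plain Python floats stand in for tensors. The idea is simply to skip the proposal term when there are no valid proposals, and to check that the final loss is finite before using it.

```python
import math
from typing import Optional, Sequence


def proposal_loss(pred: Sequence[float], target: Sequence[float]) -> Optional[float]:
    """Mean absolute error over proposals, or None when there are no proposals."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if len(pred) == 0:
        # No valid proposal: skip this term instead of averaging over nothing.
        return None
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def total_loss(base_loss: float, pred: Sequence[float], target: Sequence[float]) -> float:
    """Add the proposal term only when it exists, then verify the result is finite."""
    extra = proposal_loss(pred, target)
    loss = base_loss if extra is None else base_loss + extra
    if not math.isfinite(loss):
        # Fail loudly at the source rather than letting NaN/Inf propagate.
        raise FloatingPointError("NaN or Inf found in loss")
    return loss
```

In PyTorch the same pattern would presumably check whether the proposal tensor is empty (for example via `numel()`) before reducing it, since the mean of an empty tensor is NaN.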