NAN or INF problem #29

Open
AbyssHedgehog opened this issue May 21, 2023 · 1 comment

@AbyssHedgehog

Dear author, I recently tried to run your code on my server, but I get 'NAN or INF found in tensor' as soon as training starts.
I used torch==1.4.0, cudatoolkit==10.1, and torchvision==0.5.0; everything else matches requirements.txt.
Because I only have one GPU, I changed GPUS: '0,1' to GPUS: '0', and I also reduced NUM_DATA: 1000 to NUM_DATA: 500.
I trained the model for 30 epochs, but it still shows 'NAN or INF found in tensor'.
In other people's issues I saw that you seem to have already solved this problem. Is the current code the new version that contains the fix?

[Attached screenshot: QQ浏览器截图20230521112153]

@AlvinYH
Owner

AlvinYH commented Jul 23, 2023

Thanks for your interest in our work. We've modified the code, and you can pull the latest release. The NaN problem sometimes occurs when no valid proposal is detected in HDN, so the loss ends up being computed between None tensors. We have added a conditional expression to handle this case.
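For readers who hit the same message, the kind of guard described above might look roughly like the sketch below. This is not the repository's actual fix: `proposal_loss`, `pred_boxes`, and `target_boxes` are hypothetical names, and smooth L1 is only a stand-in for whatever loss HDN really uses. The idea is simply to return a zero loss when there is nothing to supervise.

```python
import torch
import torch.nn.functional as F


def proposal_loss(pred_boxes, target_boxes):
    """Smooth-L1 loss over matched proposals, guarded against the empty case.

    Both arguments are hypothetical (N, 4) tensors; either may be None
    when the detector produced no valid proposal for the batch.
    """
    no_proposals = (
        pred_boxes is None
        or target_boxes is None
        or pred_boxes.numel() == 0
    )
    if no_proposals:
        if pred_boxes is not None:
            # Multiplying by zero keeps the result attached to the autograd
            # graph, so a later backward() call still works.
            return pred_boxes.sum() * 0.0
        # No tensor at all: fall back to a plain scalar zero.
        return torch.zeros(())
    return F.smooth_l1_loss(pred_boxes, target_boxes)


# Usage: an empty batch of proposals yields a finite zero loss.
empty_pred = torch.zeros((0, 4), requires_grad=True)
empty_target = torch.zeros((0, 4))
loss = proposal_loss(empty_pred, empty_target)
loss.backward()
print(loss.item())
```

Returning zero means a step with no proposals contributes no gradient from this term. Skipping the batch entirely would be another option, depending on how the training loop is structured.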
