How to train on GPU? #13

Open
CarlSchwedes opened this issue Aug 1, 2019 · 3 comments

Comments

@CarlSchwedes

CarlSchwedes commented Aug 1, 2019

Training on the GPU is not working for me when I set the -gpu 0,1,2 command-line option:

./scripts/train.sh -gpu 0 -image_set train -log_dir ./log/
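
For reference, here is a minimal sketch of how a GPU index is commonly made visible to TensorFlow. It assumes the -gpu value ends up in the CUDA_VISIBLE_DEVICES environment variable, which I have not confirmed for train.sh; select_gpus is a hypothetical helper name. Setting CUDA_DEVICE_ORDER to PCI_BUS_ID makes the CUDA indices follow the same order as nvidia-smi, which matters on a machine with two different cards.

```python
import os


def select_gpus(gpu_ids):
    """Expose only the given GPU indices to CUDA (hypothetical helper).

    Must run before TensorFlow initializes CUDA, i.e. before the first
    session is created or the first device query is made.
    """
    # Number devices in PCI bus order so the indices match nvidia-smi.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return os.environ["CUDA_VISIBLE_DEVICES"]


# Assumption: index 0 is the card intended for training.
print("CUDA_VISIBLE_DEVICES=%s" % select_gpus([0]))
```

Without CUDA_DEVICE_ORDER, CUDA orders devices by its own heuristic (fastest first by default), so index 0 is not guaranteed to be the card that nvidia-smi lists as GPU 0.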

I'm running SqueezeSegV2 in a conda virtual environment with tensorflow-gpu version 1.4.1:

$ pip list | grep tensorflow
tensorflow-estimator 1.14.0
tensorflow-gpu 1.4.1
tensorflow-tensorboard 0.4.0
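
The listing above pairs tensorflow-gpu 1.4.1 with tensorflow-estimator 1.14.0, which suggests that packages from different releases are mixed in this environment. As far as I know, the prebuilt tensorflow-gpu 1.4.x binaries were built against CUDA 8.0 and cuDNN 6, so it is worth checking whether TensorFlow registers a GPU device at all. The sketch below does that with device_lib.list_local_devices(); visible_gpus is a hypothetical helper name, and the import is guarded so the script also runs where TensorFlow is missing.

```python
def visible_gpus():
    """Return the GPU device names TensorFlow registers.

    Returns None if TensorFlow cannot be imported, and an empty list if
    it imports but finds no usable GPU.
    """
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [d.name for d in device_lib.list_local_devices()
            if d.device_type == "GPU"]


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not importable in this environment")
elif not gpus:
    print("TensorFlow imported, but no GPU device is registered")
else:
    print("TensorFlow GPU devices: %s" % ", ".join(gpus))
```

An empty list would point at the installation (CUDA/cuDNN libraries or the TensorFlow build) rather than at the training script.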

When I invoke the training script, the GPU remains mostly unused:

$ nvidia-smi
Thu Aug 1 15:50:58 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.26       Driver Version: 430.26       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 2070    Off  | 00000000:15:00.0 Off |                  N/A |
| 29%   30C    P8    14W / 175W |    107MiB /  7982MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Quadro P1000        Off  | 00000000:21:00.0  On |                  N/A |
| 34%   40C    P8    N/A /  N/A |    507MiB /  4030MiB |      4%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     26958      C   python                                        93MiB |
|    1      1599      G   /usr/bin/gnome-shell                          92MiB |
|    1      2379      G   /usr/bin/gnome-shell                         392MiB |
+-----------------------------------------------------------------------------+

I expected GPU training to work out of the box. What am I missing?

@MoonWolf9067

Your GPU is out of memory!!!

@CarlSchwedes
Author

Could you please be more specific?
107MiB / 7982MiB doesn't look like out of memory to me.
I'm also not getting any CUDA out-of-memory errors since I reduced the batch size.

Any help would be highly appreciated!
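
To put numbers on the memory figures quoted above, here is a small sketch that pulls the "used / total" pairs out of nvidia-smi text and prints the fraction in use. The two sample rows are copied from the output earlier in this thread; memory_usage is a hypothetical helper name, and the regular expression assumes the MiB layout shown there.

```python
import re

# Matches the "107MiB / 7982MiB" column of an nvidia-smi GPU row.
MEM_RE = re.compile(r"(\d+)MiB\s*/\s*(\d+)MiB")


def memory_usage(smi_text):
    """Return one (used_mib, total_mib) pair per GPU row found in the text."""
    return [(int(used), int(total)) for used, total in MEM_RE.findall(smi_text)]


sample = """\
| 29% 30C P8 14W / 175W | 107MiB / 7982MiB | 0% Default |
| 34% 40C P8 N/A / N/A | 507MiB / 4030MiB | 4% Default |
"""

for idx, (used, total) in enumerate(memory_usage(sample)):
    print("GPU %d: %d / %d MiB (%.1f%% in use)"
          % (idx, used, total, 100.0 * used / total))
```

A card that sits at a few percent of memory and 0% utilization while training runs is more consistent with the work running on the CPU than with the card being full.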

@Sirius114515

> Could you please be more specific?
> 107MiB / 7982MiB doesn't look like out of memory to me.
> I'm also not getting any CUDA out-of-memory errors since I reduced the batch size.
>
> Any help would be highly appreciated!

Did you finally solve the problem?
I also ran into an out-of-memory error.
Could you give me some suggestions about it?
