Finetuning res101 model pretrained models error #557
Comments
This was fixed a while ago, I believe. I've successfully retrained res101 now. |
Can you include your command-line argument and the full stack-trace? Also, what version of Python, PyTorch, CUDA, are you using? |
pytorch 0.4 Stack trace: Preparing training data... |
Could you also include the command-line argument? For example: |
I'm basically resuming training from load_name (I just changed lines 275-276 in trainval_net.py). That is the only part I changed; everything else is the same as the main branch. My default network is res101 and I'm training on coco2014. |
Could you try to add the following tag to your command: Additionally, you could try using all the command-line parameters: |
Have you solved this problem, @jsuit? |
Hello @AlexanderHustinx, I have the same problem when using the res101 pretrained model with trainval_net.py. Using the same model with test_net.py and demo.py, there is no problem. I'm using the pytorch-1.0 branch. |
Could you show the stack trace and error? |
Thanks. @AlexanderHustinx Using Called with args: Using Called with args:
|
Oef, okay this has been a while for me 😅 So you can successfully run the test_net.py and demo.py. |
Thank you! |
Have you made any changes to the original resnet.py, faster_rcnn.py, or trainval_net.py? |
No, I just downloaded pascal_voc, res101_caffe, and faster_1_6_10021.pth. I'm using the pytorch-1.0 branch. |
Can you try and train any model for 1 epoch and then continue training it afterwards? |
Training from scratch on my own dataset and resuming from any epoch works fine.
user@user:/data2/CZY/data/faster-rcnn.pytorch/data/VOCdevkit2007$ tree -d
.
├── annotations_cache
├── local
│ ├── VOC2006
│ └── VOC2007
├── results
│ ├── VOC2006
│ │ └── Main
│ └── VOC2007
│ ├── Layout
│ ├── Main
│ └── Segmentation
├── VOC2007
│ ├── Annotations
│ ├── ImageSets
│ │ ├── Layout
│ │ ├── Main
│ │ └── Segmentation
│ ├── JPEGImages
│ ├── SegmentationClass
│ └── SegmentationObject
└── VOCcode
Ubuntu 16.04. Here, I have another question: |
I assume it is indeed something with the configs, maybe it's related to the number of anchor scales and ratios you're using?
Regarding this question, you won't know for sure until you try. It is possible that your model will not mind if trained enough. |
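To illustrate why the number of anchor scales and ratios could matter when loading a pretrained checkpoint, here is a minimal sketch. It assumes the usual Faster R-CNN RPN layout (one anchor per scale/ratio pair, 2 objectness scores and 4 box offsets per anchor); the function name and the example values are hypothetical and not taken from the repository's configs.

```python
def rpn_head_channels(anchor_scales, anchor_ratios):
    """Return (num_anchors, cls_channels, bbox_channels) for an RPN head.

    Assumption: one anchor per (scale, ratio) pair, with 2 objectness
    scores and 4 box-regression offsets predicted per anchor.
    """
    num_anchors = len(anchor_scales) * len(anchor_ratios)
    return num_anchors, 2 * num_anchors, 4 * num_anchors


# Hypothetical settings, for illustration only.
three_scales = rpn_head_channels([8, 16, 32], [0.5, 1, 2])
four_scales = rpn_head_channels([4, 8, 16, 32], [0.5, 1, 2])

print(three_scales)  # (9, 18, 36)
print(four_scales)   # (12, 24, 48)
```

If the checkpoint was trained with one setting and the current config uses another, the RPN output layers would have different shapes, so the saved weights (and any saved optimizer state for them) may no longer fit.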
Thanks. Do you know another way to do data augmentation? |
You are correct. It depends on your dataset, e.g. if you have a dataset of only small objects you'll want smaller anchor scales; if there are only big objects using larger scales will suffice; when the objects range from small to large, using more scales likely benefits your performance. Note that when changing the scales (for PASCAL VOC by default:
This really depends on your dataset, the expected orientation of the objects, etc. Faster RCNN is not rotation invariant, so if you want to find e.g. a quokka (as in your example) both when it is standing and when it is lying down, you need to train on both cases, e.g. by rotating the image (though simply having more data is obviously better). An easy and almost always valid data augmentation for object detection is (horizontal) flipping. |
I ran into another question, could you help me? |
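As a rough sketch of the horizontal flipping mentioned above: when the image is mirrored, the box x-coordinates have to be mirrored too. The function name is hypothetical, and the code assumes boxes are (x1, y1, x2, y2) tuples in 0-based, inclusive pixel coordinates; other conventions would need a slightly different offset.

```python
def flip_boxes_horizontally(boxes, image_width):
    """Mirror (x1, y1, x2, y2) boxes around the vertical image axis.

    Assumption: 0-based inclusive pixel coordinates, so the last valid
    column index is image_width - 1.
    """
    flipped = []
    for x1, y1, x2, y2 in boxes:
        # The old right edge becomes the new left edge and vice versa.
        new_x1 = image_width - 1 - x2
        new_x2 = image_width - 1 - x1
        flipped.append((new_x1, y1, new_x2, y2))
    return flipped


print(flip_boxes_horizontally([(10, 20, 30, 40)], image_width=100))
# [(69, 20, 89, 40)]
```

The y-coordinates are untouched, and flipping twice gives back the original boxes, which is a handy sanity check.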
You'll need to have a look at the An example of adding a dataset that uses the PASCAL VOC eval metrics can be found here: Alternatively, you could have a look at: |
Thank you very much for your help! update: |
You're right, there is no actual validation set used. The |
Thank you! I will give it a try. |
I tried this code in trainval_net.py. 1. I want to run a validation every 500 steps. Does each validation need to run over all the validation data, or is one batch per validation enough?
errors:
|
I would run the validation using a batch size of 1 instead of your normal batch size, that more closely resembles the results you would get during test time.
3?. You need to make sure that your training set does not include your validation set, otherwise the added value of the validation set is negated. |
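A minimal sketch of the two points above, under stated assumptions: all helper names are hypothetical, `loss_fn` stands in for whatever computes the loss of one sample, and the sketch assumes the whole validation set is evaluated each time, one sample at a time (batch size 1).

```python
def assert_disjoint(train_ids, val_ids):
    """Raise if any image id appears in both the training and validation split."""
    overlap = set(train_ids) & set(val_ids)
    if overlap:
        raise ValueError(
            f"{len(overlap)} image ids appear in both splits, e.g. {sorted(overlap)[:5]}"
        )


def should_validate(step, every=500):
    """True on every `every`-th training step (but not on step 0)."""
    return step > 0 and step % every == 0


def mean_validation_loss(val_samples, loss_fn):
    """Average loss over the validation samples, evaluated one at a time."""
    losses = [loss_fn(sample) for sample in val_samples]
    return sum(losses) / len(losses)
```

In a training loop this could be used as `if should_validate(step): val_loss = mean_validation_loss(val_samples, loss_fn)`, with the model switched to evaluation mode and gradients disabled while validating.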
Hi, did you solve the problem for validation? |
I think you could try this code in trainval_net.py. It seems to work:
|
If you try to retrain the resnet101 (faster_rcnn_1_10_9771.pth), you get the following error:
It seems the optimizer's saved knowledge of the parameters differs from what the parameters actually are. You also get the same error if you use /home/jonathan/faster-rcnn.pytorch/faster_rcnn_1_6_9771.pth
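One way to check this suspicion before resuming is to compare the parameter groups stored in the checkpoint with those of the freshly built optimizer. This is only a sketch: the helper names are hypothetical, and it relies on the fact that both a saved optimizer state dict's `"param_groups"` and a live optimizer's `param_groups` are lists of dicts with a `"params"` entry.

```python
def param_group_sizes(param_groups):
    """Number of parameters in each optimizer parameter group."""
    return [len(group["params"]) for group in param_groups]


def optimizer_state_matches(saved_param_groups, current_param_groups):
    """True if both optimizers have the same number of groups with the same sizes."""
    return param_group_sizes(saved_param_groups) == param_group_sizes(current_param_groups)


# Toy illustration with plain dicts standing in for real optimizer state.
saved = [{"params": [0]}, {"params": [1]}, {"params": [2]}]
current = [{"params": ["w"]}, {"params": ["b"]}]
print(optimizer_state_matches(saved, current))  # False
```

With PyTorch this could be called as `optimizer_state_matches(checkpoint["optimizer"]["param_groups"], optimizer.param_groups)`, assuming the checkpoint stores the optimizer state under an `"optimizer"` key. If it returns False, loading the saved optimizer state is expected to fail, and one option is to load only the model weights and start with a fresh optimizer.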