Empty Detections #6

Open
mtlouie-unm opened this issue Jul 16, 2019 · 31 comments
Labels
good first issue Good for newcomers

Comments

@mtlouie-unm

When executing test_frcnn.py it seems that I pass in the path to where my test images are located, but I get zero detections when testing the model that was successfully trained. Wouldn't the testing phase need labeled data to see how well the model detects?

@kentaroy47
Owner

Right now, test.py just generates images with detection results (with the --write option).
I guess you want to see the calculated mAP? I will work on that feature, so please wait.

@mtlouie-unm
Author

Thanks for responding.

Oh it seems that the network is classifying each of the test images as simply background (bg). Am I supposed to give some training images of just background?

If so, to give the network images of just background to train on, would my simple data text file look like this:

/data/imgs/img_001.jpg,837,346,981,456,cow
/data/imgs/img_002.jpg,215,312,279,391,cat
/data/imgs/img_002.jpg,22,5,89,84,bird
/data/imgs/img_003.jpg,,,,,

Where the last line (/data/imgs/img_003.jpg,,,,,) would be an example of background.
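
Whether the repository's own annotation parser accepts such empty rows is not confirmed anywhere in this thread. As a minimal sketch of how a loader could treat them, the hypothetical helper below (`parse_annotations` is an illustrative name, not a function from the repository) reads the simple `path,x1,y1,x2,y2,class` format and records a row with empty fields as an image that contains no objects:

```python
import csv
import io


def parse_annotations(lines):
    """Map each image path to its list of (x1, y1, x2, y2, class_name) boxes.

    Hypothetical helper: a row whose box fields are all empty registers the
    image with an empty box list, i.e. a background-only image.
    """
    boxes = {}
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        path, *fields = row
        boxes.setdefault(path, [])
        if all(field.strip() == "" for field in fields):
            continue  # e.g. "/data/imgs/img_003.jpg,,,,,"
        x1, y1, x2, y2 = (int(value) for value in fields[:4])
        boxes[path].append((x1, y1, x2, y2, fields[4]))
    return boxes


sample = """/data/imgs/img_001.jpg,837,346,981,456,cow
/data/imgs/img_002.jpg,215,312,279,391,cat
/data/imgs/img_002.jpg,22,5,89,84,bird
/data/imgs/img_003.jpg,,,,,
"""

annotations = parse_annotations(io.StringIO(sample))
background_only = [path for path, found in annotations.items() if not found]
print(background_only)
```

A loader written this way would still have to decide what to do with background-only images during training, which this sketch does not address.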

@franyoadam

Hi,

I have the same problem: the training results seemed good, but all test images have empty detections. Testing on the training dataset gave the same result. Have you managed to fix this?

@kentaroy47
Owner

@mtlouie-unm @franyoadam
I think this issue was fixed by past commits.
Pulling the newest version should resolve it:

git pull

@kentaroy47 kentaroy47 added the bug Something isn't working label Aug 23, 2019
@tossy-yossy

I have the same problem.
I am using the latest code, but all images have empty detections.

@kentaroy47
Owner

kentaroy47 commented Sep 9, 2019

@tossy-yossy
Yes, I still see this issue when training with pascal2007 images.
Since it was working previously, let me revert my environment (tf+keras) and check whether that cures it.
If it's urgent, I recommend using PyTorch object detection repos, which support COCO training as well.
https://github.com/jwyang/faster-rcnn.pytorch
https://github.com/kentaroy47/ObjectDetection.Pytorch

(I have a paper submission this week and will get back to this after that's finished.)

@kentaroy47
Owner

kentaroy47 commented Sep 18, 2019

@tossy-yossy @mtlouie-unm @franyoadam
This issue has been fixed.
The cause was the positive-negative ratio of the detections.
I fixed the number of RPNs so that the positive-negative ratio will be about 1:3, as in other implementations. Please pull the newest version to activate this.

Training will be stable when using pretrained RPN models.

The pretrained RPN for VGG is uploaded to:
https://drive.google.com/file/d/1teuXIRN4mvmbnIfWlxAAM69hJEceTpUm/view?usp=sharing
The trained VGG frcnn model is uploaded here (it is underfitting):
https://drive.google.com/file/d/1IgxPP0aI5pxyPHVSM2ZJjN1p9dtE4_64/view?usp=sharing

Here is the example command.

python train_frcnn.py --network vgg -p to/your/voc --load rpn-mode.pth
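
The commit that implements the 1:3 ratio is not shown in this thread, so the following is only a sketch of the general idea, not the repository's code. The hypothetical `sample_rois` helper picks indices for one training batch so that negatives outnumber positives by roughly three to one; the names and the `neg_per_pos` parameter are assumptions made for illustration.

```python
import random


def sample_rois(is_positive, num_rois, neg_per_pos=3, seed=0):
    """Pick ROI indices so negatives are about neg_per_pos times the positives.

    Hypothetical helper: is_positive[i] is True when ROI i overlaps a
    ground-truth box enough to count as a positive example.
    """
    rng = random.Random(seed)
    pos = [i for i, flag in enumerate(is_positive) if flag]
    neg = [i for i, flag in enumerate(is_positive) if not flag]
    # Target positives per batch, e.g. 2 of 8 ROIs for a 1:3 ratio.
    n_pos = min(len(pos), max(1, num_rois // (1 + neg_per_pos)))
    # Fill the rest of the batch with negatives.
    n_neg = min(len(neg), num_rois - n_pos)
    return rng.sample(pos, n_pos), rng.sample(neg, n_neg)


# 5 positive ROIs among 300 proposals, batch of 8 ROIs.
labels = [True] * 5 + [False] * 295
pos_idx, neg_idx = sample_rois(labels, num_rois=8)
print(len(pos_idx), len(neg_idx))  # 2 6
```

If negatives are allowed to dominate the batch far beyond this ratio, a classifier can reach low loss by predicting background everywhere, which matches the symptom reported in this issue.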

@ianstath

Hello,
I have the same problem. I trained the whole network with mobilenetv2 (I didn't pretrain the RPN).
Should I change something in test.py or train.py? Any ideas? I pulled the latest code.

@kentaroy47
Owner

kentaroy47 commented Sep 23, 2019

@ianstath
If the mean number of objects per image is under 2 during training, you should pretrain the RPN and use it to train frcnn. Or you may train with option -n 6, which will reject more negative proposals.
I haven't tried pascal_voc with mobilenetv2, but I think that is the case.
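
To check a dataset against the "under 2 objects per image" rule of thumb, the mean can be computed directly from the annotation rows. This is a small sketch with a hypothetical helper name, assuming one `(path, x1, y1, x2, y2, class_name)` tuple per labeled box, as in the simple text format shown earlier in the thread:

```python
from collections import Counter


def mean_objects_per_image(rows):
    """Average number of labeled boxes per image (hypothetical helper)."""
    counts = Counter(path for path, *_ in rows)
    if not counts:
        return 0.0
    return sum(counts.values()) / len(counts)


rows = [
    ("/data/imgs/img_001.jpg", 837, 346, 981, 456, "cow"),
    ("/data/imgs/img_002.jpg", 215, 312, 279, 391, "cat"),
    ("/data/imgs/img_002.jpg", 22, 5, 89, 84, "bird"),
]

mean = mean_objects_per_image(rows)
print(mean)  # 1.5, i.e. under 2
```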

@ianstath

ianstath commented Sep 23, 2019

Mean objects per image? Do you mean one finding per image? Yes, in most of the images I have only one object to be detected.
Should I train the RPN or the whole network with -n 6? Now the default is -n 10. Will it make a difference? What if I trained only the mobilenet? Also, I am using my own dataset of images (brain MRI images with tumors).

Also, I thought that -n 6 controls the batch size. If not, how do I control the batch size so that I don't run out of memory?

@kentaroy47
Owner

@ianstath

python train_frcnn.py --network mobilenetv2  -p ../VOCdevkit/ -n 6

may help. I haven't tried it on mobilenet, so it would help if you can.
Thanks.

@kentaroy47 kentaroy47 added the good first issue Good for newcomers label Sep 26, 2019
@VincentDecospan

I'm also encountering the empty detections issue. Trained both the RPN & detection network on my own dataset using vgg pretrained weights. Do you have any idea on how to solve this issue?

@ianstath

ianstath commented Oct 1, 2019

For me, it was a matter of dataset size and overfitting. The resnet50 seems to be quite big for a dataset of 240 images.
So I tried vgg with data augmentation and added dropout layers. I am still training, but the first results are very good. I can detect regions of interest in most pictures of the test set.
So how many samples are you using?

@VincentDecospan

This was really just a test run on 2 of the roughly 10 classes I want to detect eventually.
I trained the network on about 3K images per class, so 6K in total.
Where did you alter the code to add the data augmentation and dropout layers?

Also, did you manage to extract the ROIs proposed by the RPN alone?
I'm working on this code as we speak.

@kentaroy47
Owner

kentaroy47 commented Oct 8, 2019

Here are some tips for training frcnns.

  • Check that the rpn_cls and detector_cls losses aren't too high (>1 after training is quite high).

  • If the RPN is bad, check whether it trains well with RPN-only training. If it doesn't train well alone, your training data may be bad.

  • Always use ImageNet-pretrained weights for the backbone; this improves convergence.

  • Misdetections (or empty detections) occur when the detector isn't training well. You may want to change the detector layers and make them simpler, since they may be underfitting. See the vgg.py or resnet.py implementations.

@jia101728

Hello,
I am using my own dataset of images to train the network. When executing test_frcnn.py, all test images have empty detections, even though I pulled the newest version. My dataset is in the PASCAL_VOC format, so the command is
python test_frcnn.py --network resnet50 -p ./test_img --write
But the result is
22.jpg
(133, 4)
Elapsed time = 0.6575415134429932
[]
{}
Do you have any idea on how to solve this issue?

@ianstath

ianstath commented Oct 8, 2019

> Where did you alter the code to add the data augmentation and dropout layers?
> Also, did you manage to extract the ROIs proposed by the RPN alone?

Check the parser options for data augmentation. Also look at vgg.py or resnet.py to add dropout or make any change to the base networks.

I didn't manage to extract the ROIs of the RPN. I only cropped the proposals from test.py.
There is a lot to be done to the model (e.g. checkpoints in train_frcnn.py, validation loss, or mAP).

@kentaroy47 kentaroy47 pinned this issue Oct 23, 2019
@ambigus9

ambigus9 commented Oct 24, 2019

@kentaroy47 I'm also getting empty detections. Any idea how to solve it?

What is a good RPN loss at the end of the training process?

This is the command I'm using in the test step:

!python test_frcnn.py --network mobilenetv2 -p test/ --load models/mobilenetv2/voc.hdf5 --write

Here is a sample of the output:

Using TensorFlow backend.
2019-10-24 03:13:17.835487: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
{0: 'grieta', 1: 'corrocion', 2: 'bg'}
Loading weights from models/mobilenetv2/voc.hdf5
frame1609.png
(300, 4)
[[[1.12452474e-03 4.83090553e-05 9.98827159e-01]
[9.25188791e-03 2.89127696e-04 9.90458965e-01]
...
[2.4799172e-02 2.8327608e-04 9.7491759e-01]]]
Elapsed time = 9.977138996124268
[]
{}
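
The sample output above is consistent with every ROI being classified as 'bg' with a probability above 0.97. As a sketch of why that produces an empty result, the hypothetical `keep_detections` helper below drops any ROI whose top class is 'bg' or whose top probability falls below a threshold. That this is the rule test_frcnn.py applies is an assumption; only the class mapping and the three probability rows are taken from the output.

```python
def keep_detections(class_probs, class_mapping, bbox_threshold=0.8):
    """Return (roi_index, class_name, probability) for ROIs that survive.

    Hypothetical filter: an ROI is dropped when its most likely class is
    'bg' or when that class's probability is below bbox_threshold.
    """
    bg_index = next(i for i, name in class_mapping.items() if name == "bg")
    kept = []
    for roi_index, probs in enumerate(class_probs):
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] < bbox_threshold or best == bg_index:
            continue
        kept.append((roi_index, class_mapping[best], probs[best]))
    return kept


class_mapping = {0: "grieta", 1: "corrocion", 2: "bg"}
probs = [
    [1.12452474e-03, 4.83090553e-05, 9.98827159e-01],
    [9.25188791e-03, 2.89127696e-04, 9.90458965e-01],
    [2.4799172e-02, 2.8327608e-04, 9.7491759e-01],
]

detections = keep_detections(probs, class_mapping)
print(detections)  # [] because 'bg' wins for every ROI
```

Under this assumption, lowering the threshold alone would not help here, because the background class wins the argmax for each row regardless of the threshold.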

@Aymdr

Aymdr commented Nov 6, 2019

In train_frcnn.py, in the call R = roi_helpers.rpn_to_roi(P_rpn[0], P_rpn[1], C, K.image_dim_ordering(), use_regr=True, overlap_thresh=0.4, max_boxes=300), overlap_thresh is 0.4. I think that is a mistake, and it leads to the high score for 'bg'. I met the same problem, and changing it to 0.7 solved it. Maybe you can give it a try.
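
Assuming overlap_thresh is the IoU threshold used for non-maximum suppression of RPN proposals, as the parameter name suggests, the effect of raising it from 0.4 to 0.7 can be seen in a pure-Python sketch. This is not the code of roi_helpers.rpn_to_roi; `iou` and `nms` are illustrative names, and the boxes are made up.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, overlap_thresh, max_boxes=300):
    """Greedy non-maximum suppression; returns indices of kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a kept box too strongly.
        if all(iou(boxes[i], boxes[j]) <= overlap_thresh for j in keep):
            keep.append(i)
        if len(keep) == max_boxes:
            break
    return keep


# The first two boxes overlap with IoU = 8000 / 12000, about 0.67.
boxes = [(0, 0, 100, 100), (20, 0, 120, 100), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.7]

print(nms(boxes, scores, overlap_thresh=0.4))  # [0, 2]
print(nms(boxes, scores, overlap_thresh=0.7))  # [0, 1, 2]
```

A higher threshold lets more overlapping proposals through to the detector. Whether that explains the improvement reported here is not established in the thread.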

@ambigus9

ambigus9 commented Nov 6, 2019

@Aymdr So do you recommend increasing overlap_thresh=0.4 to overlap_thresh=0.7?

@Aymdr

Aymdr commented Nov 6, 2019

In my training yesterday, I changed overlap_thresh to 0.9 in train_frcnn.py, and in test_frcnn.py I changed overlap_thresh to 0.7 and bbox_threshold to 0.8. I did not train the RPN. The result is not empty and works well.

@ignvinay

I tried the steps recommended above, but I am also getting empty detections for resnet50.
Any ideas on how to make this work?

@jia101728

> @Aymdr So do you recommend increasing overlap_thresh=0.4 to overlap_thresh=0.7?

Is the problem solved? I have the same problem.

@alessandrobetti

Hi all, I have the same problem as above: performance with the VGG16 backbone is quite good, but it is much lower when using resnet50 or IRV2 as backbones.
In particular, I very often obtain empty detections for both training and test sets, even though training accuracy is very high (up to 98% for resnet50 and 99% for IRV2) and the training loss is well below 1.
I have tried all of the suggestions presented above (for example changing overlap_thresh, num_rois, etc.), but nothing solves the problem.
I run training in two steps as suggested, training first the RPN and then the whole architecture.
From debugging, it seems that the RPN works very well, whereas the classifier network, which is updated after every batch of 1 image, detects well on the most recent batches but forgets after a few iterations.
Thus it seems that catastrophic forgetting takes place, which can indeed affect online learning as in this case (the weights are updated after every image is presented to the network).
Any ideas on how to solve this issue?

@linaemunsamy

Hi all, has anyone tried any hyperparameter tuning methods, such as grid search?

@kentaroy47
Owner

kentaroy47 commented Aug 26, 2020 via email

@yellowjs0304

yellowjs0304 commented Oct 4, 2020

@kentaroy47 still have the same issue....
I can't get any bbox even in the training set...

@yellowjs0304

yellowjs0304 commented Oct 4, 2020

@alessandrobetti Hi, did you find any way to solve the issue?
I have the same issue.

@alessandrobetti

alessandrobetti commented Dec 13, 2020

@yellowjs0304
Hi, unfortunately I have not solved the issue yet.

@FrancescoManigrass

Did you solve it? I have the same issue, but after increasing the maximum number of bounding boxes retrieved from the RPN, the mAP increased to 60.

@ghost

ghost commented Jun 25, 2021

@Aymdr @ambigus9 @kentaroy47 I get no detections. I changed the threshold value to 0.7 and still cannot detect anything. Please help, and please also do something about mAP.
