-
Notifications
You must be signed in to change notification settings - Fork 46
[feature] Add Region feature configs and extraction file #3
base: main
Are you sure you want to change the base?
Conversation
extract_feature_region.py
Outdated
# postprocess | ||
height = inputs[0].get("height") | ||
width = inputs[0].get("width") | ||
r = postprocessing.detector_postprocess(predictions[0], height, width) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember there is some tiny detail in getting the region features exactly right in the original code base, I remember Aman and myself have discussed before. Is it exactly following the setting of previously extracted region features?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please discuss with Aman about this, being able to exactly reproduce the previous procedure to extract box features is most important thing for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't seen any major difference in the performance yet compared to old features. However I will check if we can get the features before the box head regression and if it helps the performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @apsdehal Do you know what was the difference in extracting features before and after box regression?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just did a bit digging and think the original one uses the proposal boxes directly before second regression. https://github.com/facebookresearch/mmf/blob/4517e13f8407f95840340e90a6d30df344d08ea9/tools/scripts/features/extract_features_vmb.py#L184
I think this is more than observed performance difference -- if there is any subtle difference we are aware of that can cause a difference, it should be stated at least. And past experience suggest that having the additional box regression does not help, and can sometimes hurt the accuracy. So for a faithful implementation (ideally it should be able to extract the original BUTD features), I still prefer for it to be implemented just as what is being done before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added ability to extract features from both last FC (fc7) and second last FC (fc6) layers for FPN and DC5 models. Also now we use proposal bbox instead of predicted boxes exactly similar to old pythia implementation.
I am still keeping ability to have last layer features because current BERT based models don't finetune last layer and hence should use last layer FC features.
@endernewton This is ready for review again. Uploaded all models, metrics and features to AWS. Readme is updated as well. |
predictions = model.roi_heads.box_predictor(pooled_features) | ||
cls_probs = F.softmax(predictions[0], dim=-1) | ||
cls_probs = cls_probs[:, :-1] # background is last | ||
predictions, r_indices = model.roi_heads.box_predictor.inference( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are duplicating r_indices removed in this case? iirc it is removed in the BUTD code. Duplicate means the same proposal box corresponding to two different objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks great! Added a few comments and we are getting there 👍
|
||
| Backbone | AP<sub>50:95</sub> | Download | | ||
| -------- | ---- | -------- | | ||
| X-152-region-FPN | 5.25 | <a href="https://dl.fbaipublicfiles.com/grid-feats-vqa/region-fpn-X-152/region-fpn-X-152.pth">model</a> \| <a href="https://dl.fbaipublicfiles.com/grid-feats-vqa/region-fpn-X-152/fpn-X-152-metrics.json">metrics</a> \| <a href="https://dl.fbaipublicfiles.com/grid-feats-vqa/region-fpn-X-152/region-fpn-X-152-features_fc7.tar.gz">features (fc7)</a> <a href="https://dl.fbaipublicfiles.com/grid-feats-vqa/region-fpn-X-152/region-fpn-X-152-features.tar.gz">features (fc6)</a> | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: maybe just showing one digit after decimal point is enough here.
@@ -0,0 +1,31 @@ | |||
MODEL: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency we can either use "region" for all the files, or "updn" in all the files.
"cls_prob": cls_probs.cpu().numpy(), | ||
"features": pooled_features.cpu().numpy() | ||
} | ||
np.save(os.path.join(dump_folder, str(image_id) + ".npy"), info) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
other parts lgtm, ideally we should test if given the same pre-trained model, using the previous code in mb-benchmark and the current code in D2 should generate the same results. :)
help="path to image folder dataset, if not provided expects a detection dataset", | ||
default="", | ||
) | ||
parser.add_argument( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe worth adding an option of "K" which stands for extracting the top K region features, I think this is useful for a lot of papers that choose top 36, or varying size from 36-100 (maybe another option is needed there)
features = [features[f] for f in self.in_features] | ||
box_features = self.box_pooler(features, [x.proposal_boxes for x in proposals]) | ||
fc7, fc6 = self.box_head(box_features) | ||
return box_features, fc7, fc6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if it is trivial for users or not, but if we are storing fc6 anyway, might be worth saving the fc7 weights at the same time so that it can be directly loaded in mmf.
This PR adds the following components :
roi_heads
for getting rois features fromAttributeStandardROIHeads
andAttributeRes5ROIHeads
build_detection_test_loader_for_images
for building a test dataset from an image folderTestDatasetMapper
class for mapping test datasets with only images