I have been trying to use VisualBERT as shown in this issue #291, with Detectron2 Faster RCNN (#206) features for visual embeddings.
Here is what I am trying for the features:
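import os
import numpy as np
import torch
from PIL import Image
from torchvision import transforms
from detectron2 import model_zoo
from detectron2.checkpoint import DetectionCheckpointer
from detectron2.config import get_cfg
from detectron2.modeling import build_model
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
## Model building
model = build_model(cfg)
DetectionCheckpointer(model).load(cfg.MODEL.WEIGHTS) ## build_model only constructs the network; load the COCO weights explicitly
model.eval() ## No fine-tuning
## Making a single image into imagelist (data_dir and record are defined elsewhere)
img = Image.open(os.path.join(data_dir,record['img'])).convert('RGB')
img = transforms.Resize((448,448))(img)
## These configs expect a BGR image in the 0-255 range (cfg.INPUT.FORMAT is "BGR")
img = torch.as_tensor(np.asarray(img)[:, :, ::-1].copy()).permute(2, 0, 1)
img = model.preprocess_image([{"image": img}]) ## normalizes, pads and moves to the model device
with torch.no_grad():
    ## Extracting features from backbone
    feat = model.backbone(img.tensor)
    ## Generating proposals
    proposals,_ = model.proposal_generator(img,feat)
    ## Roi Align
    box_features = model.roi_heads.box_pooler([feat[f] for f in feat if f!='p6'],[p.proposal_boxes for p in proposals])
    ## Get features from fc2
    box_features = model.roi_heads.box_head(box_features) ## FINAL Features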
I find two problems:
1. These features are 1024-dimensional, while the pre-trained VisualBERT model requires 2048-dimensional features.
2. Detectron2 does not provide weights pre-trained on Visual Genome.
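For context on the first problem, here is a minimal sketch of where the 1024 likely comes from, assuming detectron2's stock config defaults (`MODEL.ROI_BOX_HEAD.FC_DIM = 1024` and `MODEL.RESNETS.RES2_OUT_CHANNELS = 256`). The helper `box_feature_dim` is hypothetical: it only mirrors the arithmetic and does not call detectron2. FPN configs use `StandardROIHeads`, whose box head ends in fully connected layers of width `FC_DIM`, while C4 configs use `Res5ROIHeads`, whose res5 output has `RES2_OUT_CHANNELS * 8` channels.

```python
# Hypothetical helper: mirrors how detectron2's stock ROI heads size their
# per-box features. Plain arithmetic only; detectron2 is not imported.

def box_feature_dim(roi_heads_name: str,
                    fc_dim: int = 1024,
                    res2_out_channels: int = 256) -> int:
    """Return the per-box feature width for two common ROI head types."""
    if roi_heads_name == "StandardROIHeads":
        # FPN configs: the box head is a stack of FC layers, each fc_dim wide.
        return fc_dim
    if roi_heads_name == "Res5ROIHeads":
        # C4 configs: res5 has 8x the res2 channels; averaging over the
        # spatial dimensions gives one vector of that width per box.
        return res2_out_channels * 8
    raise ValueError(f"unhandled ROI head: {roi_heads_name}")


VISUALBERT_DIM = 2048  # width the pre-trained VisualBERT expects (from the question)

for name in ("StandardROIHeads", "Res5ROIHeads"):
    dim = box_feature_dim(name)
    status = "matches" if dim == VISUALBERT_DIM else "does not match"
    print(f"{name}: {dim} ({status})")
```

Matching the width alone would not make COCO-trained features interchangeable with the Visual Genome features the checkpoint was pre-trained on, so this only speaks to the first problem.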
Additionally, would it be possible to share the Pythia visual embeddings (the 2048-dimensional embeddings) for VQA v2 that were used by the VisualBERT authors?
❓ Questions and Help
Hi MMF developers,
Is there a way to load the model with detectron2 using the model path/config URL in https://github.com/facebookresearch/mmf/blob/master/tools/scripts/features/extract_features_vmb.py? Or is there any other approach that would not require re-training both networks?
Thank you,
Gunjan