Using Detectron2 with pre-trained VisualBERT on Hateful Memes #387

gchhablani · 2020-07-06T13:50:50Z

❓ Questions and Help

Hi MMF developers,

I have been trying to use VisualBERT as shown in issue #291, with Detectron2 Faster R-CNN (#206) features as the visual embeddings.

Here is what I am trying for the features:

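import os

import torch
from PIL import Image
from torchvision import transforms
from detectron2 import model_zoo
from detectron2.checkpoint import DetectionCheckpointer
from detectron2.config import get_cfg
from detectron2.modeling import build_model
from detectron2.structures import ImageList

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")

## Model building
model = build_model(cfg)
DetectionCheckpointer(model).load(cfg.MODEL.WEIGHTS) ## build_model() does not load weights by itself
model.eval() ## No fine-tuning

## Making a single image into imagelist (data_dir and record come from the dataset loop)
## Note: this bypasses model.preprocess_image, so cfg.MODEL.PIXEL_MEAN/PIXEL_STD normalization is not applied
img = Image.open(os.path.join(data_dir, record['img'])).convert('RGB')
img = ImageList.from_tensors([transforms.ToTensor()(transforms.Resize((448, 448))(img))])

with torch.no_grad():
    ## Extracting features from backbone
    feat = model.backbone(img.tensor.to('cuda:0'))

    ## Generating proposals
    proposals, _ = model.proposal_generator(img, feat)

    ## Roi Align (the box pooler uses every FPN level except p6)
    box_features = model.roi_heads.box_pooler([feat[f] for f in feat if f != 'p6'], [p.proposal_boxes for p in proposals])

    ## Get features from fc2
    box_features = model.roi_heads.box_head(box_features) ## FINAL Features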

I found two problems:

These features have dimension 1024, while the pre-trained VisualBERT model requires dimension 2048.
Detectron2 does not provide weights pre-trained on Visual Genome.

Is there a way to load the model with detectron2 using the model path/config URL in https://github.com/facebookresearch/mmf/blob/master/tools/scripts/features/extract_features_vmb.py? Or is there any other way to do this without re-training both networks?

Thank you,
Gunjan

vedanuj · 2020-07-07T03:40:23Z

The default FC_DIM in D2 is 1024. The older features that VisualBERT uses are 2048-dimensional and come from a model trained with vqa_maskrcnn_benchmark.
For D2 pretrained on Visual Genome, see this work-in-progress PR: "[feature] Add Region feature configs and extraction file" (grid-feats-vqa#3).

We will share the features soon for D2.
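
In the meantime, a small guard can surface the 1024-vs-2048 mismatch discussed in this thread before the features are passed to the pre-trained model. The following is only a minimal sketch: `check_region_feature_shape` is a hypothetical helper (not part of MMF or Detectron2), and it assumes the features are laid out as (num_regions, feature_dim).

```python
def check_region_feature_shape(shape, expected_dim=2048):
    """Validate a (num_regions, feature_dim) shape and return num_regions.

    Hypothetical helper for illustration; `expected_dim` defaults to the
    2048-dimensional features mentioned in this thread.
    """
    shape = tuple(shape)
    if len(shape) != 2:
        raise ValueError(f"expected a 2-D feature matrix, got shape {shape}")
    num_regions, dim = shape
    if dim != expected_dim:
        raise ValueError(
            f"region features have dimension {dim}, "
            f"but the pre-trained model expects {expected_dim}"
        )
    return num_regions
```

With the extraction code from the question, this could be called as `check_region_feature_shape(box_features.shape)`; with the default 1024-dimensional box head it would raise a `ValueError` rather than fail later inside the model.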

gchhablani · 2021-05-24T10:33:54Z

Hi @vedanuj

Any updates on this?

Additionally, would it be possible to share the Pythia visual embeddings (the 2048-dimensional embeddings) for VQA v2 that were used by the VisualBERT authors?

ankitade closed this as completed Jan 19, 2023
