
How to extract Cast and Activity features? #5

Open
albertaparicio opened this issue Jul 14, 2020 · 19 comments

Comments

@albertaparicio
Now that there are new models uploaded to Google Drive, I am trying to process a video with all 4 modes, but I do not see how can I extract the features for the Cast and Activity modes.

Could you give me any pointers on this?

Thank you

@AnyiRao
Owner

AnyiRao commented Jul 18, 2020

Please check out https://github.com/movienet/movienet-tools

We will keep on improving it 🎉

Thanks for your interest! 😄

@miaoqiz

miaoqiz commented Aug 12, 2020

Hi,

Thanks for this excellent project!

Just wondering where "cast_feat", "act_feat", and "aud_feat" are located in the Google Drive?

Running "run.sh" produced many errors due to missing files under these directories. For example:


FileNotFoundError: [Errno 2] No such file or directory: '../data/scene318/act_feat/tt1375666.pkl'


Thanks so much and have a good day!

@AnyiRao
Owner

AnyiRao commented Aug 14, 2020

Hi miaoqiz

Thanks for your interest. The features are already uploaded; you may follow the guidance at https://github.com/AnyiRao/SceneSeg/blob/master/docs/INSTALL.md#prepare-datasets-for-scene318 and use wget to download them.

Best,

@xpngzhng

xpngzhng commented Sep 20, 2020

Hi @AnyiRao
How can I extract cast_feat and act_feat for a new video using https://github.com/movienet/movienet-tools? The code does not seem complete.

@AnyiRao
Owner

AnyiRao commented Sep 20, 2020

Hello xpngzhng,

You need to use scripts/dist_infer.sh to launch the cast_feat and act_feat extraction scripts.

You may refer to the following place feature extraction example.

# Place feat
bash scripts/dist_infer.sh scripts/extract_place_feat.py 8 --listfile ../data/meta/frame_240P.txt --img_prefix ./ --save_path ../data/place_feat.npy --imgs_per_gpu 256


@xpngzhng

Hi AnyiRao

I want to use SceneSeg and movienet-tools to run scene segmentation on a video clip using aud_feat, place_feat, cast_feat, and act_feat, so I need to split shots first and then extract all four kinds of features.

I can extract place_feat and aud_feat using SceneSeg, since the code is available, but I still need to extract cast_feat and act_feat.
In movienet-tools I cannot find the code to extract cast_feat. And, as the previous command shows, in movienet-tools the extracted place feature of a movie is stored in one npy file, which is different from what SceneSeg requires.


@AnyiRao
Owner

AnyiRao commented Sep 21, 2020

Hello xpngzhng,

Thanks for your interest in the project. You may refer to http://docs.movienet.site/movie-toolbox/tools/extract_feature, or you could implement something like the following. The cast feature consists of the face feature and the person body feature. An example of extracting the face feature is below; the person body feature extractor loads ./model/resnet50_csm.pth.

# init a face extractor
from movienet.tools import FaceExtractor
weight_ext = './model/irv1_vggface2.pth'
extractor = FaceExtractor(weight_ext, gpu=0)

# extract face feature
feat = extractor.extract($IMG$)  # need to specify the $IMG$

Best,


@xpngzhng

Thank you for your quick response.
I will take a closer look at the movienet-tools code and examine in more detail the pkl file required by SceneSeg.

@xpngzhng

Hi @AnyiRao
Sorry to bother you again.

It is not difficult to extract cast_feat with movienet-tools; at least I can organize a pkl file with the same keys and feature dimension as the scene318 dataset's movie cast_feat.

But I failed to produce a similar act_feat. The act_feat of one movie in the scene318 dataset looks like this:
[screenshot: each shot has a 512-dim feature]
What I obtain from movienet-tools looks like this:
[screenshot: each shot may have a list of dicts; each dict has a 'feat' of length 2048, which seems to be the length of a Fast R-CNN RoI feature]

I hope I am using movienet-tools in the correct way. Is it possible to obtain a 512-dim action feature for each shot?
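In case it helps others hitting the same structure, one rough workaround is to mean-pool the per-instance features into a single per-shot vector. This is only a sketch under the assumption that each shot maps to a list of dicts with a 2048-dim 'feat' entry, as described above; it does not reproduce the original 512-dim features:

```python
import numpy as np

def pool_shot_features(shot_feats):
    """Mean-pool a list of per-instance feature dicts into one vector.

    shot_feats: list of dicts, each holding a 'feat' ndarray (e.g. a
    2048-dim Fast R-CNN RoI feature). Returns a zero vector for shots
    with no detected instances.
    """
    if not shot_feats:
        return np.zeros(2048, dtype=np.float32)
    return np.stack([d["feat"] for d in shot_feats]).mean(axis=0)

# toy example: two detected instances in one shot
shot = [{"feat": np.ones(2048, dtype=np.float32)},
        {"feat": np.zeros(2048, dtype=np.float32)}]
pooled = pool_shot_features(shot)
print(pooled.shape, pooled[0])  # (2048,) 0.5
```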

@AnyiRao
Owner

AnyiRao commented Sep 21, 2020

Hi @xpngzhng

You have done a good job; what you did is correct.

After discussing it with my collaborators: the feature size cannot match the previous one, since we updated the backbone and the model to a better version. As we said, the project is an ongoing effort; we are iteratively making it better, and this causes the version mismatch.

We also need to clarify that the cast feature doesn't match the previous one either. In the previous version, the cast feature (dim=512) was the concatenation of the face feature (dim=256) and the body feature (dim=256). Now the dimension of the face feature is 512 and the body feature is 256. You may need to take note of this.

The good news is that the release of the videos is going into the final round (this Wednesday). You may extract the features from the videos then.

You are cordially invited to contact us via email if you have any further questions, as the CVPR deadline is approaching. We will try our best to adapt to the purpose of your usage.

Best,
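To make the dimension change described above concrete, here is a shapes-only sketch with placeholder zero vectors (the concatenation of the new-size features is an assumption for illustration, not the released pipeline):

```python
import numpy as np

# Old release: cast feature = concat(face 256, body 256) -> 512 dims
face_old, body_old = np.zeros(256), np.zeros(256)
cast_old = np.concatenate([face_old, body_old])

# Current tools: face feature is 512 dims, body feature is 256 dims,
# so a naive concatenation would be 768 dims and cannot match the old
# 512-dim cast feature.
face_new, body_new = np.zeros(512), np.zeros(256)
cast_new = np.concatenate([face_new, body_new])

print(cast_old.shape, cast_new.shape)  # (512,) (768,)
```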


@xpngzhng

Hi @AnyiRao
Many thanks to you and your team

@miaoqiz

miaoqiz commented Sep 22, 2020

Hi,

Could you kindly advise how to interpret the prediction output from "python run.py ../run/xxx/xxx.py"?

For example:


demo 0020 1 1
demo 0021 1 1
demo 0022 1 1


What does each column represent?

Thanks so much and have a good day!

@AnyiRao
Owner

AnyiRao commented Sep 23, 2020

Hi @miaoqiz

The function that writes the output is here: https://github.com/AnyiRao/SceneSeg/blob/master/lgss/utilis/dataset_utilis.py#L160

You can also see there that the template is videoid/imdbid shotid groundtruth prediction. To be exact, demo 0020 1 1 means that both the ground truth and the prediction mark the boundary between shots 0020 and 0021 as a scene boundary.
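A minimal sketch for parsing these lines (the field names just follow the template above; this parser itself is not part of the repository):

```python
def parse_pred_line(line):
    """Parse one output line: 'videoid shotid groundtruth prediction'."""
    vid, shot_id, gt, pred = line.split()
    return {"video": vid, "shot": shot_id, "gt": int(gt), "pred": int(pred)}

rec = parse_pred_line("demo 0020 1 1")
# pred == 1 means the boundary between shot 0020 and shot 0021 is
# predicted to be a scene boundary
print(rec)
```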

Best,


@miaoqiz

miaoqiz commented Sep 23, 2020

Thanks so much! @AnyiRao

How can I find the timecode / frame range of a predicted shot? For example, how can I find the specifics of "0020"?

Thanks!

@AnyiRao
Owner

AnyiRao commented Sep 23, 2020

Hi @miaoqiz

If you follow my file naming rule, shot_txt has the shot-to-frame correspondence needed to recover the time of each scene. You may also refer to the explanation for scene318: https://github.com/AnyiRao/SceneSeg/blob/master/docs/INSTALL.md#explanation
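As a sketch only (it assumes each line of a shot_txt file holds the start and end frame of one shot, indexed by the zero-padded shot id from the prediction output; this layout is an assumption, so verify it against your own shot_txt files):

```python
def frame_range_to_seconds(start_frame, end_frame, fps=24.0):
    """Convert a shot's frame range to (start, end) in seconds."""
    return start_frame / fps, end_frame / fps

def shot_info(shot_lines, shot_id, fps=24.0):
    """Look up a shot id like '0002' in shot_txt-style 'start end' lines."""
    start, end = map(int, shot_lines[int(shot_id)].split())
    return (start, end), frame_range_to_seconds(start, end, fps)

# toy shot_txt content for three shots
lines = ["0 47", "48 119", "120 210"]
print(shot_info(lines, "0002", fps=24.0))  # ((120, 210), (5.0, 8.75))
```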

Best,


@jblemoine

jblemoine commented Jan 13, 2021

Hi @AnyiRao ,

Thank you for sharing your code base.

What model did you use for face feature extraction? I am trying to replicate your results; however, as you mentioned, the face feature extractor from movienet-tools has a 512-dim output whereas yours is 256.

@Eniac-Xie

Hi @AnyiRao ,

Thank you for the released code. When will you release the raw videos? If the raw videos cannot be released soon, would you consider releasing the model you used to extract cast_feat?

Thank you!

@onnkeat

onnkeat commented Apr 21, 2021

Hi @AnyiRao

Thank you for sharing the codebase of the LGSS framework.
According to your reply above, the raw videos should have been released on 23 Sept 2020. May I know how to obtain the raw videos?

Thank you very much !

> The good news is that the release of videos goes to the final round (this Wednesday). You may extract the features from the videos then.

@ra1995

ra1995 commented Jun 7, 2021

Hi @AnyiRao

Many congratulations on the awesome work. Going through this thread, I realize that the models for face and action feature extraction have changed. Because of this, I am getting size-mismatch errors. I need to extract these features for a custom dataset my team has created for scene boundary detection and also to compare it with your approach.

I would be grateful if you could provide the models and scripts for face and action feature extraction that you used. I have successfully modified the movienet-tools library to be compatible with our dataset, but I am facing issues when I run the scene segmentation model (the all.py config) due to the size mismatch.
