
Proficiency Estimation - Demonstrator Proficiency

Pretrained Checkpoints

We provide ego and exo model checkpoints trained on EgoExo4D.

| name | dataset | view | # of frames | spatial crop | acc@1 | acc@5 | url |
| --- | --- | --- | --- | --- | --- | --- | --- |
| TimeSformer | EgoExo4D | ego | 16 | 448 | 79.6 | 94.0 | model |
| TimeSformer | EgoExo4D | exo | 16 | 448 | 79.6 | 94.0 | model |
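The .pyth files are ordinary PyTorch checkpoints in the SlowFast-style format used by the TimeSformer codebase. A minimal sketch for inspecting one after downloading (the file name below is illustrative, and the 'model_state' key is an assumption from that format — check ckpt.keys() first):

import torch

# Load on CPU; the file name is illustrative — use your downloaded checkpoint.
ckpt = torch.load("TimeSformer_ego_16x16_448.pyth", map_location="cpu")
print(ckpt.keys())  # SlowFast-style checkpoints typically include 'model_state'

# Print a few parameter names and shapes from the saved weights.
state = ckpt.get("model_state", ckpt)
for name, tensor in list(state.items())[:5]:
    print(name, tuple(tensor.shape))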

Installation

Follow the installation instructions for TimeSformer from the original repo: https://github.com/facebookresearch/TimeSformer

Dataset Preparation

Download the EgoExo4D proficiency estimation takes and annotations into:

../data

To generate the splits, run:

data_processing/ego_exo/generate_demonstrator_proficiency_splits.ipynb

Then save the generated files to:

TimeSformer/ego_exo_splits
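To sanity-check the generated splits outside the notebook, a small sketch like the following works, assuming the splits are SlowFast-style text files with a space-separated video path and integer label per line (the file name train.csv is a guess — adjust to what the notebook actually writes):

# Hypothetical file name and "path label" format — verify against the
# notebook's output before relying on this.
split_file = "TimeSformer/ego_exo_splits/448pFull/train.csv"

with open(split_file) as f:
    rows = [line.strip().rsplit(" ", 1) for line in f if line.strip()]

print(len(rows), "clips")
for path, label in rows[:3]:
    print(path, label)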

Training the ego model

Replace PATH_TO_DATA_DIR and PATH_PREFIX in configs/EgoExo/TimeSformer_divST_16x16_448.yaml with the full paths to TimeSformer/ego_exo_splits/448pFull and to the ego_exo data directory (../data), respectively.
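After the edit, the relevant part of the config looks roughly like this (an illustrative excerpt, not the full file; the nesting under a DATA block is an assumption based on the SlowFast-style configs this repo uses):

DATA:
  PATH_TO_DATA_DIR: /abs/path/to/TimeSformer/ego_exo_splits/448pFull
  PATH_PREFIX: /abs/path/to/data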

Then, use the following command to train TimeSformer on the egocentric view:

python tools/run_net.py --cfg configs/EgoExo/TimeSformer_divST_16x16_448.yaml OUTPUT_DIR ./outputs/448pFull/ego DATA.CAMERA_VIEW ego

Training the exo model

Use the following command to train TimeSformer on the exocentric view:

python tools/run_net.py --cfg configs/EgoExo/TimeSformer_divST_16x16_448.yaml OUTPUT_DIR ./outputs/448pFull/exo_all DATA.CAMERA_VIEW exo_all

Evaluation

Use the following command for evaluation. Replace <VIEW> with the camera view ('ego' or 'exo_all'), <PATH_PREFIX> with the relative path to the repo, and <BEST_CKPT> with the name of the best checkpoint.

python tools/run_net.py --cfg configs/EgoExo/TimeSformer_divST_16x16_448.yaml OUTPUT_DIR ./outputs/448pFull/<VIEW> DATA.CAMERA_VIEW <VIEW> TEST.CHECKPOINT_FILE_PATH <PATH_PREFIX>/ProficiencyEstimation/TimeSformer/outputs/448pFull/<VIEW>/checkpoints/<BEST_CKPT>.pyth TRAIN.ENABLE False TEST.SAVE_RESULTS_PATH preds.pkl
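To inspect the saved predictions, a sketch like the following computes top-1 and top-k accuracy, assuming preds.pkl stores a [predictions, labels] pair as in the SlowFast-style test loop this codebase derives from (verify the pickle's structure on your own output):

import pickle
import torch

# The [preds, labels] layout is an assumption from the SlowFast-style test
# loop — check the structure of your own preds.pkl.
with open("outputs/448pFull/ego/preds.pkl", "rb") as f:
    preds, labels = pickle.load(f)

preds = torch.as_tensor(preds)    # (num_videos, num_classes) scores
labels = torch.as_tensor(labels)  # (num_videos,) integer labels

k = min(5, preds.shape[1])
topk = preds.topk(k, dim=1).indices
acc1 = (topk[:, 0] == labels).float().mean()
acck = (topk == labels.unsqueeze(1)).any(dim=1).float().mean()
print(f"acc@1 = {acc1:.3f}, acc@{k} = {acck:.3f}")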

Finetuning

To finetune from an existing checkpoint, add the following options on the command line, or set them in the YAML config:

TRAIN.CHECKPOINT_FILE_PATH path_to_your_PyTorch_checkpoint
TRAIN.FINETUNE True
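For example, to finetune the ego model from a downloaded checkpoint (the output directory and checkpoint path here are illustrative):

python tools/run_net.py --cfg configs/EgoExo/TimeSformer_divST_16x16_448.yaml OUTPUT_DIR ./outputs/448pFull/ego_finetune DATA.CAMERA_VIEW ego TRAIN.CHECKPOINT_FILE_PATH ./checkpoints/timesformer_ego.pyth TRAIN.FINETUNE True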

Environment

The code was developed with Python 3.8.17. For training, we used a single GPU compute node with 8 Quadro RTX 6000 GPUs.
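If fewer GPUs are available, the config keys can be overridden on the command line in the same way as above. NUM_GPUS and TRAIN.BATCH_SIZE are standard keys in the upstream TimeSformer/SlowFast config system, but confirm them in the YAML; the values below are illustrative:

python tools/run_net.py --cfg configs/EgoExo/TimeSformer_divST_16x16_448.yaml OUTPUT_DIR ./outputs/448pFull/ego DATA.CAMERA_VIEW ego NUM_GPUS 2 TRAIN.BATCH_SIZE 4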

Acknowledgements

This codebase is built on top of TimeSformer by Gedas Bertasius and Lorenzo Torresani. We thank the authors for releasing their code.
