scripts
.gitignore
README.md
anet.py
datasets.py
engine_for_finetuning.py
engine_for_pretraining.py
ensemble.py
functional.py
kinetics.py
mae.py
masking_generator.py
modeling_finetune.py
modeling_pretrain.py
optim_factory.py
rand_augment.py
random_erasing.py
run_class_finetuning.py
run_class_linear.py
run_mae_pretraining.py
run_mae_vis.py
ssv2.py
transforms.py
utils.py
video_transforms.py
vis.sh
vits.py
volume_transforms.py

README.md

VideoMAE

The code is modified from VideoMAE, and the following features have been added:

support adjusting the input resolution and the number of frames when fine-tuning (the original official codebase only supports adjusting the number of frames)
support applying repeated augmentation when pre-training
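
To see why adjusting the resolution or frame count matters, it helps to count the space-time tokens the backbone has to process. The sketch below is illustrative only: `num_tokens` is a hypothetical helper, not a function from this repository, and the defaults (16x16 spatial patches, tubelets spanning 2 frames) are assumed from the VideoMAE paper rather than read from this code.

```python
def num_tokens(num_frames, height, width, patch_size=16, tubelet_size=2):
    """Count the space-time tokens a ViT video backbone would see.

    patch_size and tubelet_size are assumptions (VideoMAE paper defaults),
    not values confirmed by this README.
    """
    if num_frames % tubelet_size or height % patch_size or width % patch_size:
        raise ValueError("input dimensions must be divisible by the patch/tubelet size")
    temporal = num_frames // tubelet_size
    spatial = (height // patch_size) * (width // patch_size)
    return temporal * spatial


print(num_tokens(16, 224, 224))  # 1568
print(num_tokens(32, 320, 320))  # 6400
```

Under these assumptions, doubling the frame count and raising the resolution from 224 to 320 roughly quadruples the sequence length, so memory use and positional embeddings both have to be accounted for when changing either setting.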

Installation

python 3.6 or higher
pytorch 1.8 or higher
timm==0.4.8/0.4.12
deepspeed==0.5.8
TensorboardX
decord
einops
opencv-python
(optional) petrel sdk (for reading the data on ceph)
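
A quick way to compare an environment against the list above is sketched here. It is not part of the repository; `check_environment` and `REQUIRED` are hypothetical names, the PyPI distribution names are assumed to match the packages listed, and `importlib.metadata` needs Python 3.8 or newer even though the codebase itself lists 3.6. The "1.8 or higher" bound for PyTorch is only reported, not enforced.

```python
import sys
from importlib import metadata  # standard library from Python 3.8 onward

# Distribution names as assumed on PyPI; pinned versions copied from the list above.
REQUIRED = {
    "torch": None,
    "timm": ("0.4.8", "0.4.12"),
    "deepspeed": ("0.5.8",),
    "tensorboardX": None,
    "decord": None,
    "einops": None,
    "opencv-python": None,
}


def check_environment(required=REQUIRED):
    """Return a dict mapping each package name to a status string."""
    report = {}
    for name, pinned in required.items():
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if pinned and version not in pinned:
            report[name] = f"{version} (expected one of {', '.join(pinned)})"
        else:
            report[name] = version
    return report


if __name__ == "__main__":
    print("python", sys.version.split()[0])
    for name, status in check_environment().items():
        print(f"{name}: {status}")
```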

ModelZoo

Backbone	Pretrain Data	Finetune Data	Epoch	#Frame	Pre-train	Fine-tune	Top-1	Top-5
ViT-B	UnlabeledHybrid	Kinetics-400	800	16 x 5 x 3	vit_b_hybrid_pt_800e.pth	vit_b_hybrid_pt_800e_k400_ft.pth	81.52	94.88
ViT-B	UnlabeledHybrid	K710*	800	16 x 5 x 3	same as above	vit_b_hybrid_pt_800e_k710_ft.pth	79.33	94.03
ViT-B	UnlabeledHybrid	Something-Something V2	800	16 x 2 x 3	same as above	vit_b_hybrid_pt_800e_ssv2_ft.pth	71.22	93.31

Note: K710 is the union of several versions of the Kinetics dataset (K400, K600, K700), with their label semantics aligned and videos that duplicate those in the validation sets removed. K710 contains 658k training videos and 67k validation videos.
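
The `#Frame` column in the table above can be read as frames per clip x temporal clips x spatial crops, which is the convention used in the VideoMAE model zoo; that reading is an assumption here, since this README does not spell it out. The sketch below uses hypothetical helpers (`parse_views`, `average_views`) to show how many views such an entry implies and one common way to combine them, plain score averaging. The repository's own evaluation code may combine views differently.

```python
def parse_views(spec):
    """Split a '#Frame' entry such as '16 x 5 x 3' into three integers."""
    frames, clips, crops = (int(part) for part in spec.lower().split("x"))
    return frames, clips, crops


def average_views(view_scores):
    """Average per-class scores over all views of one video."""
    num_views = len(view_scores)
    num_classes = len(view_scores[0])
    return [sum(view[c] for view in view_scores) / num_views
            for c in range(num_classes)]


frames, clips, crops = parse_views("16 x 5 x 3")
print(clips * crops)  # 15 views per video

# Two toy views over two classes, averaged into one prediction.
print(average_views([[1.0, 0.0], [0.0, 1.0]]))  # [0.5, 0.5]
```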

Others

Please refer to VideoMAE for the Data, Pretrain, and Finetune sections.
