This repository was archived by the owner on May 13, 2025. It is now read-only.

Data Preprocessing

Our preprocessing pipeline is implemented in the VideoPreprocessor module and consists of three stages. We provide scripts for both Linux and Windows; the description below assumes Linux.

  • Stage 1:
    Resample video to the specified FPS → Rescale → Center crop → Extract video stream (script).
  • Stage 2:
    Split video into n non-overlapping video segments (clips) → Temporal sampling → Save as PyTorch tensor (.pt format) (script).
  • Stage 3:
    Copy remaining files to the preprocessed directory (script).
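
As a rough illustration of the Stage 1 chain, the sketch below assembles an ffmpeg command that resamples, rescales, center-crops, and drops the audio track. It is not the project's actual script: the function name `build_stage1_command` and the values 25, 256, and 224 are hypothetical placeholders. Only standard ffmpeg filters (`fps`, `scale`, `crop`) and the `-an` flag are used.

```python
from typing import List


def build_stage1_command(src: str, dst: str, fps: int = 25,
                         height: int = 256, crop_size: int = 224) -> List[str]:
    """Build an ffmpeg command: resample FPS -> rescale -> center crop -> video only.

    All default values are illustrative assumptions, not the project's settings.
    """
    filters = ",".join([
        f"fps={fps}",                     # resample to the target frame rate
        f"scale=-2:{height}",             # set height, keep aspect ratio (even width)
        f"crop={crop_size}:{crop_size}",  # the crop filter centers its window by default
    ])
    # -an drops audio so that only the video stream is written; -y overwrites dst.
    return ["ffmpeg", "-y", "-i", src, "-vf", filters, "-an", dst]
```

The returned list can be passed directly to `subprocess.run`. Keeping the three filters in one `-vf` chain means the video is decoded and encoded only once.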

Common configurations

| Parameter | Default | Description |
| --- | --- | --- |
| device | "cpu" | Device used for preprocessing video with ffmpeg. Using a GPU requires ffmpeg built with GPU acceleration. |
| cpu_ratio | 0.5 | Ratio between CPU and GPU utilization when device is both. |
| save_root | "preprocessed" | Output root of the preprocessed videos. |
| root | -- | Root of the dataset, which is read by the VideoFolderDataset class. |
| loader | "v6" | Video loader API (check this). |
| batch_size | 48 | Dataloader batch size. |
| processes | os.cpu_count() // 2 | Number of processes for multiprocessing. |
| fn_name | -- | Which preprocessing stage to run. |
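
To make the defaults easier to see in one place, here is a minimal sketch that gathers them into a dataclass. The class name `PreprocessConfig` and the helper `split_workload` are hypothetical and do not come from the repository. In particular, `split_workload` shows only one possible reading of `cpu_ratio` (the fraction of videos sent to the CPU); the actual module may interpret it differently.

```python
import os
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class PreprocessConfig:
    """Common options with the defaults listed in the table (hypothetical container)."""
    device: str = "cpu"
    cpu_ratio: float = 0.5
    save_root: str = "preprocessed"
    root: Optional[str] = None       # no default given; must be supplied
    loader: str = "v6"
    batch_size: int = 48
    # os.cpu_count() may return None, so fall back to a single process.
    processes: int = field(default_factory=lambda: max(1, (os.cpu_count() or 2) // 2))
    fn_name: Optional[str] = None    # no default given; selects the stage to run


def split_workload(paths: List[str], cpu_ratio: float) -> Tuple[List[str], List[str]]:
    """Assumed semantics: the first cpu_ratio share goes to the CPU, the rest to the GPU."""
    n_cpu = int(len(paths) * cpu_ratio)
    return paths[:n_cpu], paths[n_cpu:]
```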

Stage 1 only

| Parameter | Default | Description |
| --- | --- | --- |
| run_async | False | Run ffmpeg asynchronously. |
| wait_time | 20 | Waiting time when running asynchronously. |
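
The sketch below shows one way these two options could interact. It assumes that `wait_time` is the number of seconds to sleep between checks on the running ffmpeg processes, which the table does not confirm. The function `run_commands` is hypothetical and relies only on the standard `subprocess` module.

```python
import subprocess
import sys
import time
from typing import List, Sequence


def run_commands(commands: Sequence[List[str]], run_async: bool = False,
                 wait_time: float = 20.0) -> List[int]:
    """Run commands one by one, or launch them all and poll until every one exits."""
    if not run_async:
        # Synchronous mode: each command finishes before the next one starts.
        return [subprocess.run(cmd).returncode for cmd in commands]
    # Asynchronous mode: start everything, then check back every wait_time seconds
    # (assumed meaning of wait_time).
    procs = [subprocess.Popen(cmd) for cmd in commands]
    while any(p.poll() is None for p in procs):
        time.sleep(wait_time)
    return [p.returncode for p in procs]


if __name__ == "__main__":
    # Stand-in commands; in practice these would be ffmpeg invocations.
    demo = [[sys.executable, "-c", "pass"]] * 2
    print(run_commands(demo, run_async=True, wait_time=0.1))
```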

Stage 2 only

| Parameter | Default | Description |
| --- | --- | --- |
| del_prev_result | False | Delete the previous stage's result. |
| include_labeled | `` | Also split labeled videos into segments that will be used for train/val later on. |
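
The core of Stage 2, splitting into n non-overlapping clips followed by temporal sampling, can be sketched on frame indices alone. The functions below are hypothetical and assume equal-length clips and uniform sampling; the real module may use a different sampling strategy.

```python
from typing import List


def split_into_clips(num_frames: int, n_clips: int) -> List[range]:
    """Split frame indices [0, num_frames) into n equal, non-overlapping clips."""
    clip_len = num_frames // n_clips  # trailing frames that do not fit are dropped
    return [range(i * clip_len, (i + 1) * clip_len) for i in range(n_clips)]


def sample_uniform(clip: range, n_samples: int) -> List[int]:
    """Pick n_samples evenly spaced frame indices from one clip (assumed strategy)."""
    step = len(clip) / n_samples
    return [clip[int(i * step)] for i in range(n_samples)]


if __name__ == "__main__":
    for clip in split_into_clips(num_frames=100, n_clips=4):
        print(sample_uniform(clip, n_samples=5))
```

The frames selected for each clip could then be stacked into a tensor and written with `torch.save` to produce the `.pt` files.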

Stage 3 only

| Parameter | Default | Description |
| --- | --- | --- |
| vid_ext | -- | List of video file extensions to ignore when moving the remaining files. |
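
A minimal sketch of what Stage 3 might look like: walk the dataset root and copy every file whose extension is not listed in `vid_ext`, preserving the directory layout. The function `copy_remaining_files` is hypothetical. It also assumes that extensions are matched case-insensitively and may be given with or without a leading dot.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def copy_remaining_files(root: str, save_root: str, vid_ext: Iterable[str]) -> List[Path]:
    """Copy every non-video file from root into save_root, keeping relative paths."""
    skip = {ext.lower().lstrip(".") for ext in vid_ext}
    src_root, dst_root = Path(root), Path(save_root)
    copied = []
    # sorted() lists all files before copying starts, so new copies are never revisited.
    for src in sorted(src_root.rglob("*")):
        if not src.is_file() or src.suffix.lower().lstrip(".") in skip:
            continue  # skip directories and files with an ignored video extension
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves file metadata
        copied.append(dst)
    return copied
```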