-
Notifications
You must be signed in to change notification settings - Fork 3.1k
Closed
Description
Describe the bug
Hello,
I would like to upload a video dataset (some .mp4 files and some segments within them), i.e. rows correspond to subsequences from videos. Videos might be referenced by several rows.
I created a dataset locally and it references the videos and the video readers can read them correctly. I use push_to_hub() to upload the dataset to the hub.
Expectation: A user uses load_dataset and can load the videos.
However, the videos seem to be just referenced via paths on the computer and not uploaded to the hub. Therefore a target user cannot load the videos in the dataset.
Steps to reproduce the bug
- create a video dataset with paths e.g. { ["videos"]: [path1, path2, ...]}
- dataset.push_to_hub
- on a different computer (or same pc if relative paths are used in a different folder):
dataset = load_dataset("siplab/egosim", split="train")
video = dataset[0]["video_head"]
- will fail
Expected behavior
Expectation: A user uses load_dataset and can load the videos.
Environment info
datasets 3.1.0
Python 3.8.18
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels