The section of code linked below breaks for file names that contain underscores, since it gets its data by splitting the file name. Would it be better to get that data some other way than by splitting the file name?
https://github.com/psilonpneuma/Zooniverse/blob/60b573fb901bd6c448960b781498077e4a79d4e2/create_subjects/pipeline.py#L48-L54
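
If the file names have to stay the source of that data, one option is to split from the right so that extra underscores at the start of the name are left alone. This is only a rough sketch: I'm assuming a name layout like `<prefix>_<onset>_<offset>.<ext>`, which may not match what the pipeline actually produces, and `parse_chunk_name` and `n_trailing_fields` are made-up names, not anything that exists in the repo.

```python
from pathlib import Path


def parse_chunk_name(filename, n_trailing_fields=2):
    """Split a file name into a free-form prefix and fixed trailing fields.

    Hypothetical helper: assumes the last `n_trailing_fields` underscore-separated
    fields carry the metadata and everything before them is the prefix, which
    may itself contain underscores.
    """
    stem = Path(filename).stem  # drop the directory and the extension
    # rsplit works from the right, so underscores in the prefix are preserved
    parts = stem.rsplit("_", n_trailing_fields)
    if len(parts) != n_trailing_fields + 1:
        raise ValueError(f"unexpected file name format: {filename!r}")
    return parts[0], parts[1:]


prefix, fields = parse_chunk_name("my_recording_01_1000_2000.wav")
```

With that assumed layout, `prefix` keeps its underscores and `fields` holds the two trailing values as strings. Passing the metadata alongside the file path instead of encoding it in the name would avoid the parsing entirely, but that is a bigger change.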