Hi,
I’m trying to replicate the experiments and have successfully downloaded all the datasets except for the SBU Caption dataset.
The SBU Captioned Dataset is described as a collection of 1 million image URLs with associated captions sourced from Flickr.
Unfortunately:
The download link provided in this repo has expired.
The dataset's official website (https://www.cs.rice.edu/~vo9/sbucaptions/) is still online, but the image URLs listed there (e.g., http://static.flickr.com/2723/4385058960_b0f291553e.jpg) are no longer accessible: most return 403 errors or are permanently unavailable because the images have been deleted or access-restricted on Flickr.
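For anyone who wants to quantify this, below is a minimal sketch of how URL availability could be measured. It is not taken from this repo or from the authors' pipeline. The helper names `probe` and `summarize` and the file name `sbu_urls.txt` are hypothetical and used only for illustration. The sketch assumes a plain-text file with one image URL per line.

```python
import urllib.error
import urllib.request
from collections import Counter


def probe(url, timeout=10):
    """Return the HTTP status code for url, or 'error' on a network failure."""
    # A HEAD request avoids downloading the image body.
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses (e.g. 403, 404) are raised as HTTPError.
        return exc.code
    except (urllib.error.URLError, OSError):
        # DNS failures, refused connections, timeouts, and similar.
        return "error"


def summarize(urls, fetch=probe):
    """Count how many URLs produced each outcome (status code or 'error')."""
    return Counter(fetch(u) for u in urls)


# Hypothetical usage (performs real network requests, so it is left commented out):
# with open("sbu_urls.txt") as f:
#     sample = [line.strip() for line in f if line.strip()][:1000]
# print(summarize(sample))
```

Running this on a sample of the URL list gives a per-status count, which makes it easy to report what fraction of the images is actually still retrievable.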
I would like to ask:
How did you obtain this dataset for the Stage-2 multi-task fine-tuning in your recent experiments?
Would it be possible to share a copy of the preprocessed dataset (images) in a ZIP or other archive format to facilitate reproducibility?
Thank you very much for your help!