lance-deeplearning-recipes/examples/flickr8k-dataset at main · lancedb/lance-deeplearning-recipes


Creating Image Captioning Dataset for Multi-Modal Model Training

Overview

In this example, we create an image-caption pair dataset for multi-modal model training from the Flickr8k dataset and save it as a Lance dataset. Each row holds the image file name, all captions for that image (order preserved), and the image itself (in binary format).

Flickr8k is a benchmark collection for sentence-based image description and search, consisting of 8,000 images, each paired with five different captions that provide clear descriptions of the salient entities and events. The images were chosen from six different Flickr groups and tend not to contain any well-known people or locations; they were manually selected to depict a variety of scenes and situations.
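
The row layout described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. It assumes the captions file has lines of the form `<file>.jpg#<n><TAB><caption>`, and the column names (`image_id`, `image`, `captions`) and the helper names are hypothetical. The Lance write step uses `pyarrow` and `lance.write_dataset`, both of which need to be installed separately.

```python
from pathlib import Path


def group_captions(lines):
    """Group caption lines by image file name, keeping caption order.

    Assumption: each line looks like "<file>.jpg#<n>\t<caption>".
    """
    grouped = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, caption = line.split("\t", 1)
        file_name, _, index = key.partition("#")
        grouped.setdefault(file_name, []).append((int(index or 0), caption))
    # Sort by caption index only; the sort is stable, so ties keep file order.
    return {
        name: [caption for _, caption in sorted(items, key=lambda item: item[0])]
        for name, items in grouped.items()
    }


def build_rows(captions, image_dir):
    """Pair each image's raw bytes with its captions (hypothetical column names)."""
    rows = []
    for file_name, caps in captions.items():
        image_path = Path(image_dir) / file_name
        if not image_path.exists():
            continue  # skip captions whose image file is missing
        rows.append(
            {
                "image_id": file_name,             # image file name
                "image": image_path.read_bytes(),  # image in binary format
                "captions": caps,                  # all captions, order preserved
            }
        )
    return rows


def write_lance(rows, uri):
    """Write the rows to a Lance dataset at `uri` (requires pyarrow and lance)."""
    import lance
    import pyarrow as pa

    schema = pa.schema(
        [
            pa.field("image_id", pa.string()),
            pa.field("image", pa.binary()),
            pa.field("captions", pa.list_(pa.string())),
        ]
    )
    table = pa.Table.from_pylist(rows, schema=schema)
    return lance.write_dataset(table, uri)
```

A typical call sequence would read the captions file, pass its lines to `group_captions`, build rows with `build_rows(captions, image_dir)`, and hand the result to `write_lance(rows, "flickr8k.lance")`. The paths here are placeholders.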

Code and Blog

A Google Colab walkthrough accompanies this example; the notebook flickr8k-dataset.ipynb is in this directory.
