Support a sampling strategy for multiple training datasets #107
This PR proposes the debiased sampling method from the ZeroVL paper. When training on multiple datasets, debiased sampling improves the accuracy of the CLIP model. It adds a new sampling flag.
Introduction of Debiased Sampling
As shown in Fig. 2, random sampling is the most intuitive sampling method: it constructs training batches at random from all available data. However, as shown in Fig. 3, random sampling leads to biased feature distributions in both the image and text modalities.
Debiased sampling ensures that all instances within a batch come from the same dataset. Training with debiased sampling improves the quality of the learned representations and contributes to better results on many downstream tasks.
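The batch-construction idea described above can be sketched as a scheduler that shuffles each dataset independently and then interleaves single-source batches. This is an illustrative sketch only, not the PR's actual implementation; the function and variable names are mine:

```python
import random

def debiased_batches(dataset_sizes, batch_size, seed=0):
    """Yield batches of (dataset_id, sample_index) pairs.

    Every batch is drawn from exactly one dataset, so per-batch feature
    statistics are never mixed across sources (the "debiased" property).
    Incomplete trailing batches are dropped.
    """
    rng = random.Random(seed)
    # Shuffle each dataset's sample indices independently.
    pools = [rng.sample(range(n), n) for n in dataset_sizes]
    # One schedule entry per full batch each dataset can provide.
    schedule = [d for d, n in enumerate(dataset_sizes)
                for _ in range(n // batch_size)]
    # Interleave datasets at the batch level, not the sample level.
    rng.shuffle(schedule)
    offsets = [0] * len(dataset_sizes)
    for d in schedule:
        start = offsets[d]
        offsets[d] += batch_size
        yield [(d, i) for i in pools[d][start:start + batch_size]]
```

Because batches from different datasets are shuffled against each other, the model still sees all sources throughout training; only the within-batch composition is constrained.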
Experiments on sampling methods
We use two datasets, CC3M and SBU, to demonstrate the improvement from debiased sampling.
Experiment1: Random Sampling
1. Setting & Acc
dataset: CC3M + SBU (2.79M + 0.86M)
batchsize: 2048 (256 per GPU, 8 V100 32GB)
learning rate: 1e-3
weight decay: 0.1
sampling: random
zero-shot accuracy on ImageNet: top-1 21.36, top-5 40.98
2. Training script
'/data/cc3m/cc3m_sbu_train_anno.csv' contains all samples from CC3M and SBU.
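A combined annotation file like this can be produced by concatenating the per-dataset annotation CSVs, keeping a single header row. A minimal sketch; the `merge_annotations` helper and the `filepath,title` columns are assumptions, not necessarily the repo's actual format:

```python
import csv

def merge_annotations(in_paths, out_path):
    """Concatenate annotation CSVs that share a header into one file.

    The header is taken from the first input file and written once;
    subsequent files contribute only their data rows.
    """
    header = None
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path in in_paths:
            with open(path, newline="") as f:
                reader = csv.reader(f)
                file_header = next(reader)  # skip each file's header row
                if header is None:
                    header = file_header
                    writer.writerow(header)
                writer.writerows(reader)
```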
3. Log
cc3m+sbu+random_sample.log
Experiment2: Debiased Sampling
1. Setting & Acc
dataset: CC3M + SBU (2.79M + 0.86M)
batchsize: 2048 (256 per GPU, 8 V100 32GB)
learning rate: 1e-3
weight decay: 0.1
sampling: debias
zero-shot accuracy on ImageNet: top-1 22.33, top-5 42.29
2. Training script
3. Log
cc3m+sbu+debias_sample.log