Releases: mlfoundations/open_clip
Pretrained Weights
This release tag is being used to host weights for various models trained with this codebase.
NOTE: The one included metric, zero-shot top-1 accuracy on ImageNet-1k, does not capture the full characteristics of the given pretrained weights. Evaluation on a broader set of zero-shot and validation tasks is required for a full comparison.
model | dataset | weights | ImageNet-1k zero-shot top-1 (%) |
---|---|---|---|
RN50 | CC12M | rn50-quickgelu-cc12m | 36.45 |
RN50 | YFCC15M | rn50-quickgelu-yfcc15m | 32.73 |
RN101 | YFCC15M | rn101-quickgelu-yfcc15m | 34.86 |
ViT-B-32 | LAION-400M | vit_b_32-quickgelu-laion400m_e31 | 62.96 |
ViT-B-32 | LAION-400M | vit_b_32-quickgelu-laion400m_e32 | 62.94 |
ViT-B-32 | LAION-2B | vit_b_32-laion2b_e16 | 65.62 |
ViT-B-16 | LAION-400M | vit_b_16-laion400m_e31 | 66.98 |
ViT-B-16 | LAION-400M | vit_b_16-laion400m_e32 | 67.07 |
ViT-B-16-plus-240 | LAION-400M | vit_b_16_plus_240-laion400m_e31 | 69.06 |
ViT-B-16-plus-240 | LAION-400M | vit_b_16_plus_240-laion400m_e32 | 69.21 |
ViT-L-14 | LAION-400M | vit_l_14-laion400m_e31 | 72.70 |
ViT-L-14 | LAION-400M | vit_l_14-laion400m_e32 | 72.77 |
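
The weights above can be loaded by name through the `open_clip` Python API. The sketch below shows zero-shot classification with one of the tabulated checkpoints; the `'laion2b_e16'` pretrained tag matches the ViT-B-32 / LAION-2B row above, and `'cat.png'` is a placeholder image path:

```python
import torch
from PIL import Image
import open_clip

# Load a pretrained model plus its train/val preprocessing transforms.
model, _, preprocess = open_clip.create_model_and_transforms(
    'ViT-B-32', pretrained='laion2b_e16')
model.eval()

image = preprocess(Image.open('cat.png')).unsqueeze(0)
text = open_clip.tokenize(['a photo of a cat', 'a photo of a dog'])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Normalize so the dot product is a cosine similarity.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(probs)  # probability of each caption matching the image
```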
Initial release.
Welcome to the initial release of open_clip, an open source implementation of OpenAI's CLIP (Contrastive Language-Image Pre-training).
The goal of this repository is to enable training models with contrastive image-text supervision and to investigate their properties, such as robustness to distribution shift. Our starting point is an implementation of CLIP that matches the accuracy of the original CLIP models when trained on the same dataset. A sketch of the contrastive objective follows below.
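
For context on what contrastive image-text supervision means in practice, here is a minimal sketch of the symmetric contrastive (InfoNCE) loss that CLIP-style training optimizes. This is an illustration under standard CLIP assumptions, not the exact loss code from this repository:

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_features: torch.Tensor,
                          text_features: torch.Tensor,
                          logit_scale: torch.Tensor) -> torch.Tensor:
    """Symmetric cross-entropy over image-text similarity logits."""
    # Normalize embeddings so similarities are cosine similarities.
    image_features = F.normalize(image_features, dim=-1)
    text_features = F.normalize(text_features, dim=-1)

    # Pairwise similarities for the batch, scaled by a learned temperature.
    logits_per_image = logit_scale * image_features @ text_features.T
    logits_per_text = logits_per_image.T

    # The i-th image matches the i-th text, so targets are the diagonal.
    labels = torch.arange(image_features.shape[0],
                          device=image_features.device)
    return (F.cross_entropy(logits_per_image, labels) +
            F.cross_entropy(logits_per_text, labels)) / 2
```

Each image is classified against every caption in the batch and vice versa, which is why larger batch sizes give a harder, more informative contrastive task.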