Add beam search and fix generate #392

Closed
gpucce wants to merge 51 commits

Conversation

@gpucce (Contributor) commented Jan 29, 2023

This should add beam search, fix generate, and partly address #390.

Not ready yet.

rom1504 and others added 30 commits on December 20, 2022 at 23:17
* initial setup

* add coca loss

* remove loss from the model

* fix loss

* add underscores

* name changes

* add cross attention to Residual and CustomResidual

* fix if

* add transformer 'decoder'

* minor fix

* looks better

* initialize coca model structure

* clean

* typo and format

* checkpoint signature

* adjust multimodal decoder and add CoCaTransformer

* keep older logic

* remove chunk

* typo

* fix

* make chunk dim explicit

* adjust cfg names

* add attentionalpooling

* add attentional pooling to coca

* small change

* add cocatransformer variants and AttentionPooling

* remove older attention pooler

* adapt embed text to coca text transformer

* rm coca layers

* rename and remove useless CoCa models

* make attentionpooler pooler only

* refactor for one transformer only

* coca forward works

* separate context and n_queries

* add initial coca_base config

* remove config

* small loss change

* init training file

* make variable order right

* remove print

* uniform names

* renaming

* add coca funcs to init

* add coca config and exclude from testing

* add and comment simple test (no trained model)

* add L2 norm

* make L2 same as in clip

* remove unused temperature

* type

* clean

* fix config

* rename and move cfg

* rename

* tentatively add coca to factory

* fix config

* update config

* embed contrastive cls token in model

* remove unused arg

* import create_loss

* make factory accept coca

* make caption loss distributed

* make loss customizable

* pass loss through training_epoch

* add coca specific params to params

* removed decoder unused parameters

* remove unused attributes

* adjust coca_config

* fix config and remove unused parameters

* remove comment

* remove more comments

* rename attention pooler

* rename TransformerDecoder

* make AttentionalPooler clearer

* add local loss logic to cocaloss

* only create loss if train in data

* remove wrong file

* fix attentional pooler call

* not ready for testing

* really not ready for testing

* eof line

* uniform names

* add possible generative loss to evaluate

* change _build function names

* remove wrong import

* remove local_loss from captioning loss

* indexing error

* finish renaming

* adjust configs

* add training test for coca

* simplify captioning loss

* remove hf

* fix evaluate and loss

* remove print

* move projection

* add coca vit 32 config

* test on new config

* adjust coca_base config

* remove coca from test_inference

* maybe fix regression test

* make logits and labels contiguous

* simpler logic

* make contiguous after transpose

* last test

* try fix loss

* CoCa PR: loss fix + rename file

* wait for feedback on this

* cleanup

* CoCa PR: add set_grad_checkpointing + fix checkpoint API

* CoCa PR: fix eval (which uses encode_x instead of forward)

* move making space for CLS token into encode_text

* revert zs changes + fix

Co-authored-by: gpucce <[email protected]>
Co-authored-by: gpucce <[email protected]>
Co-authored-by: iejmac <[email protected]>
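
Taken together, the loss commits in this batch (add coca loss, make caption loss distributed, simplify captioning loss) amount to pairing CLIP's symmetric contrastive objective with a token-level captioning cross-entropy. A minimal single-GPU sketch of that combination; the class and argument names (`CoCaLoss`, `caption_loss_weight`, `clip_loss_weight`, `pad_id`) are illustrative, not necessarily the PR's:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoCaLoss(nn.Module):
    def __init__(self, caption_loss_weight=1.0, clip_loss_weight=1.0, pad_id=0):
        super().__init__()
        self.caption_loss_weight = caption_loss_weight
        self.clip_loss_weight = clip_loss_weight
        # ignore_index skips padding tokens when scoring captions
        self.caption_loss = nn.CrossEntropyLoss(ignore_index=pad_id)

    def forward(self, image_features, text_features, logits, labels, logit_scale):
        # contrastive part: symmetric InfoNCE over the batch
        logits_per_image = logit_scale * image_features @ text_features.T
        targets = torch.arange(logits_per_image.shape[0], device=image_features.device)
        clip_loss = (
            F.cross_entropy(logits_per_image, targets)
            + F.cross_entropy(logits_per_image.T, targets)
        ) / 2
        # captioning part: next-token cross-entropy over the decoder output
        caption_loss = self.caption_loss(
            logits.permute(0, 2, 1),  # (batch, vocab, seq), as CrossEntropyLoss expects
            labels,
        )
        return self.clip_loss_weight * clip_loss + self.caption_loss_weight * caption_loss
```

The distributed variant ("make caption loss distributed") would additionally gather features across workers before the contrastive term, as CLIP training does.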
* build_cls_mask

* add cls_mask to encode_text

* add model properties

Co-authored-by: Romain Beaumont <[email protected]>
Co-authored-by: gpucce <[email protected]>
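
A minimal sketch of what a `build_cls_mask` helper plausibly does, assuming the CLS token is appended at the end of the text sequence and `pad_id` marks padding: grow the attention mask so the CLS position cannot attend to pad tokens. The per-head repeat a real attention layer needs is omitted:

```python
import torch
import torch.nn.functional as F

def build_cls_mask(text: torch.Tensor, cast_dtype: torch.dtype, pad_id: int = 0):
    # (batch, 1, seq): True where the token is real, False where it is padding
    cls_mask = (text != pad_id).unsqueeze(1)
    # grow the mask to (batch, seq + 1, seq + 1) to cover the appended CLS token
    cls_mask = F.pad(cls_mask, (1, 0, cls_mask.shape[2], 0), value=True)
    # additive mask: 0 where attention is allowed, -inf where it is blocked
    additive_mask = torch.zeros(cls_mask.shape, dtype=cast_dtype, device=cls_mask.device)
    additive_mask.masked_fill_(~cls_mask, float("-inf"))
    return additive_mask
```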
* add ignore_index

* just need to pick right index

Co-authored-by: gpucce <[email protected]>
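
The ignore_index commits use the standard cross-entropy mechanic: label positions equal to `ignore_index` contribute nothing to the loss or its gradient. A toy example, with `pad_id = 0` as an assumption:

```python
import torch
import torch.nn.functional as F

pad_id = 0
logits = torch.randn(2, 7, 512)                 # (batch, seq, vocab)
labels = torch.tensor([[4, 9, 2, pad_id, pad_id, pad_id, pad_id],
                       [7, 3, 8, 1, 5, pad_id, pad_id]])
# padded positions are skipped entirely; only real tokens are scored
loss = F.cross_entropy(logits.permute(0, 2, 1), labels, ignore_index=pad_id)
```

"Picking the right index" matters because whichever id doubles as padding must never also appear as a legitimate target token.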
* add initial generative support

* make generation context_length independent

* remove kwargs

* last positional embeddings for CLS

* typo

* fix mask len

* add comment

* remove unused args

* simpler logic for input shorter than context length

Co-authored-by: gpucce <[email protected]>
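
Read as a whole, the generative-support commits describe ordinary autoregressive decoding that also copes with prompts shorter than `context_length`. A hedged greedy-decoding sketch; the `model(image, tokens)["logits"]` call shape and the `sot_id`/`eos_id` defaults are assumptions for illustration, not the PR's actual interface:

```python
import torch

@torch.no_grad()
def generate_greedy(model, image, sot_id=49406, eos_id=49407, max_seq_len=76):
    # start every sequence with the start-of-text token
    tokens = torch.full((image.shape[0], 1), sot_id, dtype=torch.long, device=image.device)
    for _ in range(max_seq_len):
        # assumed to return next-token logits of shape (batch, seq, vocab)
        logits = model(image, tokens)["logits"][:, -1]
        next_token = logits.argmax(dim=-1, keepdim=True)
        tokens = torch.cat([tokens, next_token], dim=1)
        if (next_token == eos_id).all():
            break
    return tokens
```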
* use self.text in encode image

* unused var

* revert Attention and CustomResidualAttentionBlock

* remove blank line

* add dict output

* integrate self.text attributes

* HF compatibility

* better config and minor fixes

* clean

* remove embed_cls option from HF

* use cls_token_position

* fix cls masking

* resize labels

* text -> self.text

* split loss logging

* add total loss

* minor logs formatting

* fix generate

* simpler logic

* disentangle proj for HF too

* adjust config

* only norm cls

* move attn_pool to VisionTransformer

* adjust coca_base config

* fix grad checkpointing in MultimodalTransformer

Co-authored-by: gpucce <[email protected]>
Co-authored-by: iejMac <[email protected]>
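
Several commits above orbit the attentional pooler that CoCa puts on top of the vision tower (eventually moved into `VisionTransformer`): a fixed set of learned queries cross-attends over the image tokens, producing pooled features for the contrastive head and the captioning decoder. A sketch, with `nn.MultiheadAttention` standing in for whatever attention block the PR actually uses and all dimensions illustrative:

```python
import torch
import torch.nn as nn

class AttentionalPooler(nn.Module):
    def __init__(self, d_model: int, n_head: int = 8, n_queries: int = 256):
        super().__init__()
        # learned queries; the image tokens supply keys and values
        self.query = nn.Parameter(torch.randn(n_queries, d_model))
        self.attn = nn.MultiheadAttention(d_model, n_head, batch_first=True)
        self.ln_q = nn.LayerNorm(d_model)
        self.ln_k = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model) image token embeddings
        q = self.ln_q(self.query).unsqueeze(0).expand(x.shape[0], -1, -1)
        k = self.ln_k(x)
        pooled, _ = self.attn(q, k, k, need_weights=False)
        return pooled  # (batch, n_queries, d_model)
```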
Simpler beam search for now
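
This commit is the headline of the PR. A minimal, hedged sketch of what a "simpler beam search" looks like: keep the `num_beams` best partial captions, extend each with its top continuations, and re-rank by cumulative log-probability. `score_fn` (anything mapping token ids to per-position logits) and the special token ids are assumptions, not the PR's interface:

```python
import torch

@torch.no_grad()
def beam_search(score_fn, sot_id=49406, eos_id=49407, num_beams=4, max_seq_len=76, device="cpu"):
    # each beam is (tokens, cumulative log-prob, finished flag)
    beams = [(torch.tensor([sot_id], device=device), 0.0, False)]
    for _ in range(max_seq_len):
        candidates = []
        for tokens, score, done in beams:
            if done:
                candidates.append((tokens, score, True))  # carry finished beams as-is
                continue
            log_probs = score_fn(tokens.unsqueeze(0))[0, -1].log_softmax(-1)
            top_lp, top_ix = log_probs.topk(num_beams)
            for lp, ix in zip(top_lp.tolist(), top_ix.tolist()):
                new_tokens = torch.cat([tokens, torch.tensor([ix], device=device)])
                candidates.append((new_tokens, score + lp, ix == eos_id))
        # keep the num_beams highest-scoring hypotheses overall
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:num_beams]
        if all(done for _, _, done in beams):
            break
    return beams[0][0]  # highest-scoring sequence
```

Relative to greedy decoding, this spends roughly `num_beams`× more forward passes for a better chance of escaping locally probable but globally poor captions; a batched implementation would vectorize the per-beam loop.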
* make jit compilable

* redundant annotation

* less tests

* less annotations

* even less annotations

* fix name check in ci

* some annotations back

* make it simpler

* make hf simpler too

* better jit support with tests

* remove extra line

* add customtextclip

* more jit tests

* missing assert

* add eval

* typo

* revert forward changes

* clean coca model

* more cleaning

* last cleaning

* add README

* multimodal_cfg info

* multimodal

* remove output_dict argument

* cleaner

* do same thing for _encode_image

* encoder

* try this

* adjust inference tests

* fix syntax

* True not None

* dumb
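
The TorchScript commits earlier in this stack ("redundant annotation", "less annotations", "some annotations back") reflect the usual scripting back-and-forth: `torch.jit.script` demands explicit types in some places and rejects them in others. A tiny illustrative example of the `Optional` refinement it insists on; the module is invented for demonstration:

```python
import torch
from typing import Optional

class Scriptable(torch.nn.Module):
    def forward(self, x: torch.Tensor, mask: Optional[torch.Tensor] = None) -> torch.Tensor:
        # TorchScript requires narrowing an Optional before using it
        if mask is not None:
            x = x + mask
        return x

# without the Optional annotation, scripting this forward would fail
scripted = torch.jit.script(Scriptable())
```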
@gpucce closed this Jan 29, 2023