Migrate train_bi-encoder_mnrl.py from v2 to v3 #3634
ritoban23 wants to merge 1 commit into huggingface:main from
Conversation
|
I'm afraid this won't work very nicely. The
|
@tomaarsen I'm thinking to:
Question: the original MSMARCODataset class rotates through multiple positives/negatives per query across batches (using pop/append). Should I:
|
I think you've effectively found what made this file so hard to upgrade: the MSMARCODataset pos/neg rotations. If we just take the first pos/neg and run 10 epochs like the script currently does, then we'll likely get worse performance than the old script. The
I think something like that could work. What are your impressions?
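For reference, the pop/append rotation discussed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code from the script: the class name `RotatingTriplets`, the dictionary layout, and the sample texts are hypothetical, and every query is assumed to have at least one positive and one negative.

```python
class RotatingTriplets:
    """Hypothetical stand-in that mimics a pop/append rotation per query."""

    def __init__(self, queries):
        # Assumed layout: {qid: {"query": str, "pos": [...], "neg": [...]}}
        self.qids = list(queries)
        # Copy the lists so the rotation does not mutate the caller's data.
        self.queries = {
            qid: {"query": q["query"], "pos": list(q["pos"]), "neg": list(q["neg"])}
            for qid, q in queries.items()
        }

    def __len__(self):
        return len(self.qids)

    def __getitem__(self, idx):
        entry = self.queries[self.qids[idx]]
        # Take the first positive and move it to the end of its list.
        pos = entry["pos"].pop(0)
        entry["pos"].append(pos)
        # Same rotation for the negatives.
        neg = entry["neg"].pop(0)
        entry["neg"].append(neg)
        return entry["query"], pos, neg


data = RotatingTriplets(
    {"q1": {"query": "what is mnrl", "pos": ["p1", "p2"], "neg": ["n1", "n2", "n3"]}}
)
# Reading the same index four times stands in for four epochs.
epochs = [data[0] for _ in range(4)]
for triplet in epochs:
    print(triplet)
```

Each read of the same index returns a different (positive, negative) pair until the lists wrap around, which is the behaviour a fixed "first pos/neg" dataset would lose when trained for several epochs.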
|
Part of #3621
Changes
Migrate examples/sentence_transformer/training/ms_marco/train_bi-encoder_mnrl.py from v2 to v3
Replace model.fit() with SentenceTransformerTrainer and SentenceTransformerTrainingArguments