
Question on Reproducibility and Hyperparameter Settings #3

Open
zw-SIMM opened this issue Nov 2, 2024 · 0 comments
zw-SIMM commented Nov 2, 2024

I am trying to reproduce the results from the paper using the codebase and default settings in this repository, but I cannot reach the reported performance. Specifically, I trained with a 1:1000 negative-sampling ratio for both molecules and sequences. Across the different scripts (train.py, train_contra.py, train_rnn.py, and train_tfmr.py), the best Top-1-N accuracy I obtained is only around 0.10-0.20, which is significantly lower than the reported result of roughly 0.3.

For each model, I used the following parameters provided in your repository:

--mol_embedding_type unimol --pro_embedding_type esm --batch_size 1000
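
For clarity, a hypothetical invocation showing how I passed these flags (script name and flag values are as quoted above; every other hyperparameter was left at the repository default):

```shell
# Hypothetical command line reflecting the flags quoted in this issue.
# Learning rate, epochs, and optimizer settings were NOT overridden and
# therefore used whatever defaults the repository scripts define.
python train.py \
    --mol_embedding_type unimol \
    --pro_embedding_type esm \
    --batch_size 1000
```

The same flags were used for train_contra.py, train_rnn.py, and train_tfmr.py.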

Questions:
1. Did you use script-specific hyperparameters for each model (e.g., train.py, train_contra.py, train_rnn.py, train_tfmr.py)?
2. Could you please share the detailed hyperparameter settings (e.g., learning rate, batch size, number of epochs, optimizer parameters) used to achieve the reported results?
