Request for the negative datasets #1
Comments
Thanks for your interest. The negative dataset is larger than 10 GB, which is why we chose not to upload it. You can create your own negative samples by introducing mutations, by treating unseen enzyme-reaction pairs as negatives, or by using homology alignments.
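For readers who want to try the "unseen pairs" option, here is a minimal sketch, not the authors' preparation code. The function name `sample_unseen_negatives`, its parameters, and the toy identifiers are hypothetical. It assumes the positives are available as `(enzyme, reaction)` tuples.

```python
import random


def sample_unseen_negatives(positive_pairs, negatives_per_positive=5, seed=0):
    """Draw negatives by pairing each enzyme with reactions it was never observed with.

    Hypothetical helper for illustration only; the repository's own
    negative-preparation code may work differently.
    """
    rng = random.Random(seed)
    positives = set(positive_pairs)
    # Sort so the sampling is reproducible for a fixed seed.
    reactions = sorted({reaction for _, reaction in positives})

    negatives = []
    for enzyme, _ in positive_pairs:
        # Candidate reactions that do not appear with this enzyme among the positives.
        candidates = [r for r in reactions if (enzyme, r) not in positives]
        # Cap the sample size when fewer candidates exist than requested.
        k = min(negatives_per_positive, len(candidates))
        negatives.extend((enzyme, r) for r in rng.sample(candidates, k))
    return negatives


pairs = [("E1", "R1"), ("E2", "R2"), ("E3", "R3")]
negatives = sample_unseen_negatives(pairs, negatives_per_positive=2)
```

Note that an unseen pair is only assumed to be negative. Some of these pairs may be true but unannotated enzyme-reaction matches, so the labels carry some noise.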
Thanks for your reply. I understand that the negative dataset is large (>10 GB) and that uploading it may not be feasible. However, to better replicate your results and ensure alignment with your experimental settings, I would like to confirm a few points:
You don't need the exact negative samples to reproduce our results, because our evaluation is retrieval-based, i.e., it uses only positive samples. If you want to duplicate the ratio, it is 1:1000 for both sequence and molecule.
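To illustrate why a retrieval-style evaluation needs only positive pairs, here is a rough sketch. The thread does not state which metric is used, so top-k accuracy is an assumption, and `top_k_accuracy` and the toy score matrix are hypothetical. Each query is ranked against every candidate, and the only label needed is the index of its known positive.

```python
def top_k_accuracy(scores, true_index, k=1):
    """Fraction of queries whose known positive is ranked in the top k.

    scores[i][j] is the model score of query i against candidate j;
    true_index[i] is the candidate known to be positive for query i.
    Illustrative metric only; the paper's exact metric may differ.
    """
    hits = 0
    for row, target in zip(scores, true_index):
        # Rank candidate indices from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return hits / len(scores)


# Toy example: 3 queries scored against 3 candidates.
scores = [
    [0.9, 0.1, 0.0],
    [0.2, 0.3, 0.5],
    [0.4, 0.6, 0.1],
]
true_index = [0, 1, 1]
top1 = top_k_accuracy(scores, true_index, k=1)
top2 = top_k_accuracy(scores, true_index, k=2)
```

In this setup every other candidate in the pool acts as an implicit negative, so no separate negative file is required at evaluation time.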
Thanks for your reply again!
Great work!
However, I’m a bit confused about the negative samples in your work,
even with the negative preparation codes provided.
Is the ratio of positive to negative samples set at 1:1000 for both training and testing?
Could you also provide the negative datasets as a benchmark, for reproduction and fair comparison?