Predicting the disease risk of protein mutation sequences with a pre-trained BERT model
This repository contains the data and code we used in our project on predicting the disease risk of protein mutation sequences.
(1) The 'model' directory contains the code for the BERT model we pre-trained for the BRCA1 gene. The pre-trained models are in the '/model/protein-bert/brct_out/' folder.
(2) The 'protein-embedding' directory contains the data and code for predicting the disease risk of mutation sequences.
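
To check that a local copy has the layout described above, a small script along these lines can list what each directory contains. This is only a sketch: the helper name `list_files` is hypothetical and not part of the repository, and it assumes that the leading '/' in '/model/protein-bert/brct_out/' refers to the repository root rather than the filesystem root.

```python
from pathlib import Path

# Directory names taken from the list above. Treating them as relative to
# the repository root is an assumption.
MODEL_DIR = Path("model") / "protein-bert" / "brct_out"
EMBEDDING_DIR = Path("protein-embedding")


def list_files(repo_root, subdir):
    """Return the sorted relative paths of all files under repo_root/subdir.

    Hypothetical helper for illustration. Returns an empty list when the
    directory does not exist.
    """
    base = Path(repo_root) / subdir
    if not base.is_dir():
        return []
    return sorted(
        p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file()
    )
```

Calling `list_files(".", MODEL_DIR)` from the repository root should list the pre-trained model files, and `list_files(".", EMBEDDING_DIR)` the data and code for prediction. An empty list means the directory was not found at the assumed location.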
If you have any questions about the code, please contact us. Thanks~ :)