./download_example_data.shThis script will download the following files for training, mutation prediction, and sequence generation:
datasets/sequences/BLAT_ECOLX_1_b0.5_lc_weights.fadatasets/nanobodies/Manglik_filt_seq_id80_id90.fadatasets/nanobodies/Manglik_labelled_nanobodies.txtcalc_logprobs/input/BLAT_ECOLX_r24-286_Ranganathan2015.fasess/BLAT_ECOLX_v2_channels-48_rseed-11_19Aug16_0626PM.ckpt-250000*sess/nanobody.ckpt-250000*
./download_sequences.shThis script will download all training sequences from the paper
to datasets/sequences/ and datasets/nanobodies/
./download_effect_predictions.shThis script will download all mutation effect predictions
from the paper to calc_logprobs/output/
./download_generated_nanobodies.shThis script will download the designed nanobody library to generated/
./demo_train.shThis script will run 100 training iterations on the β-Lactamase sequence dataset
(the full model runs 250,000 iterations).
The final model checkpoint will appear as three files in
sess/BLAT_ECOLX_elu_channels-48_rseed-11_<timestamp>.ckpt-100*
On an AWS p2.xlarge instance, this demonstration took 6 minutes.
./demo_calc_logprobs.shThis script will use the pretrained model weights in
sess/BLAT_ECOLX_v2_channels-48_rseed-11_19Aug16_0626PM.ckpt-250000*
to make mutation effect predictions for the β-Lactamase mutational scan from
Stiffler et al., Cell, 2015.
The final predictions are the average of 10 predictions
(500 are used in the full test).
These predictions will appear in
calc_logprobs/output/demo_BLAT_ECOLX_r24-286_Ranganathan2015_rseed-11_channels-48_dropoutp-0.5.csv
On an AWS p2.xlarge instance, this demonstration took 7 minutes.
./demo_generate.shThis will generate nanobody CDR3 and FRA4 sequences given a preceding VH sequence.
The full nanobody sequences will be output in
generated/nanobody.ckpt-250000_temp-1.0_rseed-42.fa
On an AWS p2.xlarge instance, this demonstration took 6 minutes.