🧬🧑🏾💻 Riboclette: Conditional Deep Learning Model Reveals Translation Elongation Determinants during Amino Acid Deprivation
Welcome to Riboclette, a transformer-based deep learning model for predicting ribosome densities under various nutrient-deprivation conditions. Follow this tutorial to get started! 🚀
Riboclette can be installed as a Python package. With it, you can make predictions on new gene sequences and obtain model-derived attributions to understand those predictions!
🧀 Package Documentation: Riboclette on PyPI
pip install riboclette
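After installing, one quick way to confirm that the package is visible to your interpreter is to query its installed version. This is a minimal sketch using only the standard library. The helper name `installed_version` is hypothetical; the only thing taken from the step above is the distribution name `riboclette`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("riboclette")
    if found is None:
        print("riboclette is not installed; run: pip install riboclette")
    else:
        print(f"riboclette {found} is installed")
```

Querying the package metadata avoids importing the package itself, so the check stays fast even if the package pulls in heavy dependencies.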
We provide a web-based server where you can explore codon-level attributions for different genes in the dataset. This server allows you to visualize and analyze the model's predictions and interpretability results interactively.
🔗 Server Link: Ribotly
On the server, you can:
- Select genes of interest from the dataset.
- View codon-level attributions for each gene.
- Analyze how nutrient-deprivation conditions affect ribosome densities at single-codon resolution.
Download the processed data and the pre-trained model checkpoints from the following link:
After downloading:
- Place the data in the riboclette/data/ folder. 📁
- Place the checkpoints in the riboclette/checkpoints/ folder. ✅
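Before running the later steps, it can help to confirm that both folders exist and are not empty. The sketch below is one way to do that. The function name `check_layout` is hypothetical, and it assumes the script is run from the directory that contains `riboclette/`.

```python
from pathlib import Path
from typing import Dict, Iterable


def check_layout(root: Path, subfolders: Iterable[str] = ("data", "checkpoints")) -> Dict[str, bool]:
    """Map each expected subfolder to True if it exists and holds at least one entry."""
    status = {}
    for name in subfolders:
        folder = root / name
        status[name] = folder.is_dir() and any(folder.iterdir())
    return status


if __name__ == "__main__":
    # Assumption: the current working directory contains the riboclette/ folder.
    for name, ok in check_layout(Path("riboclette")).items():
        print(f"riboclette/{name}/: {'ok' if ok else 'missing or empty'}")
```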
Run the data pre-processing pipeline with the following commands:
cd /riboclette/preprocessing
python processing.py
Train the Riboclette model using the following command:
cd /riboclette/models/xlnet/dh
python train.py
To perform pseudolabeling, first train five Riboclette models, one per seed (1, 2, 3, 4, and 42):
cd /riboclette/models/xlnet/dh
for seed in 1 2 3 4 42; do python train.py --seed "$seed"; done
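If you prefer to drive the seed runs from Python rather than from the shell, the sketch below builds one command per seed and runs them in sequence. The `train.py` script and its `--seed` flag come from the step above. The helper names and the `dry_run` switch are hypothetical additions for illustration.

```python
import subprocess
import sys
from typing import List, Sequence

SEEDS = (1, 2, 3, 4, 42)  # the seeds listed in the step above


def seed_commands(seeds: Sequence[int] = SEEDS, script: str = "train.py") -> List[List[str]]:
    """Build one `python train.py --seed N` argument list per seed."""
    return [[sys.executable, script, "--seed", str(seed)] for seed in seeds]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Run the commands one after another, stopping at the first failure."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # dry_run=True only prints the commands. Set it to False and run from
    # riboclette/models/xlnet/dh to actually launch training.
    run_all(seed_commands(), dry_run=True)
```

Running the seeds sequentially keeps the sketch simple. Whether they can safely run in parallel depends on your hardware and on how the training script writes its checkpoints, which is not described here.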
Once all seed models are trained, generate the pseudolabeling dataset by executing the notebook:
cd /riboclette/preprocessing
jupyter nbconvert --to notebook --execute plabeling.ipynb
Train the pseudolabeling-based model by executing the training notebook:
cd /riboclette/models/xlnet/plabel
jupyter nbconvert --to notebook --execute train.ipynb
Generate codon-level interpretations for all sequences in the test set:
cd /riboclette/models/xlnet/plabel
python LIGInterpret.py
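The script name suggests an integrated-gradients style attribution method, but that is an inference from the name rather than something stated here. As a rough illustration of what a per-position attribution is, the sketch below implements plain integrated gradients in pure Python on a toy function. The toy model, its weights, and all function names are hypothetical and unrelated to Riboclette's actual code.

```python
from typing import Callable, List, Sequence


def integrated_gradients(
    grad_fn: Callable[[Sequence[float]], List[float]],
    x: Sequence[float],
    baseline: Sequence[float],
    steps: int = 50,
) -> List[float]:
    """Approximate integrated gradients with a midpoint Riemann sum."""
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval of [0, 1]
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(grad_fn(point)):
            totals[i] += g
    # Scale the averaged gradient by the input's distance from the baseline.
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]


# Toy "model": F(x) = sum_i w_i * x_i^2, whose gradient is 2 * w_i * x_i.
WEIGHTS = [0.5, -1.0, 2.0]


def toy_model(x: Sequence[float]) -> float:
    return sum(w * xi * xi for w, xi in zip(WEIGHTS, x))


def toy_grad(x: Sequence[float]) -> List[float]:
    return [2.0 * w * xi for w, xi in zip(WEIGHTS, x)]


if __name__ == "__main__":
    x, baseline = [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]
    attributions = integrated_gradients(toy_grad, x, baseline)
    print("attributions:", attributions)
    # Completeness property: the attributions sum to F(x) - F(baseline).
    print("sum:", sum(attributions), "delta:", toy_model(x) - toy_model(baseline))
```

In the toy example each input position gets a signed score, and the scores add up to the change in the model output relative to the baseline. Codon-level attributions can be read in the same spirit: one score per codon position.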
Generate motifs derived from random windows chosen from the full dataset:
cd /riboclette/models/xlnet/plabel
python beamSearch.py
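How the script scores candidate motifs is not described here. For readers unfamiliar with the technique, the following is a generic beam search sketch over short codon sequences: at each step every kept candidate is extended by one symbol, and only the top-scoring candidates survive. The alphabet, the scoring function, and all names are hypothetical placeholders, not the repository's implementation.

```python
from typing import Callable, List, Sequence, Tuple


def beam_search(
    alphabet: Sequence[str],
    length: int,
    score_fn: Callable[[Tuple[str, ...]], float],
    beam_width: int = 3,
) -> List[Tuple[Tuple[str, ...], float]]:
    """Return up to beam_width sequences of the given length, best score first."""
    beam: List[Tuple[str, ...]] = [()]  # start from the empty sequence
    for _ in range(length):
        # Extend every kept candidate by every symbol in the alphabet.
        candidates = [seq + (sym,) for seq in beam for sym in alphabet]
        # Keep only the highest-scoring extensions.
        candidates.sort(key=score_fn, reverse=True)
        beam = candidates[:beam_width]
    return [(seq, score_fn(seq)) for seq in beam]


# Hypothetical per-codon scores standing in for a model-derived signal.
TOY_SCORES = {"AAA": 1, "GAA": 9, "CTG": 4}


def toy_score(seq: Tuple[str, ...]) -> float:
    return sum(TOY_SCORES[sym] for sym in seq)


if __name__ == "__main__":
    results = beam_search(list(TOY_SCORES), length=3, score_fn=toy_score, beam_width=2)
    for seq, score in results:
        print("-".join(seq), score)
```

A wider beam explores more candidates at a higher cost. With an additive score like the toy one, even a narrow beam finds the best sequence, but that does not hold for scoring functions in general.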
Recreate the figures from the Riboclette paper using the downstream analysis scripts provided in the repository. These scripts analyze the model outputs and generate the corresponding figures.
- Navigate to the downstream analysis folder:
  cd /riboclette/downstream_analysis
- Run the analysis scripts to generate the respective figures:
  for n in 2 3 4 5; do python "figure${n}.py"; done
- The generated figures will be saved in the riboclette/data/results/figures/ folder. 🖼️
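To confirm that the figure scripts produced output, you can list the contents of the results folder. This is a small sketch with a hypothetical helper name, `list_figures`. It assumes you run it from the directory that contains `riboclette/` and makes no assumption about the figure file names or formats.

```python
from pathlib import Path
from typing import List, Tuple


def list_figures(folder: Path) -> List[Tuple[str, int]]:
    """Return (file name, size in bytes) for every file in folder, sorted by name."""
    if not folder.is_dir():
        return []
    return sorted((p.name, p.stat().st_size) for p in folder.iterdir() if p.is_file())


if __name__ == "__main__":
    # Folder path taken from the step above.
    figures = list_figures(Path("riboclette/data/results/figures"))
    if not figures:
        print("no figures found yet")
    for name, size in figures:
        print(f"{name}\t{size} bytes")
```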
🎉 You're all set! Follow these steps to fully utilize Riboclette for ribosome density prediction, interpretability, and downstream analysis. 🚀