This repository contains the CellBox scripts developed in the Sander lab.
Maintained by Bo Yuan, Judy Shen, and Augustin Luna.
If you want to discuss the usage or to report a bug, please use the 'Issues' function here on GitHub.
If you find CellBox useful for your research, please consider citing the corresponding publication.
bioRxiv: link
For more information, please find our contact information here.
Easily try CellBox online with Binder
- Go to: https://mybinder.org/v2/gh/dfci/CellBox/version_for_revision
- From the New dropdown, click Terminal
- Run the following command for a short example of the model training process:
python scripts/main.py -config=configs/Example.random_partition.json
Alternatively, run the same command in the project folder.
The following command will install cellbox from a particular branch using the '@' notation:
pip install git+https://github.com/dfci/CellBox.git@version_for_revision#egg=cellbox\&subdirectory=cellbox
Clone the repository and, in the cellbox folder, run:
python3.6 setup.py install
Only Python 3.6 is supported. Anaconda or pipenv is recommended for creating the Python environment.
Now you can test whether the installation was successful:
import cellbox
cellbox.VERSION
- node_index.txt: names of each protein/phenotypic node.
- expr_index.txt: information on each perturbation condition (also see loo_label.csv).
- expr.csv: protein expression data from RPPA for the protein nodes, plus phenotypic node values. Each row is a condition and each column is a node.
- pert.csv: perturbation strength and target of all perturbation conditions. Used as input for the differential equations.
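As a rough illustration of how these files relate, the sketch below writes tiny stand-in files to a temporary folder and checks that their shapes agree. The header-less CSV layout, the one-name-per-line format of node_index.txt, the assumption that pert.csv has the same conditions-by-nodes shape as expr.csv, and the helper names `read_matrix` and `load_and_check` are all illustrative guesses, not details confirmed here.

```python
import csv
import os
import tempfile


def read_matrix(path):
    """Read a header-less numeric CSV file into a list of rows (assumed layout)."""
    with open(path, newline="") as fh:
        return [[float(x) for x in row] for row in csv.reader(fh) if row]


def load_and_check(data_dir):
    """Load node names, expr.csv and pert.csv, and check that shapes agree."""
    # Assumption: node_index.txt holds one node name per line.
    with open(os.path.join(data_dir, "node_index.txt")) as fh:
        nodes = [line.strip() for line in fh if line.strip()]
    expr = read_matrix(os.path.join(data_dir, "expr.csv"))
    pert = read_matrix(os.path.join(data_dir, "pert.csv"))

    # Each row is a condition and each column is a node.
    if any(len(row) != len(nodes) for row in expr):
        raise ValueError("expr.csv columns do not match node_index.txt")
    # Assumption: pert.csv mirrors the shape of expr.csv.
    if len(pert) != len(expr) or any(len(row) != len(nodes) for row in pert):
        raise ValueError("pert.csv and expr.csv shapes differ")
    return nodes, expr, pert


# Tiny synthetic stand-in data: 3 conditions x 4 nodes (made-up values).
demo_dir = tempfile.mkdtemp()
demo_nodes = ["protA", "protB", "protC", "phenotype1"]
with open(os.path.join(demo_dir, "node_index.txt"), "w") as fh:
    fh.write("\n".join(demo_nodes) + "\n")
for name in ("expr.csv", "pert.csv"):
    with open(os.path.join(demo_dir, name), "w", newline="") as fh:
        csv.writer(fh).writerows([[0.0] * 4 for _ in range(3)])

nodes, expr, pert = load_and_check(demo_dir)
print(len(expr), "conditions x", len(nodes), "nodes")
```

A check like this can catch a mismatch between the node list and the data matrices before a training run starts.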
- CellBox is defined in model.py
- A dataset factory function for random partition and leave-one-out tasks
- Some training utility functions in TensorFlow
- Make sure to specify the experiment_id and experiment_type
  - experiment_id: name of the experiment, used to generate the results folder
  - experiment_type: currently available tasks are {"random partition", "leave one out (w/o single)", "leave one out (w/ single)", "full data", "single to combo"}
- Different training stages can be specified using stages and sub_stages in the config file
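The two required fields can be checked before launching a run. The following is a minimal sketch, not part of CellBox itself: the helper `check_config` and the example value `Example_RP` are hypothetical, and real config files contain more fields (including stages and sub_stages, whose structure is not described here).

```python
import json

# Task names as listed above.
EXPERIMENT_TYPES = [
    "random partition",
    "leave one out (w/o single)",
    "leave one out (w/ single)",
    "full data",
    "single to combo",
]


def check_config(cfg):
    """Return a list of problems found in a config dictionary (empty if none)."""
    problems = []
    for key in ("experiment_id", "experiment_type"):
        if key not in cfg:
            problems.append("missing key: " + key)
    exp_type = cfg.get("experiment_type")
    if exp_type is not None and exp_type not in EXPERIMENT_TYPES:
        problems.append("unknown experiment_type: " + repr(exp_type))
    return problems


# Hypothetical config text used only for this demonstration.
example = json.loads(
    '{"experiment_id": "Example_RP", "experiment_type": "random partition"}'
)
print(check_config(example))
print(check_config({"experiment_type": "random"}))
```

The first call reports no problems; the second reports both the missing experiment_id and the unrecognized task name.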
The experiment type configuration file is specified by --experiment_config_path or -config:
python scripts/main.py -config=configs/Example.random_partition.cfg.json
Note: always run the script in the root folder.
A random seed can also be assigned using the argument --working_index or -i:
python scripts/main.py -config=configs/Example.random_partition.cfg.json -i=1234
When training with leave-one-out validation, make sure to specify the index of the drug to leave out from training using --drug_index or -drug.
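To show how the three flags above fit together, here is a minimal argparse sketch. The flag names come from the text above; the argument types and default values are assumptions, and the actual parser in scripts/main.py may differ.

```python
import argparse


def build_parser():
    """Build a parser mirroring the flags described above (sketch only)."""
    parser = argparse.ArgumentParser(description="Sketch of the CellBox CLI flags")
    parser.add_argument("-config", "--experiment_config_path", required=True,
                        help="path to the experiment configuration file")
    # Assumption: the working index is an integer and defaults to 0.
    parser.add_argument("-i", "--working_index", type=int, default=0,
                        help="random seed")
    # Assumption: the drug index is an integer, needed only for leave-one-out.
    parser.add_argument("-drug", "--drug_index", type=int, default=None,
                        help="index of the drug to leave out from training")
    return parser


# Parse the same arguments as the random-seed example above.
args = build_parser().parse_args(
    ["-config=configs/Example.random_partition.cfg.json", "-i=1234"]
)
print(args.experiment_config_path, args.working_index, args.drug_index)
```

For a leave-one-out run, an extra argument such as `-drug=5` (an arbitrary example index) would be added to the same command line.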
- You should see an experiment folder generated under results, named using the date and experiment_id.
- Under the experiment folder, you will see different models run with different random seeds.
- Under each model folder, you will have:
  - record_eval.csv: log file with loss changes and time used.
  - random_pos.csv: how the data was split (only for random partitions).
  - best.W, best.alpha, best.eps: model parameter snapshots for each training stage.
  - best.test_hat: prediction on the test set, using the best model for each stage.
  - .ckpt files: the final models in TensorFlow-compatible format.
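Since each model folder contains a record_eval.csv, one simple way to collect all finished runs is to walk the results folder and look for that file. The sketch below does this on a throwaway directory; the helper `find_model_folders` and the folder names `Example_RP_experiment`, `seed_000` and `seed_001` are hypothetical, and the real naming scheme may differ.

```python
import os
import tempfile


def find_model_folders(results_root):
    """Return sorted paths of folders under results_root holding record_eval.csv."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(results_root):
        if "record_eval.csv" in filenames:
            found.append(dirpath)
    return sorted(found)


# Build a throwaway layout for the demonstration; all names are made up.
root = tempfile.mkdtemp()
for seed in ("seed_000", "seed_001"):
    model_dir = os.path.join(root, "Example_RP_experiment", seed)
    os.makedirs(model_dir)
    open(os.path.join(model_dir, "record_eval.csv"), "w").close()

for path in find_model_folders(root):
    print(os.path.relpath(path, root))
```

The returned paths can then be used to load each run's log or parameter snapshots for comparison across random seeds.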