|
1 |
| -# TCOSA_NO_POR5 |
| 1 | +# TCOSA - Thermodynamic Cofactor Swapping Analysis |
| 2 | + |
| 3 | +## Introduction |
| 4 | + |
| 5 | +This repository contains all scripts that were created and used for [TCOSA's publication](#the-tcosa-publication). These scripts generate all data shown and discussed in the publication as well as all result figures; the results themselves are also included in this repository. The following sections explain how to install and use this repository once it has been cloned or downloaded directly from GitHub. |
| 6 | + |
| 7 | +## Installation |
| 8 | + |
| 9 | +### Prerequisites |
| 10 | + |
| 11 | +1. The TCOSA package is written in Python 3.8 and is distributed as an Anaconda environment. Therefore, if you have not already done so, you have to install the free Anaconda Python distribution. You can download and install the full version from [here](https://www.anaconda.com/) or, alternatively, install a smaller version called Miniconda from [here](https://docs.conda.io/en/latest/miniconda.html). *Note:* Make sure that Anaconda is registered in your operating system so that your system's console can access it. |
| 12 | +2. In addition to Anaconda, TCOSA is specifically programmed to use the IBM CPLEX solver in version 12.10 or later. With an academic license, you can obtain it for free from [here](https://www.ibm.com/de-de/products/ilog-cplex-optimization-studio). To make IBM CPLEX work with your Python distribution, follow the instructions [here](https://www.ibm.com/docs/en/icos/22.1.0?topic=cplex-setting-up-python-api). |
| 13 | + |
| 14 | +### Installation steps |
| 15 | + |
| 16 | +1. (optional, but recommended) To avoid potential package version conflicts, set a system environment variable called "PYTHONNOUSERSITE" to the value "True". How to do this depends on your computer's operating system: |
| 17 | + |
| 18 | +* Under Windows, search for your system's "environment variables" and add the variable PYTHONNOUSERSITE with the value True in the Windows environment variables settings window. |
| 19 | +* Under Linux and macOS, you can do this with the following console command: |
| 20 | + |
| 21 | +```sh |
| 22 | +export PYTHONNOUSERSITE=True |
| 23 | +``` |
| 24 | + |
| 25 | +2. (only necessary if you have already installed the TCOSA environment and wish to reinstall it) Delete the old TCOSA environment with the following console command: |
| 26 | + |
| 27 | +```sh |
| 28 | +conda env remove -n tcosa |
| 29 | +``` |
| 30 | + |
| 31 | +3. Add the IBM CPLEX cobra channel to Anaconda: |
| 32 | + |
| 33 | +```sh |
| 34 | +conda config --add channels IBMDecisionOptimization |
| 35 | +``` |
| 36 | + |
| 37 | +4. Install the TCOSA Python environment using the following two commands. The console's working directory has to be the main folder of your clone of this repository: |
| 38 | + |
| 39 | +```sh |
| 40 | +conda env create -n tcosa -f environment.yml |
| 41 | +pip install ray |
| 42 | +``` |
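
After the installation steps, a quick sanity check of the environment can save a failed multi-day run. The following is only a minimal sketch and not part of the TCOSA scripts: the helper name `check_environment` is hypothetical, and the module names `cplex`, `pulp` and `ray` are assumed to be the import names of the packages mentioned in this README.

```python
import importlib.util
import os
import sys

# Assumption: these are the import names of the solver-related packages
# mentioned in this README (CPLEX Python API, pulp, ray).
REQUIRED_MODULES = ("cplex", "pulp", "ray")


def check_environment(modules=REQUIRED_MODULES, environ=None):
    """Return a dict of simple pass/fail checks for the current interpreter."""
    environ = os.environ if environ is None else environ
    report = {
        # The README states that TCOSA is written for Python 3.8.
        "python_3_8": sys.version_info[:2] == (3, 8),
        # Python honours PYTHONNOUSERSITE whenever it is set to a non-empty string.
        "user_site_disabled": bool(environ.get("PYTHONNOUSERSITE")),
    }
    for name in modules:
        # find_spec returns None if the top-level module cannot be located.
        report["module_" + name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for check, passed in check_environment().items():
        print(f"{check}: {'ok' if passed else 'MISSING'}")
```

Run it with the interpreter of the activated `tcosa` environment; any line reported as `MISSING` points to the prerequisite or installation step that should be revisited.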
| 43 | + |
| 44 | +### Expected install time |
| 45 | + |
| 46 | +Depending on your computer, this installation procedure typically takes between 5 and 30 minutes. |
| 47 | + |
| 48 | +## Reproduction of TCOSA publication results |
| 49 | + |
| 50 | +### Hardware and software versions used for the publication |
| 51 | + |
| 52 | +For the publication, CPLEX 12.10 was used on a computer cluster node with a 16-core Intel Xeon Silver 3110 CPU as well as 192 GB DDR4 RAM. |
| 53 | + |
| 54 | +### How to reproduce the data |
| 55 | + |
| 56 | +To re-run all analyses and regenerate all figures of the publication, first delete the "cosa" subfolder (which serves as a cache and storage for pre-calculated solutions) and then run "tcosa_full_run.py" with the TCOSA conda environment, i.e.: |
| 57 | + |
| 58 | +```sh |
| 59 | +conda activate tcosa |
| 60 | +python tcosa_full_run.py |
| 61 | +``` |
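
Since a full re-run takes days, you may prefer to set the existing "cosa" folder aside instead of deleting it, so that the shipped results stay available for comparison. The following is a minimal sketch of that idea and not part of the TCOSA scripts; the helper name `set_aside_cache` and the backup naming scheme are hypothetical.

```python
import shutil
import time
from pathlib import Path


def set_aside_cache(repo_dir=".", cache_name="cosa"):
    """Rename the cache folder so that a full re-run starts from scratch.

    Returns the backup path, or None if there was no cache folder.
    """
    cache = Path(repo_dir) / cache_name
    if not cache.is_dir():
        return None
    # Hypothetical naming scheme: "<cache_name>_backup_<unix timestamp>".
    stamp = int(time.time())
    backup = cache.with_name(f"{cache_name}_backup_{stamp}")
    # Never move into an already existing backup folder.
    while backup.exists():
        stamp += 1
        backup = cache.with_name(f"{cache_name}_backup_{stamp}")
    shutil.move(str(cache), str(backup))
    return backup
```

Calling `set_aside_cache()` from the main folder of the repository has the same effect on the re-run as deleting "cosa", because "tcosa_full_run.py" then no longer finds any cached solutions.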
| 62 | + |
| 63 | +### Reproduction output |
| 64 | + |
| 65 | +In the end, you should get a "cosa" folder which contains the same data as the current one. Deviations in the data should only occur if you use a different CPLEX version, or much slower or faster hardware, which might introduce or resolve some CPLEX timeouts that abort computations. |
| 66 | + |
| 67 | +### Expected run time |
| 68 | + |
| 69 | +With the settings given in the TCOSA publication, all calculations can take 6 days or more on a typical household computer. |
| 70 | + |
| 71 | +The scripts used and the structure of the generated results are explained in the next section. |
| 72 | + |
| 73 | +## Structure of repository |
| 74 | + |
| 75 | +### Folders |
| 76 | + |
| 77 | +The subfolders of this repository have the following meaning: |
| 78 | + |
| 79 | +* "cosa": This folder contains all results of the actual TCOSA calculations. In the folder itself, you can find the lists of original NAD and NADP reactions as JSON. In addition, you can find the ready-to-use iML_TCOSA and iML_TCOSA models in the SBML format as well as in a pickle format. The names of the subfolders all start with "results_", followed by the tested condition, i.e., either "_aerobic" or "_anaerobic", and by "_expanded" if the expanded model was used. All these results folders include CSV tables with all SubMDF or OptMDF (called "mmdf") results and the random distributions used, which are reproducible through the use of a seed. The folder named "figures", which is also included, contains all result figures generated for the publication. The other folder, "runs", contains zipped JSON files with the full flux distribution and other [OptMDFpathway](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006492) variable results for each calculated run. Regarding the suffixes, "FREECONC" stands for the standard OptMDFpathway concentrations and "VIVOCONC" for the concentration ranges adapted from [(Bennett et al., 2009)](https://www.nature.com/articles/nchembio.186). |
| 80 | +* "resources": This folder includes the [eQuilibrator](https://gitlab.com/equilibrator/equilibrator-api)-calculated ΔG° values as well as [*i*ML1515](https://pubmed.ncbi.nlm.nih.gov/29020004/) which was downloaded from [its corresponding BiGG website](http://bigg.ucsd.edu/models/iML1515). Furthermore, the raw data and results of the preparation of *in vivo* concentrations from [(Bennett et al., 2009)](https://www.nature.com/articles/nchembio.186) are also included. |
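
To inspect the zipped JSON files of individual runs without unpacking them by hand, a small reader along the following lines may help. This is only a sketch under assumptions: it presumes that the files are standard ZIP archives containing one or more `.json` members, and the helper name `load_zipped_json` is hypothetical. The internal layout of the JSON data is not described here, so the function simply returns the parsed content.

```python
import json
import zipfile


def load_zipped_json(zip_path):
    """Return {member name: parsed JSON} for every .json member of a ZIP archive."""
    results = {}
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            if member.lower().endswith(".json"):
                # json.load accepts the binary file object returned by archive.open.
                with archive.open(member) as handle:
                    results[member] = json.load(handle)
    return results
```

The returned dictionary can then be explored interactively, e.g. by listing its keys first, before relying on any particular field.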
| 81 | + |
| 82 | +### Scripts |
| 83 | + |
| 84 | +The category of a script depends on its prefix: |
| 85 | + |
| 86 | +* "model_(...).py": These scripts include the loading and conversion of iML1515 into a more usable format, but still without its TCOSA additions. In addition, the [eQuilibrator](https://gitlab.com/equilibrator/equilibrator-api) ΔG° calculations and the preparation of *in vivo* concentrations from [(Bennett et al., 2009)](https://www.nature.com/articles/nchembio.186) are also included. |
| 87 | +* "cosa_(...).py": These are all scripts - except for the full run file "tcosa_full_run.py" - which directly include the TCOSA model changes, the TCOSA analyses and the generation of the publication figures. They do not include basic methods such as OptMDFpathway. |
| 88 | +* "test_(...).py": This single script includes a test of the OptMDFpathway routine used herein with the [ASTHERISC](https://github.com/klamt-lab/astheriscPackage) toy model. |
| 89 | +* None of these prefixes: These scripts include basic methods such as FBA, a Python implementation of [OptMDFpathway](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006492) with all extensions used in TCOSA's publication, and the CPLEX interface, all of which use [pulp](https://github.com/coin-or/pulp). |
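
The prefix convention above can be expressed as a small lookup, for instance to get an overview of the scripts in a clone. This is a minimal sketch and not part of the repository; the function name `script_category` and the category labels are hypothetical and chosen only for illustration.

```python
# Hypothetical labels for the prefix convention described in this README.
PREFIX_CATEGORIES = (
    ("model_", "model preparation"),
    ("cosa_", "TCOSA analysis"),
    ("test_", "test"),
)


def script_category(filename):
    """Map a script file name to its category according to its prefix."""
    if filename == "tcosa_full_run.py":
        # The full run file is the documented exception to the prefix rule.
        return "full run"
    for prefix, category in PREFIX_CATEGORIES:
        if filename.startswith(prefix):
            return category
    # Scripts without one of the prefixes contain the basic methods.
    return "basic method"
```

Combined with `pathlib.Path(".").glob("*.py")`, this yields a quick grouped listing of the scripts in the main folder.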
| 90 | + |
| 91 | +## The TCOSA publication |
| 92 | + |
| 93 | +* Bekiaris & Klamt (2023), *in submission*. |