The directory java contains source files for the AnyBURL algorithm with the original license preserved. Some changes have been applied to the source to address bugs and other small issues.
The Java code must be compiled before running the example scripts here. Before compiling, ensure that Java and Apache Maven are installed (we recommend OpenJDK v14 and Apache Maven v3.9.4).
To compile the Java code:
$ make build-java
We recommend installing the code in a Python virtualenv. Once you have a virtualenv set up, the code can be installed with
pip install -e '.[dev]'
and tests can be run with tox
tox
and linters can be run with the shortcut
make lint
Train AnyBURL using the graph provided in data/triples.txt
anyburl-train data
Results are written to a results directory.
Generate rules from the trained AnyBURL artefacts in results
anyburl-predict data
Further results are written to the results directory.
Run this command to create a list of evidence chains for "Parkinson disease" treatments (defined in data/parkinson_disease_predicted_treatments.txt). The output is written to evidence-chains.jsonl.
healx-chains \
data \
results \
results/predict-1000 \
results/predict-explanation \
--predictions-filter-file data/parkinson-disease-filter/predictions.txt \
--explanations-filter-file data/parkinson-disease-filter/prioritised-edge-types.txt
Results are written to an evidence chains file, results/evidence-chains.jsonl, in the JSONL format. Each line contains a generated "chain" prediction with the following example structure:
{
  "prediction": "methixene_COMPOUND",
  "prediction_score": 0.08924107199497017,
  "start_node": "methixene_COMPOUND",
  "end_node": "parkinson_disease_DISEASE",
  "metapath": [
    {
      "label": "COMPOUND_inhibits_GENE",
      "reversed": false
    },
    {
      "label": "DISEASE_associates_GENE",
      "reversed": true
    }
  ],
  "path_score": 0.005949453701297636,
  "path": [
    "methixene_COMPOUND",
    "htr2c_GENE",
    "parkinson_disease_DISEASE"
  ]
}
We can expect to find thousands of chains in this file.
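Because the file holds thousands of chains, it can help to load it programmatically. The following is a minimal sketch, not part of this package, that reads JSONL lines, groups chains by their "prediction" field, and renders each chain as text. The function names (load_chains, group_by_prediction, render_chain) are hypothetical, and the sketch assumes that "reversed": true means the edge is traversed against the direction named in its label, and that "metapath" has exactly one entry per hop in "path".

```python
import json
from collections import defaultdict

# The example record from above, collapsed onto one line as it would appear in a JSONL file.
EXAMPLE_LINE = (
    '{"prediction": "methixene_COMPOUND", "prediction_score": 0.08924107199497017, '
    '"start_node": "methixene_COMPOUND", "end_node": "parkinson_disease_DISEASE", '
    '"metapath": [{"label": "COMPOUND_inhibits_GENE", "reversed": false}, '
    '{"label": "DISEASE_associates_GENE", "reversed": true}], '
    '"path_score": 0.005949453701297636, '
    '"path": ["methixene_COMPOUND", "htr2c_GENE", "parkinson_disease_DISEASE"]}'
)


def load_chains(lines):
    """Yield one parsed chain per non-empty line (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def group_by_prediction(chains):
    """Group chains by 'prediction', sorted by descending 'path_score' within each group."""
    grouped = defaultdict(list)
    for chain in chains:
        grouped[chain["prediction"]].append(chain)
    for group in grouped.values():
        group.sort(key=lambda c: c["path_score"], reverse=True)
    return dict(grouped)


def render_chain(chain):
    """Render a chain as text; reversed edges are drawn with a left-pointing arrow.

    Assumption: 'reversed' marks an edge walked against its label's direction.
    """
    nodes = chain["path"]
    parts = [nodes[0]]
    for edge, node in zip(chain["metapath"], nodes[1:]):
        if edge["reversed"]:
            parts.append(f"<-[{edge['label']}]-")
        else:
            parts.append(f"-[{edge['label']}]->")
        parts.append(node)
    return " ".join(parts)


if __name__ == "__main__":
    for chain in load_chains([EXAMPLE_LINE]):
        print(render_chain(chain))
```

To run this against real output, pass an open file handle for results/evidence-chains.jsonl to load_chains in place of the single example line.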
Run this command (after creating the evidence chains file) to filter chains by gene and pathway for "Parkinson disease".
healx-filter \
results/evidence-chains.jsonl \
data/parkinson-disease-filter/genes.txt \
data/parkinson-disease-filter/pathways.txt \
data/parkinson-disease-filter/predictions-short-list.txt \
data/parkinson-disease-filter/prioritised-edge-types.txt \
--filtered-evidence-chains-file filtered-evidence-chains.txt
This produces a filtered evidence chains file in text format, filtered-evidence-chains.txt, with one text chain per line. An example line from this file is:
AMANTADINE inhibits CHRNA3 participates NEUROACTIVE_LIGAND-RECEPTOR_INTERACTION involves CABERGOLINE in_trial_for PARKINSON_DISEASE
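A text chain can be split back into its individual steps. The sketch below is illustrative only: parse_text_chain is a hypothetical helper, and it assumes that tokens are separated by whitespace, that node and relation names contain no spaces, and that tokens alternate node, relation, node, as in the example line.

```python
EXAMPLE_CHAIN = (
    "AMANTADINE inhibits CHRNA3 participates "
    "NEUROACTIVE_LIGAND-RECEPTOR_INTERACTION involves "
    "CABERGOLINE in_trial_for PARKINSON_DISEASE"
)


def parse_text_chain(line):
    """Split a text chain into (source, relation, target) steps.

    Assumption: tokens alternate node, relation, node, ..., so a valid
    chain has an odd number of tokens and at least three of them.
    """
    tokens = line.split()
    if len(tokens) < 3 or len(tokens) % 2 == 0:
        raise ValueError(f"expected alternating node/relation tokens, got {len(tokens)}")
    nodes = tokens[0::2]      # tokens at even positions are nodes
    relations = tokens[1::2]  # tokens at odd positions are relations
    return list(zip(nodes, relations, nodes[1:]))


if __name__ == "__main__":
    for source, relation, target in parse_text_chain(EXAMPLE_CHAIN):
        print(f"{source} --{relation}--> {target}")
```

Rejecting lines with an even token count catches names that contain spaces instead of silently misaligning nodes and relations.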