11 changes: 11 additions & 0 deletions docker-wrappers/TieDIE/Dockerfile
@@ -0,0 +1,11 @@
FROM python:2.7.15

WORKDIR /TieDIE

COPY requirements.txt .
RUN pip install -r requirements.txt && \
commit=c64ab5c4b4e0f6cfac4b5151c7d9f1d7ea331e65 && \
wget https://github.com/Reed-CompBio/TieDIE/tarball/$commit && \
tar -zxvf $commit && \
rm $commit && \
mv Reed-CompBio-TieDIE-*/* .
18 changes: 18 additions & 0 deletions docker-wrappers/TieDIE/README.md
@@ -0,0 +1,18 @@
# TieDIE Docker image

A Docker image for [TieDIE](https://github.com/Reed-CompBio/TieDIE) that is available on [DockerHub](https://hub.docker.com/r/reedcompbio/tiedie).

To create the Docker image, run:
```
docker build -t reedcompbio/tiedie -f Dockerfile .
```
from this directory.

## Testing
Test code is located in `test/TieDIE`.
The `input` subdirectory contains test files such as `pathway1.txt`, `source1.txt`, and `target1.txt`.
The Docker wrapper can be tested with `pytest`, or on its own with `pytest -k test_tiedie.py`.
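For reference, the wrapper's `generate_inputs` writes sources and targets as headerless tab-separated files with a node ID, a prize, and a sign, matching the test files above. A minimal sketch of that format using pandas (the node names and prizes here are made up):

```python
import io

import pandas as pd

# Hypothetical nodes with prizes, mirroring the frame generate_inputs builds
nodes = pd.DataFrame({"NODEID": ["A", "B"], "prize": [1.0, 2.0]})
nodes["sign"] = "+"

# Write the headerless tab-separated node file format TieDIE expects
buf = io.StringIO()
nodes.to_csv(buf, index=False, sep="\t", columns=["NODEID", "prize", "sign"], header=False)
print(buf.getvalue())
```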

## Versions

- `v1`: Initial version
3 changes: 3 additions & 0 deletions docker-wrappers/TieDIE/requirements.txt
@@ -0,0 +1,3 @@
networkx==1.11
numpy==1.11.3
scipy==0.18.1
16 changes: 16 additions & 0 deletions docs/prms/tiedie.rst
@@ -0,0 +1,16 @@
TieDIE
======

TieDIE (Tied Diffusion Through Interacting Events) is a pathway reconstruction algorithm that connects a set of source nodes to a set of target nodes by running network diffusion from both sets and keeping the intermediate subnetwork where the two diffusion processes overlap.
See the `original paper <https://doi.org/10.1093/bioinformatics/btt471>`_ and SPRAS's fork of the codebase:
https://github.com/Reed-CompBio/TieDIE.

TieDIE takes several optional parameters:

* s: (float, default 1.0) Network size control factor
* d_expr: List of significantly differentially expressed genes, along with log-FC or FC values (e.g. from edgeR for RNA-Seq or SAM for microarray data), generated by a sample dichotomy of interest
* a: (int) Linker cutoff (overrides the size control factor)
* c: (int, default 3) Search depth for causal paths
* p: (int, default 1000) Number of random permutations performed for significance analysis
* pagerank: (bool, default False) Use Personalized PageRank for diffusion
* all_paths: (bool, default False) Use all paths instead of only causal paths
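As a sketch of how these parameters reach TieDIE, the SPRAS wrapper assembles them into command-line flags roughly as follows (flag names mirror ``spras/tiedie.py``; the values are illustrative defaults):

```python
# Illustrative mapping from wrapper parameters to TieDIE CLI flags
s, c, p, pagerank, all_paths = 1.0, 3, 1000, False, False

command = [
    "python", "/TieDIE/bin/tiedie",
    "--size", str(s),        # network size control factor
    "--depth", str(c),       # search depth for causal paths
    "--permute", str(p),     # permutations for significance analysis
    "--pagerank", "True" if pagerank else "False",
    "--all_paths", "True" if all_paths else "False",
]
print(" ".join(command))
```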
2 changes: 2 additions & 0 deletions spras/runner.py
@@ -14,6 +14,7 @@
from spras.responsenet import ResponseNet
from spras.rwr import RWR
from spras.strwr import ST_RWR
from spras.tiedie import TieDIE

algorithms: dict[str, type[PRM]] = {
"allpairs": AllPairs,
@@ -27,6 +28,7 @@
"responsenet": ResponseNet,
"rwr": RWR,
"strwr": ST_RWR,
"tiedie": TieDIE
}

def get_algorithm(algorithm: str) -> type[PRM]:
154 changes: 154 additions & 0 deletions spras/tiedie.py
@@ -0,0 +1,154 @@
from pathlib import Path

import pandas as pd

from spras.config.container_schema import ProcessedContainerSettings
from spras.containers import prepare_volume, run_container_and_log
from spras.interactome import (
convert_directed_to_undirected,
reinsert_direction_col_undirected,
)
from spras.prm import PRM
from spras.util import add_rank_column, duplicate_edges, raw_pathway_df

__all__ = ["TieDIE"]

class TieDIE(PRM):
# TieDIE requires an edge file, a source set (with prizes), and a target set (with prizes)
required_inputs = ["edges", "sources", "targets"]
dois = ["10.1093/bioinformatics/btt471"]

@staticmethod
def generate_inputs(data, filename_map):
"""
Access fields from the dataset and write the required input files
@param data: dataset
@param filename_map: a dict mapping file types in the required_inputs to the filename for that type
"""
# Ensure the required inputs are in the filename_map
for input_type in TieDIE.required_inputs:
if input_type not in filename_map:
raise ValueError(f"{input_type} filename is missing")

# Write the source and target node files
for node_type in ["sources", "targets"]:
nodes = data.get_node_columns([node_type])
if data.contains_node_columns("prize"):
# Use the node prizes provided in the dataset
node_df = data.get_node_columns(["prize"])
nodes = pd.merge(nodes, node_df, on="NODEID")
else:
# No prizes are available, so give every source and target a prize of 1
nodes["prize"] = 1.0
nodes["sign"] = "+"
# Write the nodes of this type without a header
nodes.to_csv(filename_map[node_type], index=False, sep="\t", columns=["NODEID", "prize", "sign"], header=False)

# create the network of edges
edges = data.get_interactome()

edges = convert_directed_to_undirected(edges)

edges["type"] = "-a>"
# Drop the weight column because TieDIE's network format does not include edge weights
edges = edges.drop(columns=["Weight"])
# Write the edge file with interactor1, interaction type, and interactor2 columns
edges.to_csv(filename_map["edges"], sep="\t", index=False, columns=["Interactor1", "type", "Interactor2"], header=False)

# Skips the parameter validation step
@staticmethod
def run(edges=None, sources=None, targets=None, output_file=None, s: float = 1.0, c: int = 3, p: int = 1000, pagerank: bool = False, all_paths: bool = False, container_settings=None):
"""
Run TieDIE with Docker
@param edges: input edges file (required)
@param sources: input source node file (required)
@param targets: input target node file (required)
@param output_file: path to the output pathway file (required)
@param s: network size control factor (optional, default 1.0)
@param c: search depth for causal paths (optional, default 3)
@param p: number of random permutations performed for significance analysis (optional, default 1000)
@param pagerank: use Personalized PageRank for diffusion (optional, default False)
@param all_paths: use all paths instead of only causal paths (optional, default False)
@param container_settings: container framework settings, such as whether to run with Docker or Singularity
"""

if not container_settings: container_settings = ProcessedContainerSettings()
if not edges or not sources or not targets or not output_file:
raise ValueError("Required TieDIE arguments are missing")

work_dir = "/spras"

# Each volume is a tuple (src, dest)
volumes = list()

bind_path, edges_file = prepare_volume(edges, work_dir, container_settings)
volumes.append(bind_path)

bind_path, sources_file = prepare_volume(sources, work_dir, container_settings)
volumes.append(bind_path)

bind_path, targets_file = prepare_volume(targets, work_dir, container_settings)
volumes.append(bind_path)

out_dir = Path(output_file).parent

# TieDIE requires that the output directory exist
out_dir.mkdir(parents=True, exist_ok=True)
bind_path, mapped_out_dir = prepare_volume(str(out_dir), work_dir, container_settings)
volumes.append(bind_path) # Use posix path inside the container

command = [
"python",
"/TieDIE/bin/tiedie",
"--up_heats", sources_file,
"--down_heats", targets_file,
"--network", edges_file,
"--size", str(s),
"--depth", str(c),
"--permute", str(p),
"--pagerank", "True" if pagerank else "False",
"--all_paths", "True" if all_paths else "False",
"--output_folder", mapped_out_dir,
]

print('Running TieDIE with arguments: {}'.format(' '.join(command)), flush=True)

container_suffix = 'tiedie:v1'
run_container_and_log('TieDIE',
container_suffix,
command,
volumes,
work_dir,
out_dir,
container_settings)

# Rename the primary output file to match the desired output filename
output = Path(out_dir, "tiedie.sif")
target = Path(output_file)
output.rename(target)

@staticmethod
def parse_output(raw_pathway_file, standardized_pathway_file, params):
"""
Convert a predicted pathway into the universal format
@param raw_pathway_file: pathway file produced by an algorithm's run function
@param standardized_pathway_file: the same pathway written in the universal format
"""
df = raw_pathway_df(raw_pathway_file, sep='\t', header=None)
if not df.empty:
# get rid of the relationship (second) column (since all relationships are the same "-a>")
df = df.drop(df.columns[1], axis=1)
df = add_rank_column(df)
df = reinsert_direction_col_undirected(df)
df.columns = ['Node1', 'Node2', 'Rank', 'Direction']
df, has_duplicates = duplicate_edges(df)
if has_duplicates:
print(f"Duplicate edges were removed from {raw_pathway_file}")
df.to_csv(standardized_pathway_file, header=True, index=False, sep='\t')
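The transformation performed by `parse_output` can be sketched on a toy SIF-style frame; `add_rank_column` and `reinsert_direction_col_undirected` are approximated inline here with a constant rank and direction:

```python
import pandas as pd

# Toy stand-in for a parsed tiedie.sif file: interactor, relation, interactor
df = pd.DataFrame([["A", "-a>", "B"], ["B", "-a>", "C"]])
df = df.drop(df.columns[1], axis=1)  # the relation column is constant, drop it
df["Rank"] = 1                       # stand-in for add_rank_column
df["Direction"] = "U"                # stand-in for reinsert_direction_col_undirected
df.columns = ["Node1", "Node2", "Rank", "Direction"]
result = df.to_csv(sep="\t", index=False)
print(result)
```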
Empty file added test/TieDIE/__init__.py
11 changes: 11 additions & 0 deletions test/TieDIE/input/pathway1.txt
@@ -0,0 +1,11 @@
G -a> N
G -a> L
B -a> E
C -a> G
E -a> F
B -a> F
D -a> G
F -a> G
K -a> G
A -a> E
E -a> G
6 changes: 6 additions & 0 deletions test/TieDIE/input/pathway2.txt
@@ -0,0 +1,6 @@
A -a> D
B -a> D
C -a> D
D -a> E
D -a> F
D -a> G
4 changes: 4 additions & 0 deletions test/TieDIE/input/source1.txt
@@ -0,0 +1,4 @@
B 1 +
K 1 +
D 1 +
A 1 +
3 changes: 3 additions & 0 deletions test/TieDIE/input/source2.txt
@@ -0,0 +1,3 @@
A 2 +
B 9 +
C 4 +
4 changes: 4 additions & 0 deletions test/TieDIE/input/target1.txt
@@ -0,0 +1,4 @@
L 1 +
F 1 +
C 1 +
N 1 +
3 changes: 3 additions & 0 deletions test/TieDIE/input/target2.txt
@@ -0,0 +1,3 @@
E 3 +
F 1 +
G 2 +
86 changes: 86 additions & 0 deletions test/TieDIE/test_tiedie.py
@@ -0,0 +1,86 @@
import shutil
from pathlib import Path

import pytest

from spras.config.container_schema import ContainerFramework, ProcessedContainerSettings
from spras.tiedie import TieDIE

TEST_DIR = Path('test', 'TieDIE')
OUT_FILES = TEST_DIR / 'output' / 'output1' / 'tiedie_pathway.txt'
OUT_FILES_1 = TEST_DIR / 'output' / 'output2' / 'tiedie_pathway_alternative.txt'

class TestTieDIE:
"""
Run the TieDIE algorithm on the example input files
"""

def test_tiedie_required(self):
out_path = Path(OUT_FILES)
out_path.unlink(missing_ok=True)
# Only include required arguments
TieDIE.run(sources=TEST_DIR / 'input' / 'source1.txt',
targets=TEST_DIR / 'input' / 'target1.txt',
edges=TEST_DIR / 'input' / 'pathway1.txt',
output_file=OUT_FILES)
assert out_path.exists()

def test_tiedie_alternative_graph(self):
out_path = Path(OUT_FILES_1)
out_path.unlink(missing_ok=True)
TieDIE.run(sources=TEST_DIR / 'input' / 'source2.txt',
targets=TEST_DIR / 'input' / 'target2.txt',
edges=TEST_DIR / 'input' / 'pathway2.txt',
output_file=OUT_FILES_1)
assert out_path.exists()

def test_tiedie_some_optional(self):
out_path = Path(OUT_FILES)
out_path.unlink(missing_ok=True)
# Include some optional arguments
TieDIE.run(sources=TEST_DIR / 'input' / 'source1.txt',
targets=TEST_DIR / 'input' / 'target1.txt',
edges=TEST_DIR / 'input' / 'pathway1.txt',
output_file=OUT_FILES,
s=1.1,
p=2000,
pagerank = True)
assert out_path.exists()

def test_tiedie_all_optional(self):
out_path = Path(OUT_FILES)
out_path.unlink(missing_ok=True)
# Include all optional arguments
TieDIE.run(sources=TEST_DIR / 'input' / 'source1.txt',
targets=TEST_DIR / 'input' / 'target1.txt',
edges=TEST_DIR / 'input' / 'pathway1.txt',
output_file=OUT_FILES,
s=1.1,
c=4,
p=2000,
pagerank=True,
all_paths=True)
assert out_path.exists()

def test_tiedie_missing(self):
# Test the expected error is raised when required arguments are missing
with pytest.raises(ValueError):
# No edges file
TieDIE.run(sources=TEST_DIR / 'input' / 'source1.txt',
targets=TEST_DIR / 'input' / 'target1.txt',
output_file=OUT_FILES)

@pytest.mark.skipif(not shutil.which('singularity'), reason='Singularity not found on system')
def test_tiedie_singularity(self):
out_path = Path(OUT_FILES)
out_path.unlink(missing_ok=True)
# Only include required arguments and run with Singularity
TieDIE.run(sources=TEST_DIR / 'input' / 'source1.txt',
targets=TEST_DIR / 'input' / 'target1.txt',
edges=TEST_DIR / 'input' / 'pathway1.txt',
output_file=OUT_FILES,
s=1.1,
p=2000,
pagerank=True,
container_settings=ProcessedContainerSettings(framework=ContainerFramework.singularity))
assert out_path.exists()
2 changes: 2 additions & 0 deletions test/generate-inputs/expected/tiedie-edges-expected.txt
@@ -0,0 +1,2 @@
test_A -a> B
B -a> C
3 changes: 2 additions & 1 deletion test/generate-inputs/test_generate_inputs.py
@@ -20,7 +20,8 @@
'bowtiebuilder': 'edges',
'strwr': 'network',
'rwr': 'network',
'responsenet': 'edges'
'responsenet': 'edges',
'tiedie': 'edges',
}


10 changes: 10 additions & 0 deletions test/parse-outputs/expected/tiedie-pathway-expected.txt
@@ -0,0 +1,10 @@
Node1 Node2 Rank Direction
A E 1 U
B E 1 U
B F 1 U
D G 1 U
E F 1 U
E G 1 U
G L 1 U
G N 1 U
G K 1 U
20 changes: 20 additions & 0 deletions test/parse-outputs/input/duplicate-edges/tiedie-raw-pathway.txt
@@ -0,0 +1,20 @@
F -a> G
E -a> G
B -a> E
E -a> F
A -a> E
G -a> L
K -a> G
B -a> F
D -a> G
G -a> N
F -a> G
E -a> G
B -a> E
E -a> F
A -a> E
G -a> L
K -a> G
B -a> F
D -a> G
G -a> N
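This input repeats every edge exactly once, so it exercises the duplicate-edge handling in `parse_output`. A minimal sketch of that check, approximating the `duplicate_edges` helper with pandas `drop_duplicates` (the frame below is a toy subset of the edges above):

```python
import pandas as pd

# Toy frame repeating each parsed edge, like the raw pathway above
df = pd.DataFrame({"Node1": ["F", "E", "F", "E"], "Node2": ["G", "G", "G", "G"]})
deduped = df.drop_duplicates()
has_duplicates = len(deduped) < len(df)
print(has_duplicates, len(deduped))
```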