Snakemake Exome Workflow

Snakemake workflow for exome processing.

The workflow is split into 5 separate Snakemake pipelines in order to effectively parallelize the two key steps that have long runtimes and rather complex execution (MuTect2 and VCFtoMAF).

Please follow the steps below to run the workflow (NOTE: first change the hard-coded file paths in the Snakefiles, e.g. /path/to/S04380110_Covered.headless.bed):

1. git clone this repo.
2. cd into the ExomeProcess dir.
3. cd into each step directory in order (step 1 - 5) and submit the respective step to H4H, for example: `sbatch snake_submit_step1.sh`

(NOTE: in step 2, run snake_submit_step2.sh before snake_submit_step2merge.sh)
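The submission order above could be scripted roughly as follows. This is only a sketch under stated assumptions: it lists just the three submit scripts named in this README (the script names for steps 3 to 5 are not given here and would need to be added by hand), and it assumes each submit job stays alive until its Snakemake run has finished, so that Slurm's `--dependency=afterok` can be used to chain the steps. The helper names `build_command` and `submit_all` are hypothetical.

```python
import subprocess

# (step directory, submit script) pairs, in the order they must run.
# Only scripts named in this README are listed; extend for steps 3-5.
STEPS = [
    ("step1", "snake_submit_step1.sh"),
    ("step2", "snake_submit_step2.sh"),
    ("step2", "snake_submit_step2merge.sh"),
]


def build_command(script, after_job=None):
    """Return the sbatch argument list, optionally chained to a prior job."""
    cmd = ["sbatch", "--parsable"]  # --parsable prints just the job ID
    if after_job is not None:
        # Assumption: the previous job only exits once its step is complete.
        cmd.append(f"--dependency=afterok:{after_job}")
    cmd.append(script)
    return cmd


def submit_all(steps=STEPS, dry_run=True):
    """Submit each script from inside its step directory, in order.

    With dry_run=True nothing is submitted; the planned commands are
    returned with a placeholder in place of the real job ID.
    """
    plan = []
    previous = None
    for directory, script in steps:
        cmd = build_command(script, previous)
        plan.append((directory, cmd))
        if dry_run:
            previous = f"<jobid of {script}>"
        else:
            result = subprocess.run(
                cmd, cwd=directory, check=True, capture_output=True, text=True
            )
            # --parsable output is "jobid" or "jobid;cluster"
            previous = result.stdout.strip().split(";")[0]
    return plan


if __name__ == "__main__":
    for directory, cmd in submit_all(dry_run=True):
        print(f"cd {directory} && {' '.join(cmd)}")
```

Running the file as is only prints the planned commands. Calling `submit_all(dry_run=False)` from the ExomeProcess directory would actually submit them, so check the printed plan first. If the submit jobs return before their Snakemake run is done, submit each step manually once the previous one has finished instead of chaining.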

Below are details of the processes run in each step:

Step 1: (executes bwa alignment, picard MarkDuplicates, and GATK preprocessing steps)
Step 2: (executes MuTect2 in a parallelized manner on the split BED file - approx. 20 min runtime per sample)
Step 2 (merge): (merges the parallelized MuTect2 outputs per sample)
Step 3: (executes MuTect1, MuTect2 filtering, Varscan (CN, Somatic), Strelka, Sequenza, VCFIntersect)
Step 4: (executes hg19tohg38LiftOver)
Step 5: (executes VCFtoMAF. Make sure you run all .sh files for both hg19 and hg38)
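To illustrate the idea behind step 2, the sketch below splits the records of a BED file into contiguous, near-equal chunks that could each be handed to a separate MuTect2 job. It is not the splitting code used in this repo (that scheme is not described here); the function name `split_bed` and the chunk count are assumptions for illustration, and the input is assumed to be a headerless BED file.

```python
def split_bed(lines, n_chunks):
    """Split BED records into at most n_chunks contiguous chunks.

    Blank lines are dropped. Chunk sizes differ by at most one record,
    and empty chunks are omitted when there are fewer records than chunks.
    """
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    records = [ln for ln in lines if ln.strip()]
    size, extra = divmod(len(records), n_chunks)
    chunks = []
    start = 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional record each.
        end = start + size + (1 if i < extra else 0)
        if end > start:
            chunks.append(records[start:end])
        start = end
    return chunks


if __name__ == "__main__":
    # Toy intervals only; a real run would read the capture BED file.
    example = [
        "chr1\t100\t200\n",
        "chr1\t300\t400\n",
        "chr2\t100\t200\n",
        "chr2\t500\t600\n",
        "chr3\t100\t200\n",
    ]
    for index, chunk in enumerate(split_bed(example, 2)):
        print(f"chunk {index}: {len(chunk)} intervals")
```

Contiguous chunks keep intervals in their original order, so the per-chunk outputs can be concatenated in chunk order when they are merged per sample.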

OncoKb-Annotator was run on all MAFs, but not through Snakemake, as it required Samwise with internet access. Jobs had to be parallelized on Samwise with *screen*, whose jobs Snakemake cannot track or validate. The script can be found under `oncokb/run_oncokb.sh`.

