anthfm/ExomeProcess

Snakemake Exome Workflow

Snakemake workflow for exome processing.

The workflow is split into 5 separate Snakemake pipelines so that two long-running steps with rather complex execution (MuTect2 and VCFtoMAF) can be parallelized effectively.

Please follow the steps below to run the workflow. NOTE: make sure you change the file paths in the Snakefiles to match your environment (e.g. /path/to/S04380110_Covered.headless.bed).

  1. git clone this repo.
  2. cd into the ExomeProcess directory.
  3. cd into each step directory in order (steps 1 to 5) and submit the respective step to H4H, for example: sbatch snake_submit_step1.sh

(NOTE: in step 2, run snake_submit_step2.sh before snake_submit_step2merge.sh)
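
The ordering constraint in step 2 could be enforced with a Slurm job dependency rather than by waiting manually. The following is only a sketch under stated assumptions: the directory argument is hypothetical (this README does not name the step directories), and the dependency is only meaningful if the job started by snake_submit_step2.sh keeps running until all of its Snakemake work has finished.

```shell
#!/usr/bin/env bash
# Sketch: submit step 2, then queue the merge so it starts only if step 2 succeeds.
# ASSUMPTION: the step directory name passed in is hypothetical.
# ASSUMPTION: the step 2 submit job stays alive until Snakemake completes.
set -eu

# Override SBATCH (e.g. SBATCH=echo) to try this without a Slurm cluster.
SBATCH="${SBATCH:-sbatch}"

submit_step2_then_merge() {
    local step_dir="$1"
    local jobid
    # --parsable makes sbatch print only the job ID (optionally "id;cluster").
    jobid=$(cd "$step_dir" && "$SBATCH" --parsable snake_submit_step2.sh)
    jobid="${jobid%%;*}"
    # afterok: start the merge only once the step 2 job has exited successfully.
    (cd "$step_dir" && "$SBATCH" --dependency="afterok:${jobid}" snake_submit_step2merge.sh)
}
```

It would be called as `submit_step2_then_merge path/to/step2`, where the path is whatever the step 2 directory is called in your checkout.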

Below are details of the processes run in each step:

Step 1: executes bwa alignment, picard MarkDuplicates, and GATK preprocessing steps
Step 2: executes MuTect2 in parallel on a split BED file (approx. 20 min runtime per sample)
Step 2 (merge): merges the parallelized MuTect2 outputs per sample
Step 3: executes MuTect1, MuTect2 filtering, Varscan (CN, Somatic), Strelka, Sequenza, VCFIntersect
Step 4: executes hg19tohg38LiftOver
Step 5: executes VCFtoMAF (make sure you run all .sh files for both hg19 and hg38)
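
To illustrate the idea behind step 2, the sketch below shows one way a headless BED file could be divided into chunks so that a caller can run on each chunk independently. It is not taken from this repo's Snakefiles: the function name `split_bed`, the chunk file naming, and the round-robin assignment of intervals are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: split a headless BED file into N chunk files, round-robin by line.
# ASSUMPTION: split_bed and the chunk_XX.bed naming are hypothetical,
# not the scheme used by this workflow.
set -eu

split_bed() {
    local bed="$1"      # input BED file without a header
    local n="$2"        # number of chunks to produce
    local outdir="$3"   # directory that receives chunk_00.bed, chunk_01.bed, ...
    mkdir -p "$outdir"
    # Line k (1-based) goes to chunk (k - 1) mod n; order within a chunk is preserved.
    awk -v n="$n" -v dir="$outdir" '{
        f = sprintf("%s/chunk_%02d.bed", dir, (NR - 1) % n)
        print > f
    }' "$bed"
}
```

Each chunk could then be passed to a separate job, with the per-chunk outputs combined afterwards, which is the role the merge step plays in this workflow.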

OncoKb-Annotator was run on all MAFs, but not through Snakemake, because it required Samwise with internet access. Jobs had to be parallelized on Samwise with *screen*, which Snakemake cannot track for validity. The script can be found under `oncokb/run_oncokb.sh`.
