Taxonomic assignment with GTDBtk
Francisco Zorrilla edited this page Mar 22, 2021 · 3 revisions
GTDB-Tk is implemented in the Snakefile as follows:
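rule GTDBtk:
    input:
        f'{config["path"]["root"]}/dna_bins/{{IDs}}'
    output:
        directory(f'{config["path"]["root"]}/GTDBtk/{{IDs}}')
    benchmark:
        f'{config["path"]["root"]}/benchmarks/{{IDs}}.GTDBtk.benchmark.txt'
    message:
        """
        The folder dna_bins assumes subfolders containing dna bins for refined and reassembled bins.
        """
    shell:
        """
        # Activate the GTDB-Tk conda environment (nounset is relaxed while sourcing)
        set +u;source activate gtdbtk-tmp;set -u;
        # Point GTDB-Tk at its reference database (site-specific path)
        export GTDBTK_DATA_PATH=/g/scb2/patil/zorrilla/conda/envs/gtdbtk/share/gtdbtk-1.1.0/db/
        # Work in the scratch directory on a local copy of the sample's bins
        cd $SCRATCHDIR
        cp -r {input} .
        # Classify every bin with the .fa extension in the copied folder
        gtdbtk classify_wf --genome_dir $(basename {input}) --out_dir GTDBtk -x fa --cpus {config[cores][gtdbtk]}
        # Move the results into the final output directory
        mkdir -p {output}
        mv GTDBtk/* {output}
        """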
- Quality filter reads with fastp
- Assembly with megahit
- Draft bin sets with CONCOCT, MaxBin2, and MetaBAT2
- Refine & reassemble bins with metaWRAP
- Taxonomic assignment with GTDB-tk
- Relative abundances with bwa
- Reconstruct & evaluate genome-scale metabolic models with CarveMe and memote
- Species metabolic coupling analysis with SMETANA