1.1. Analysis Pipeline (FastDTLmapper)
This page explains the FastDTLmapper analysis pipeline in detail.
Input:
Species genomic protein CDS fasta files
(00_user_data/fasta/*.fa)
Output:
OrthoFinder ortholog group (OG) results
(01_orthofinder/*)
Method:
Group orthologous sequences using OrthoFinder.
Parameter(OrthoFinder):
MCL inflation = 1.5 (default)
Input:
Fasta file for each OG
(02_dtl_reconciliation/OGXXXXXXX/OGXXXXXXX.fa)
Output:
Aligned fasta file for each OG
(02_dtl_reconciliation/OGXXXXXXX/OGXXXXXXX_aln.fa)
Method:
Align the sequences of each OG using mafft.
Parameter(mafft):
--auto --anysymbol
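The alignment step can be sketched as a small command builder. This is only an illustration, not FastDTLmapper's actual code: the helper name `build_mafft_cmd` is hypothetical, and the file names simply follow the path pattern shown above.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def build_mafft_cmd(og_dir: Path) -> tuple[list[str], Path]:
    """Return the mafft argument list and the alignment output path for one OG.

    Hypothetical helper: assumes the OG directory is named after the OG ID
    and contains <OG ID>.fa, as in the paths listed above.
    """
    og_id = og_dir.name
    in_fa = og_dir / f"{og_id}.fa"
    out_fa = og_dir / f"{og_id}_aln.fa"
    # mafft prints the alignment to stdout, so the caller redirects it.
    cmd = ["mafft", "--auto", "--anysymbol", str(in_fa)]
    return cmd, out_fa


def run_mafft(og_dir: Path) -> Path:
    """Run mafft for one OG directory and write the aligned fasta file."""
    cmd, out_fa = build_mafft_cmd(og_dir)
    with open(out_fa, "w") as handle:
        subprocess.run(cmd, stdout=handle, check=True)
    return out_fa
```

Calling `run_mafft(Path("02_dtl_reconciliation/OGXXXXXXX"))` with a real OG ID requires mafft to be installed and on the `PATH`.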
Input:
Aligned fasta file for each OG
(02_dtl_reconciliation/OGXXXXXXX/OGXXXXXXX_aln.fa)
Output:
Trimmed alignment fasta file for each OG
(02_dtl_reconciliation/OGXXXXXXX/OGXXXXXXX_aln_trim.fa)
Method:
Trim the aligned sequences of each OG using trimal.
If trimal produces sequences consisting entirely of gaps during trimming, the untrimmed fasta file is used in the next step instead.
Parameter(trimal):
-automated1
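The fallback to the untrimmed alignment could look roughly like the sketch below. The function names are hypothetical, and the exact condition is an assumption: here the trimmed file is rejected if it is empty or if any sequence in it contains only gap characters.

```python
def read_fasta(path):
    """Parse a fasta file into a {header: sequence} dictionary."""
    records = {}
    header = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                header = line[1:]
                records[header] = ""
            elif header is not None:
                records[header] += line
    return records


def has_all_gap_sequence(path):
    """Return True if the file is empty or any sequence is made only of '-'.

    Assumption: one all-gap sequence is enough to reject the trimmed file.
    """
    records = read_fasta(path)
    return not records or any(set(seq) <= {"-"} for seq in records.values())


def choose_alignment(aln_fa, trim_fa):
    """Pick the alignment file to pass on to gene tree reconstruction."""
    return aln_fa if has_all_gap_sequence(trim_fa) else trim_fa
```

The caller would pass the `_aln.fa` and `_aln_trim.fa` paths of one OG and feed the returned path to the next step.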
Input:
Trimmed alignment fasta file for each OG
(02_dtl_reconciliation/OGXXXXXXX/OGXXXXXXX_aln_trim.fa)
Output:
Gene tree reconstruction results for each OG
(02_dtl_reconciliation/OGXXXXXXX/iqtree/*)
Method:
Reconstruct the gene tree of each OG using IQ-TREE.
If OG gene number >= 4:
Generate ultra-fast bootstrap gene trees with 1000 replicates.
If OG gene number == 3:
IQ-TREE cannot build a tree from only 3 genes.
Make a simple unrooted 3-gene tree such as "(gene1:0.1, gene2:0.1, gene3:0.1);" instead.
If OG gene number < 3:
Do nothing.
Parameter(IQ-TREE):
-m TEST -mset JTT,WAG,LG
--ufboot 1000 --boot-trees --wbtl
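The branching on gene number described in this step can be sketched as follows. The function names and the returned action labels (`"iqtree"`, `"fixed_tree"`, `"skip"`) are illustrative assumptions, not identifiers from FastDTLmapper; the 3-gene Newick string follows the example given above.

```python
def three_gene_newick(gene_names, branch_length=0.1):
    """Build a simple unrooted star tree for exactly three genes."""
    if len(gene_names) != 3:
        raise ValueError("exactly three gene names are required")
    leaves = ", ".join(f"{name}:{branch_length}" for name in gene_names)
    return f"({leaves});"


def plan_gene_tree(gene_names):
    """Decide how the gene tree of one OG is obtained.

    Returns an (action, newick) pair; newick is only set for 3-gene OGs.
    """
    n_genes = len(gene_names)
    if n_genes >= 4:
        # Enough genes for IQ-TREE with ultra-fast bootstrap.
        return "iqtree", None
    if n_genes == 3:
        # IQ-TREE is skipped; a fixed star tree is written instead.
        return "fixed_tree", three_gene_newick(gene_names)
    # Fewer than three genes: no gene tree is built.
    return "skip", None
```

For example, `plan_gene_tree(["gene1", "gene2", "gene3"])` returns `("fixed_tree", "(gene1:0.1, gene2:0.1, gene3:0.1);")`.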
Input:
Ultra-fast bootstrap gene tree file for each OG
(02_dtl_reconciliation/OGXXXXXXX/iqtree/OGXXXXXXX.ufboot)
Species tree file
(00_user_data/tree/ultrametric_nodeid_tree.nwk)
Output:
Multifurcation-corrected bootstrap gene tree file for each OG
(02_dtl_reconciliation/OGXXXXXXX/treerecs/OGXXXXXXX_multifurcate.ufboot_recs.nwk)
Method:
Correct multifurcations in the bootstrap trees of each OG using Treerecs.
The number of corrected bootstrap trees is limited to 100 to reduce the large computational cost of AnGST DTL reconciliation in the next step.
ℹ️ TIPS
IQ-TREE randomly bifurcates the topology of identical sequences, which increases the calculated DTL cost.
Treerecs correction helps minimize the DTL cost caused by such multifurcations.
Parameter(Treerecs):
--dupcost 2 --losscost 1 (default)
Treerecs multifurcated gene tree correction example
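Limiting the number of bootstrap trees to 100 might be done as in the sketch below. It assumes one Newick tree per line and keeps the first 100 trees; which trees FastDTLmapper actually keeps is not stated on this page, so the selection rule and the function name are assumptions.

```python
def limit_bootstrap_trees(newick_lines, max_trees=100):
    """Keep at most max_trees non-empty Newick lines, in their original order.

    Assumption: each line holds one tree and the first max_trees are kept.
    """
    trees = [line.strip() for line in newick_lines if line.strip()]
    return trees[:max_trees]
```

With the 1000 ultra-fast bootstrap replicates from the previous step, this would pass 100 trees per OG on to reconciliation.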
Input:
Multifurcation-corrected bootstrap gene tree file for each OG
(02_dtl_reconciliation/OGXXXXXXX/treerecs/OGXXXXXXX_multifurcate.ufboot_recs.nwk)
Species tree file
(00_user_data/tree/ultrametric_nodeid_tree.nwk)
Output:
DTL reconciliation result files for each OG
(02_dtl_reconciliation/OGXXXXXXX/angst/*)
Method:
Perform DTL reconciliation between the species tree and the bootstrap gene trees of each OG using AnGST.
ℹ️ TIPS
AnGST estimates DTL events only when the number of genes is three or more.
For OGs with fewer than three genes, DTL events are estimated in the next data aggregation step.
Parameter(AnGST):
dup_cost = 2, los_cost = 1, trn_cost = 3 (default)
timetree option = off (default)
Input:
DTL reconciliation result files for each OG
(02_dtl_reconciliation/OGXXXXXXX/angst/*)
Output:
Aggregated and mapped genome-wide DTL event result
(03_aggregate_map_result/*)
Method:
Aggregate and map the genome-wide DTL reconciliation results from the AnGST results.
If OG gene number >= 3:
Extract DTL event information from the AnGST result file.
If OG gene number == 2:
Estimate DTL events with a self-implemented, AnGST-like method.
The DTL cost of three scenarios is calculated and the scenario with the lowest cost is selected.
Case1: The gene is newly born in one leaf node, then a duplication occurs.
Case2: The gene is newly born in one internal node, then speciation and loss occur.
Case3: The gene is newly born in one leaf node, then a transfer to another leaf node occurs.
If OG gene number == 1:
Treat the gene as newly born in the node of its species.
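The lowest-cost selection for 2-gene OGs can be illustrated with a minimal sketch. The `Scenario` class and function names are hypothetical, and the cost weights reuse the AnGST defaults listed earlier (duplication 2, loss 1, transfer 3) as an assumption. The event counts in the usage example are placeholders: the real counts depend on where the two genes sit in the species tree, and not every case is necessarily applicable to a given OG.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Event counts of one candidate scenario (hypothetical container)."""
    name: str
    dup: int = 0
    los: int = 0
    trn: int = 0


def dtl_cost(scenario, dup_cost=2, los_cost=1, trn_cost=3):
    """Weighted sum of duplication, loss and transfer events."""
    return (scenario.dup * dup_cost
            + scenario.los * los_cost
            + scenario.trn * trn_cost)


def select_scenario(scenarios, **costs):
    """Return the scenario with the lowest DTL cost (first one wins ties)."""
    return min(scenarios, key=lambda s: dtl_cost(s, **costs))


# Placeholder event counts, for illustration only.
candidates = [
    Scenario("Case1", dup=1),  # born in a leaf, then duplication
    Scenario("Case2", los=4),  # born in an internal node, then losses
    Scenario("Case3", trn=1),  # born in a leaf, then transfer
]
best = select_scenario(candidates)
```

With these placeholder counts the costs are 2, 4 and 3, so `best.name` is `"Case1"`.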