This repository has been archived by the owner on Mar 16, 2022. It is now read-only.
I have been trying to run Falcon on a single machine, but the config file is hard to understand, and after 8 days (about 200 h) of running, the process is still at the read error-correction stage. For comparison, Canu completes this assembly in less than 2 days.
What would be the best config settings for assembling the Drosophila genome (~180 Mb) on a 48-core machine with 500 GB of RAM? This is the config I am currently using:
[General]
job_type = local
# list of fasta files
input_fofn = input.fofn
# input type, raw or pre-assembled reads (preads, error corrected reads)
input_type = raw
#input_type = preads
# The length cutoff used for seed reads used for error correction.
# "-1" indicates FALCON should calculate the cutoff using
# the user-defined genome length and coverage cut off
# otherwise, user can specify length cut off in bp (e.g. 2000)
length_cutoff = -1
genome_size = 180000000
seed_coverage = 30
# The length cutoff used for overlapping the preassembled reads
length_cutoff_pr = 12000
## resource usage ##
# need this line even if running local
jobqueue = your_queue
# grid settings for...
# daligner step of raw reads
sge_option_da = -pe smp 5 -q %(jobqueue)s
# las-merging of raw reads
sge_option_la = -pe smp 20 -q %(jobqueue)s
# consensus calling for preads
sge_option_cns = -pe smp 12 -q %(jobqueue)s
# daligner on preads
sge_option_pda = -pe smp 6 -q %(jobqueue)s
# las-merging on preads
sge_option_pla = -pe smp 16 -q %(jobqueue)s
# final overlap/assembly
sge_option_fc = -pe smp 24 -q %(jobqueue)s
# job concurrency settings for...
# all jobs
default_concurrent_jobs = 30
# preassembly
pa_concurrent_jobs = 30
# consensus calling of preads
cns_concurrent_jobs = 30
# overlap detection
ovlp_concurrent_jobs = 30
# daligner parameter options for...
# https://dazzlerblog.wordpress.com/command-guides/daligner-command-reference-guide/
# initial overlap of raw reads
pa_HPCdaligner_option = -v -B4 -t16 -e.70 -l1000 -s1000
# overlap of preads
ovlp_HPCdaligner_option = -v -B4 -t32 -h60 -e.96 -l500 -s1000
# parameters for creation of dazzler database of...
# https://dazzlerblog.wordpress.com/command-guides/dazz_db-command-guide/
# raw reads
pa_DBsplit_option = -x500 -s50
# preads
ovlp_DBsplit_option = -x500 -s50
# settings for consensus calling for preads
falcon_sense_option = --output_multi --min_idt 0.70 --min_cov 4 --max_n_read 200 --n_core 6
# setting for filtering of final overlap of preads
overlap_filtering_setting = --max_diff 100 --max_cov 100 --min_cov 20 --bestn 10 --n_core 24
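One thing that may be worth checking in a config like this is how many worker processes the consensus stage asks for compared with the cores actually available. The sketch below is only a rough estimate, not FALCON's own logic: it assumes that up to cns_concurrent_jobs consensus jobs run at the same time and that each one starts --n_core workers. The helper names (option_value, consensus_worker_demand, max_concurrent_jobs) are made up for illustration.

```python
import configparser

# Excerpt of the config posted above (only the keys this check reads).
CONFIG_TEXT = """
[General]
cns_concurrent_jobs = 30
falcon_sense_option = --output_multi --min_idt 0.70 --min_cov 4 --max_n_read 200 --n_core 6
"""

AVAILABLE_CORES = 48  # from the question: a 48-core machine


def option_value(option_string, flag, default=1):
    """Return the integer that follows `flag` in an option string, or `default`."""
    tokens = option_string.split()
    if flag in tokens:
        idx = tokens.index(flag)
        if idx + 1 < len(tokens):
            return int(tokens[idx + 1])
    return default


def consensus_worker_demand(config_text):
    """Estimate workers requested by the consensus stage.

    Assumption (not confirmed by the source): concurrent jobs times
    workers per job are all active at once.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    general = parser["General"]
    jobs = general.getint("cns_concurrent_jobs")
    n_core = option_value(general["falcon_sense_option"], "--n_core")
    return jobs * n_core


def max_concurrent_jobs(cores, workers_per_job):
    """Largest job count that keeps jobs * workers_per_job within `cores`."""
    return max(1, cores // workers_per_job)


if __name__ == "__main__":
    demand = consensus_worker_demand(CONFIG_TEXT)
    print(f"estimated consensus workers requested: {demand}")
    print(f"cores available: {AVAILABLE_CORES}")
    print(f"jobs that fit at 6 workers each: {max_concurrent_jobs(AVAILABLE_CORES, 6)}")
```

Under these assumptions the posted values request far more workers than there are cores, which could be one reason the run is slow. The same arithmetic can be repeated for the other *_concurrent_jobs settings; treat the result as a starting point for tuning, not as a tested recommendation.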