CycloSeq

Pipelines to process Cyclomics data.

This pipeline uses prior information from the backbone to increase the effectiveness of consensus calling from the circular DNA protocol by Cyclomics.

Dependencies

See Install-dependencies below for installation instructions:

Nextflow
Docker, Conda, or Apptainer/Singularity
Access to the GitHub repo and a valid personal access token (PAT)

Data requirements

FASTQ data output by ONT Guppy (SUP basecalling preferred for optimal results)
Reference genome, ideally pre-indexed with BWA to reduce runtime

System requirements

The pipeline expects at least 16 threads and 16 GB of RAM to be available. We recommend 64 GB of RAM, which decreases the runtime significantly.
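As a quick pre-flight check, the requirements above can be compared with what the machine reports. The sketch below is a hypothetical helper, not part of the pipeline; it assumes a Linux host with nproc and /proc/meminfo, and the function name check_resources is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper (not shipped with the pipeline).
# Usage: check_resources [min_threads] [min_ram_gb]
check_resources() {
    local min_threads="${1:-16}"
    local min_ram_gb="${2:-16}"
    local threads ram_kb ram_gb

    threads="$(nproc)"
    # MemTotal is reported in kB; fall back to 0 if /proc/meminfo is unavailable.
    ram_kb="$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null || true)"
    ram_kb="${ram_kb:-0}"
    # Round to the nearest GB, because the kernel reports slightly less than the installed RAM.
    ram_gb=$(( (ram_kb + 524288) / 1048576 ))

    echo "threads=${threads} ram_gb=${ram_gb}"
    if [ "$threads" -lt "$min_threads" ] || [ "$ram_gb" -lt "$min_ram_gb" ]; then
        echo "Below the minimum of ${min_threads} threads and ${min_ram_gb} GB RAM" >&2
        return 1
    fi
    return 0
}
```

Calling check_resources without arguments tests against the stated minimum of 16 threads and 16 GB; check_resources 16 64 tests against the recommended amount of RAM.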

Reference genome

The pipeline has been developed for amplicons that map against the provided reference.

We suggest using GRCh38.p14, since it works well with the VEP that is integrated in the pipeline. It is available via the commands below.

wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.29_GRCh38.p14/GCA_000001405.29_GRCh38.p14_genomic.fna.gz
gunzip GCA_000001405.29_GRCh38.p14_genomic.fna.gz

To reduce runtime, pre-index the reference genome with BWA or obtain a pre-indexed copy.
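Pre-indexing can be sketched as follows. This is a minimal example that assumes bwa is on the PATH and that the standard bwa index outputs (.amb, .ann, .bwt, .pac, .sa) next to the FASTA file are what the pipeline picks up; the helper names needs_bwa_index and index_reference are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) when a BWA index still has to be built.
# Assumes the standard `bwa index` output files sit next to the FASTA file.
needs_bwa_index() {
    local ref="$1"
    local ext
    for ext in amb ann bwt pac sa; do
        [ -s "${ref}.${ext}" ] || return 0
    done
    return 1
}

# Build the index only when at least one index file is missing or empty.
index_reference() {
    local ref="$1"
    if needs_bwa_index "$ref"; then
        bwa index "$ref"
    else
        echo "BWA index already present for ${ref}"
    fi
}
```

For the download shown earlier this would be called as index_reference GCA_000001405.29_GRCh38.p14_genomic.fna.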

Usage

Inside of EPI2ME Labs

This pipeline is compatible with the EPI2ME Labs platform by ONT. Please see ONT's installation guide.

Installation inside EPI2ME Labs:

Go to workflows by clicking on "installed workflows", or click the workflows icon in the top bar.
Click "Import workflow".
Paste "https://github.com/cyclomics/cyclomicsseq" into the text bar and click "Import workflow".

As a nextflow pipeline

This section assumes that you have Docker and Nextflow installed on your system. If so, running the pipeline is straightforward: you can run it directly from this repo, or pull it yourself and point Nextflow towards it.

nextflow run cyclomics/cyclomicsseq -r <pipeline version> -profile docker --input_read_dir '/sequencing/20220209_1609_X3_FAS06478_0ed4361c/fastq_pass/' --output_dir '/data/myresults' --reference '/data/reference/chm13v2.fasta' --backbone BB12
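For repeated runs it can be convenient to keep the pipeline parameters in a file instead of on the command line. The sketch below relies on Nextflow's generic -params-file option; the file name params.yaml is an arbitrary choice, the values are copied from the command above, and whether this pipeline has been tested with a params file is an assumption.

```shell
#!/usr/bin/env bash
# Write the run parameters to a YAML file (values copied from the command above).
cat > params.yaml <<'EOF'
input_read_dir: /sequencing/20220209_1609_X3_FAS06478_0ed4361c/fastq_pass/
output_dir: /data/myresults
reference: /data/reference/chm13v2.fasta
backbone: BB12
EOF

# Print the matching command; replace <pipeline version> before running it.
echo "nextflow run cyclomics/cyclomicsseq -r <pipeline version> -profile docker -params-file params.yaml"
```

Keeping the file next to the results records which inputs and backbone were used for a given run.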

Singularity

If Docker is not an option, Singularity (or Apptainer, as it has been called since Q2 2022) is a good alternative. It does not require root access and is therefore often used in shared compute environments.

The command becomes:

nextflow run cyclomics/cyclomicsseq -profile singularity ...

Please note that this assumes you have run the pipeline before. If not, add the -user flag as described in Usage.

Conda

The pipeline is fully compatible with Conda. This means the full command becomes:

nextflow run cyclomics/cyclomicsseq -profile conda ...

By default it uses the environment file that is shipped with the pipeline. This file is located in the repo; the pipeline needs to know where it is in order to run with the correct versions of the required software.

Flag descriptions

flag	info
--input_read_dir	Directory where the output fastqs of Guppy are located, e.g.: "/data/guppy/exp001/fastq_pass".
--read_pattern	Glob pattern used to find FASTQ files in the read directory, defaults to: "**.{fq,fastq,fq.gz,fastq.gz}".
--sequencing_summary_path	The summary file generated by Guppy. Optional, default: "sequencing_summary*.txt".
--backbone	Select a preset backbone.
--backbone_file	File to use as backbone when --backbone is none of the available presets, e.g. a FASTA file with a sequence named ">BB_custom". The name must start with BB for extraction reasons.
--reference	Path to the reference genome to use, will ingest all index files in the same directory.
--output_dir	Directory path where the results, including intermediate files, are stored.
--snp_filters.min_dir_ratio, --indel_filters.min_dir_ratio	Minimum ratio of variant-supporting reads in each direction (default: 0.001 (SNP); 0.002 (Indel)).
--snp_filters.min_dir_count, --indel_filters.min_dir_count	Minimum number of variant-supporting reads in each direction (default: 5).
--snp_filters.min_dpq, --indel_filters.min_dpq	Minimum positional depth after Q filtering (default: 5_000).
--snp_filters.min_dpq_n, --indel_filters.min_dpq_n	Number of flanking nucleotides around each position that determines the window size for local maxima calculation (default: 25).
--snp_filters.min_dpq_ratio, --indel_filters.min_dpq_ratio	Ratio of local depth maxima that determines the minimum depth at each position (default: 0.3).
--snp_filters.min_vaf, --indel_filters.min_vaf	Minimum variant allele frequency (default: 0.003 (SNP); 0.004 (Indel)).
--snp_filters.min_rel_ratio, --indel_filters.min_rel_ratio	Minimum relative ratio between forward and reverse variant-supporting reads (default: 0.3 (SNP); 0.4 (Indel)).
--snp_filters.min_abq, --indel_filters.min_abq	Minimum average base quality (default: 70).
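To make the interplay of a few of these thresholds concrete, the sketch below applies the SNP defaults for min_dir_count, min_vaf and min_rel_ratio to a toy table. It is an illustration under stated assumptions, not the pipeline's implementation: the input layout (position, depth, forward and reverse variant-supporting reads, tab-separated) is hypothetical, VAF is assumed to be the variant-supporting reads divided by depth, and the relative ratio is assumed to be the smaller direction count divided by the larger one. The other filters in the table are not modelled.

```shell
#!/usr/bin/env bash
# Hypothetical input, tab-separated: position, depth, fwd_alt_reads, rev_alt_reads.
# Prints the positions that pass three of the SNP filters at their default values.
filter_snps() {
    awk -F'\t' -v min_dir_count=5 -v min_vaf=0.003 -v min_rel_ratio=0.3 '
        {
            depth = $2 + 0; fwd = $3 + 0; rev = $4 + 0
            if (depth <= 0) next
            # Assumed definition: variant allele frequency over total depth.
            vaf = (fwd + rev) / depth
            # Assumed definition: smaller direction count relative to the larger one.
            lo = (fwd < rev) ? fwd : rev
            hi = (fwd > rev) ? fwd : rev
            rel = (hi > 0) ? lo / hi : 0
            if (fwd >= min_dir_count && rev >= min_dir_count &&
                vaf >= min_vaf && rel >= min_rel_ratio)
                print $1
        }'
}
```

With these assumptions, a position at depth 10000 supported by 20 forward and 15 reverse reads passes (VAF 0.0035, relative ratio 0.75), while one supported by 40 forward and 4 reverse reads fails on the per-direction count.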

Using alternative Backbones

Due to lab conditions a different backbone might be used. The --backbone parameter can be set to any FASTA file. The following presets are available in the pipeline; enable one by copying the entry from the value column and appending it to the CLI command.

backbone	value	default
BB22	--backbone BB22
BB25	--backbone BB25
BB41	--backbone BB41
BB42	--backbone BB42	X
BBCS	--backbone BBCS
BBCR	--backbone BBCR
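When supplying a custom backbone file, the flag descriptions state that sequence names must start with BB. A minimal sketch of a check for that rule is shown below; the function name check_backbone_names is hypothetical and the check only inspects FASTA header lines.

```shell
#!/usr/bin/env bash
# Hypothetical check: report every FASTA header that does not start with ">BB".
# Exit status 0 means all sequence names are acceptable.
check_backbone_names() {
    local fasta="$1"
    local bad

    if ! grep -q '^>' "$fasta"; then
        echo "No FASTA header found in ${fasta}" >&2
        return 1
    fi
    # `|| true` keeps the assignment from failing when every header is fine.
    bad="$(grep '^>' "$fasta" | grep -v '^>BB' || true)"
    if [ -n "$bad" ]; then
        echo "Sequence names that do not start with BB:" >&2
        echo "$bad" >&2
        return 1
    fi
    return 0
}
```

A file whose only header is ">BB_custom" passes, whereas a header such as ">custom_backbone" is reported.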

Roadmap / Todo:

Performance:

Multi-threaded variant calling

Changelog

Please see CHANGELOG.md

Install-dependencies

Nextflow

Download the latest version by running the example below:

wget -qO- https://get.nextflow.io | bash

Or see the official Nextflow documentation.

Conda

Download the latest Miniconda installer from the official conda documentation.

Run the command below and follow the prompts:

bash Miniconda3-latest-Linux-x86_64.sh

Apptainer/Singularity

Download and install Apptainer by running the example below, which uses release v1.1.0-rc.2:

wget https://github.com/apptainer/apptainer/releases/download/v1.1.0-rc.2/apptainer_1.1.0-rc.2_amd64.deb
sudo apt-get install -y ./apptainer_1.1.0-rc.2_amd64.deb

For up-to-date information, see the official Apptainer documentation.

Running on a SLURM cluster such as UMCU HPC

Log in to the HPC using SSH, then start an interactive job with:

srun --job-name "InteractiveJob" --cpus-per-task 16 --mem=32G --gres=tmpspace:450G --time 24:00:00 --pty bash

Go to the right project folder:

cd /hpc/compgen/projects/cyclomics/cycloseq/pipelines/cycloseq/

Start the pipeline as normal.
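For unattended runs, the same resources can be requested through a batch job instead of an interactive session. The sketch below writes a job script that mirrors the srun options above; the script name run_cyclomicsseq.sh, the PIPELINE_VERSION variable and the choice of the singularity profile are assumptions, and the tmpspace resource is specific to the cluster.

```shell
#!/usr/bin/env bash
# Write a batch script that mirrors the resources of the interactive job above.
cat > run_cyclomicsseq.sh <<'EOF'
#!/usr/bin/env bash
#SBATCH --job-name=cyclomicsseq
#SBATCH --cpus-per-task=16
#SBATCH --mem=32G
#SBATCH --gres=tmpspace:450G
#SBATCH --time=24:00:00

# PIPELINE_VERSION must be set by the caller; all other pipeline flags are passed through.
nextflow run cyclomics/cyclomicsseq -r "${PIPELINE_VERSION:?set PIPELINE_VERSION}" \
    -profile singularity "$@"
EOF

# Submit with, for example:
#   PIPELINE_VERSION=<pipeline version> sbatch run_cyclomicsseq.sh --input_read_dir ... --output_dir ... --reference ... --backbone ...
```

Arguments given after the script name are handed to the script by sbatch, so the pipeline flags stay the same as in an interactive run.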

Developer notes

Cycas addition to the repo

Cycas was added as a git subtree, following: https://gist.github.com/SKempin/b7857a6ff6bddb05717cc17a44091202. A subtree was chosen over a submodule to make pulling the repo easier for end users and to stay compatible with nextflow run functionality.

More specifically:

git subtree add --prefix Cycas https://github.com/cyclomics/Cycas 0.4.3 --squash

To update, run:

git subtree pull --prefix Cycas https://github.com/cyclomics/Cycas <tag> --squash
