New pipeline: nf-core/nf-detect-seq #129

@emmcauley

Pipeline title/name

detect-seq

Keywords

genomics, base editor, crispr, off-targets

What is it about?

A pipeline to process Detect-seq sequencing data (generated via dU chemical labeling and biotin pulldown) to identify genome-wide off-target editing sites by programmable base editors.

Please provide a schematic diagram of the proposed pipeline

Image

What would a minimal first release of this pipeline include?

Adapter trimming (cutadapt), genomic alignment (HISAT-3N), and recovery of low-quality and unmapped reads via samtools. The custom Python scripts from the original publication and pipeline can be used as-is for the MVP.

Plots are out of scope for the minimal first release.
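
To make the read-recovery step concrete, here is a minimal sketch of how reads could be selected for recovery from SAM records. It assumes "low-quality" means a mapping quality (MAPQ) below a cutoff; the cutoff of 20 and the function names are illustrative assumptions, not taken from the original Detect-seq pipeline. The only facts relied on are from the SAM format: FLAG bit 0x4 marks an unmapped read, and MAPQ is the fifth column.

```python
# Illustrative only: the MAPQ cutoff and function names are assumptions,
# not taken from the original Detect-seq pipeline.
UNMAPPED_FLAG = 0x4  # SAM FLAG bit: segment unmapped


def needs_recovery(sam_line: str, min_mapq: int = 20) -> bool:
    """Return True if a SAM record is unmapped or has MAPQ below min_mapq."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # column 2: bitwise FLAG
    mapq = int(fields[4])  # column 5: mapping quality
    return bool(flag & UNMAPPED_FLAG) or mapq < min_mapq


def select_for_recovery(sam_lines, min_mapq: int = 20):
    """Yield names of reads to recover, skipping blank and header ('@') lines."""
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        if needs_recovery(line, min_mapq):
            yield line.split("\t", 1)[0]


# Toy records: one well-mapped read, one unmapped read, one low-MAPQ read.
example = [
    "@HD\tVN:1.6\tSO:unsorted",
    "read1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "read3\t16\tchr2\t500\t3\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
]
print(list(select_for_recovery(example)))  # ['read2', 'read3']
```

In the pipeline itself this selection would be done with samtools rather than Python (for instance, `samtools view -f 4` keeps only unmapped reads); the sketch just spells out the criterion.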

I confirm my proposed pipeline will follow nf-core guidelines. Most importantly, my pipeline will:

  • be built with Nextflow.
  • pass nf-core lint tests and use standardized parameters.
  • be community-owned and developed within the nf-core organization.
  • be open source under the MIT license with proper credits and acknowledgments.
  • have a descriptive, all-lowercase name without punctuation.
  • use the nf-core pipeline template and predominantly use official nf-core modules.
  • focus on a specific data/analysis type with appropriate scope.
  • have properly maintained documentation.
  • be bundled using versioned Docker/Singularity containers.

Why do we need a new pipeline?

While a Snakemake pipeline exists to support this type of analysis, there is no Nextflow equivalent as far as I know. For a tool like Detect-seq that's relevant to therapeutic base editing, having a Nextflow version could significantly expand its user base in pharma and clinical research environments where Nextflow is often the institutional standard.

Several additions to the original pipeline can be made with standard nf-core building blocks (e.g., MultiQC, regression testing with nf-test) to facilitate reproducibility, quality control assessment, and iteration.

Who would be interested?

CRISPR/base editing researchers: Anyone developing or benchmarking new programmable base editors (e.g., CBE variants such as BE3 and BE4, or ABE variants) could use this pipeline to characterize their editor's off-target profile before publication or therapeutic application.

Detect-seq data and analysis could potentially be used as a comparator alongside other approaches like CIRCLE-seq, GUIDE-seq, or CRISPResso2.

What has been done so far

I have created: 1) a HISAT-3N ALIGN nf-core module, 2) a HISAT-3N BUILD nf-core module, and 3) a samclip nf-core module.

URL to existing work (if applicable)

No response

Are there any similar existing nf-core pipelines?

No response
