ZARP (Zavolab Automated RNA-seq Pipeline) is a generic RNA-Seq analysis workflow that allows users to process and analyze Illumina short-read sequencing libraries with minimum effort. Better yet: With our companion ZARP-cli command line interface, you can start ZARP runs with the simplest and most intuitive commands.
RNA-seq analysis doesn't get simpler than that!
The workflow is developed in Snakemake, a widely used workflow management system in the bioinformatics community. ZARP will pre-process, align and quantify your single- or paired-end stranded bulk RNA-seq libraries with publicly available, state-of-the-art bioinformatics tools. ZARP's browser-based rich reports and visualizations will give you meaningful initial insights into the quality and composition of your sequencing experiments - fast and simple. Whether you are an experimentalist struggling with large-scale data analysis or an experienced bioinformatician, when there's RNA-seq data to analyze, just zarp 'em!
For the full documentation please visit the ZARP website.
IMPORTANT: Rather than installing the ZARP workflow as described in this section, we recommend installing ZARP-cli for most use cases! If you follow its installation instructions, you can skip the instructions below.
Quick installation requires the following:
- Linux
- Git
- Conda >= 24.11.3
- Optional: Apptainer >= 1.3.6 (required only if you want to use containers to manage tool environments)
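
To quickly confirm that the prerequisites are available on your system, you can query their versions first; a minimal sketch (the Apptainer check only matters if you plan on container-based execution):

```bash
# Check that the required tools are on the PATH and report their versions.
git --version
conda --version
# Only needed if you want to run ZARP with containers:
apptainer --version
```

Then clone the repository and create the Conda environment: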
```bash
git clone https://github.com/zavolanlab/zarp.git
cd zarp
conda env create -f install/environment.yml
conda activate zarp
```
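
If the environment was created and activated successfully, Snakemake should now be available; a quick way to confirm this (the exact version printed depends on the environment file):

```bash
# Should print the Snakemake version installed in the zarp environment.
snakemake --version
```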
You can trigger ZARP without ZARP-cli. This is convenient for users who have some experience with Snakemake and don't want to use a CLI to trigger their runs. Extensive instructions are available in the usage documentation; the basic steps to trigger a run are outlined below.
- Assuming that your current directory is the workflow repository's root directory, create a directory for your workflow run and move into it with:

  ```bash
  mkdir config/my_run
  cd config/my_run
  ```
- Create an empty sample table and a workflow configuration file:

  ```bash
  touch samples.tsv
  touch config.yaml
  ```
- Use your editor of choice to populate these files with appropriate values. Have a look at the examples in the `tests/` directory to see what the files should look like.
- Create a runner script. Pick one of the following options for either local or cluster execution. Before executing the respective command, remember to update the argument of the `--apptainer-args` option of the respective profile (file: `profiles/{profile}/config.yaml`) so that it contains a comma-separated list of all directories containing input data files (samples, annotation files, etc.) required for your run (see the sketch after this list).

  Runner script for local execution:

  ```bash
  cat << "EOF" > run.sh
  #!/bin/bash
  snakemake \
      --profile="../../profiles/local-apptainer" \
      --configfile="config.yaml"
  EOF
  ```

  OR

  Runner script for Slurm cluster execution (note that you may need to modify the arguments to `--jobs` and `--cores` in the file `profiles/slurm-apptainer/config.yaml`, depending on your HPC and workload manager configuration):

  ```bash
  cat << "EOF" > run.sh
  #!/bin/bash
  mkdir -p logs/cluster_log
  snakemake \
      --profile="../profiles/slurm-apptainer" \
      --configfile="config.yaml"
  EOF
  ```
  Note: When running the pipeline with Conda, you should use the `local-conda` and `slurm-conda` profiles instead.

  Note: The Slurm profiles are adapted to a cluster that uses the quality-of-service (QOS) keyword. If QOS is not supported by your Slurm instance, you have to remove all lines containing "qos" in `profiles/slurm-config.json`.

- Start your workflow run:

  ```bash
  bash run.sh
  ```
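
As an illustration of the bind-directory requirement mentioned in the runner-script step above, the same setting can also be passed on the command line, where it should take precedence over the profile value in recent Snakemake versions. This is only a sketch: the paths below are placeholders, so replace them with the directories that actually hold your samples and annotation files, or simply edit the profile as described above.

```bash
# Hypothetical variant of run.sh that supplies the Apptainer bind paths
# directly instead of editing profiles/local-apptainer/config.yaml.
# /path/to/samples and /path/to/annotations are placeholders.
cat << "EOF" > run.sh
#!/bin/bash
snakemake \
    --profile="../../profiles/local-apptainer" \
    --configfile="config.yaml" \
    --apptainer-args="--bind /path/to/samples,/path/to/annotations"
EOF
```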
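
Before launching the full run, it can also be useful to let Snakemake perform a dry run, which resolves the sample table and configuration and lists the jobs that would be executed without running anything. A minimal sketch, assuming the local Apptainer profile and the files created above:

```bash
# List the jobs ZARP would run for this configuration without executing them.
snakemake \
    --profile="../../profiles/local-apptainer" \
    --configfile="config.yaml" \
    --dry-run
```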
This project lives off your contributions, be it in the form of bug reports, feature requests, discussions, or fixes and other code changes. Please refer to the contributing guidelines if you are interested in contributing. Please mind the code of conduct for all interactions with the community.

For questions or suggestions regarding the code, please use the issue tracker. For any other inquiries, please contact us by email.