This project provides an automated pipeline for RNA-Seq analysis using READemption and Docker. It handles everything from environment setup and data downloading to differential gene expression analysis and visualization.
.
├── docker_installation_linux.sh # Script to install Docker on Linux Mint/Ubuntu
├── rnaseq.sh # Main analysis pipeline script
└── READemption_analysis/ # Output directory (created during runtime)
    ├── input/ # Raw reads and reference sequences
    └── output/ # Analysis results and plots
The RNA-seq data and analysis pipeline in this repository are associated with the accompanying manuscript:
- File: manuscript.pdf
- Context: This pipeline replicates/tests the data analysis presented in the study. Please refer to the manuscript for detailed biological context and experimental design.
- OS: Linux (Tested on Linux Mint 22.3 / Ubuntu Noble)
- Permissions: Sudo privileges are required for installation and running Docker.
If you haven't installed Docker yet, use the provided setup script. This script installs Docker Engine, CLI, and necessary plugins.
chmod +x docker_installation_linux.sh
./docker_installation_linux.sh
Note: You may need to log out and log back in for group permission changes to take effect.
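To confirm that the group change mentioned in the note has taken effect, a check along the following lines may help. This is a minimal sketch, not part of the provided scripts: it assumes the setup script adds the user to a group named docker (the usual Docker convention, not confirmed here), and has_group is a hypothetical helper introduced only for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if group $2 appears in the
# space-separated group list $1.
has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: the installer adds the user to a group called "docker".
if ! command -v docker >/dev/null 2>&1; then
    echo "docker was not found on PATH"
elif has_group "$(id -nG)" docker; then
    echo "current session is in the docker group"
else
    echo "current session is not in the docker group; log out and back in, or use sudo"
fi
```

If the last message appears right after installation, logging out and back in is usually enough for the new group membership to be picked up.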
The rnaseq.sh script automates the entire analytical workflow.
chmod +x rnaseq.sh
./rnaseq.sh
The pipeline performs the following steps automatically:
- Setup: Creates the project directory structure for READemption.
- Data Acquisition:
  - Downloads the Methanosarcina mazei reference genome (FASTA) and annotations (GFF) from NCBI.
  - Downloads raw RNA-seq reads (SRR4018514 - SRR4018517) using sra-tools via Docker.
- Preprocessing:
  - Compresses FASTQ files.
  - Renames the single-end read files by sample name and replicate (e.g., wt_R1, mut_R1).
- Analysis (via READemption Docker container):
  - Alignment: Maps reads to the reference genome using segemehl.
  - Coverage: Calculates nucleotide-wise coverage.
  - Gene Quantification: Counts reads per feature (CDS, tRNA, rRNA).
  - Differential Expression: Performs DE analysis using DESeq2, comparing Mutant (mut) vs Wild Type (wt).
- Visualization: Generates plots for alignment statistics, gene quantification, and differential expression.
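The Analysis and Visualization steps map onto READemption subcommands run inside the container. The dry-run sketch below only prints the commands, so it can be read without Docker installed. It is an illustration and not the contents of rnaseq.sh: the /data mount point, the use of the 2.x-style --project_path option, and the print_plan helper are assumptions, and the extra options that real runs need (for example the library and condition lists for deseq) are left out.

```shell
#!/bin/sh
# Dry-run sketch: prints docker commands instead of executing them.
IMAGE="tillsauerwein/reademption:latest"   # image named in this README
PROJECT="READemption_analysis"             # project directory from this README

# READemption subcommands in the order of the steps listed above.
STEPS="align coverage gene_quanti deseq viz_align viz_gene_quanti viz_deseq"

# Hypothetical helper that prints one command per step.
print_plan() {
    for step in $STEPS; do
        # Assumption: the current directory is mounted at /data in the container.
        echo "docker run --rm -v \"\$PWD\":/data -w /data $IMAGE reademption $step --project_path $PROJECT"
    done
}

print_plan
```

Printing the plan first makes it easy to compare the intended order against what rnaseq.sh actually runs before committing to a long alignment job.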
- SRA Tools: ncbi/sra-tools (for downloading reads)
- READemption: tillsauerwein/reademption:latest (for analysis)
Results will be located in the READemption_analysis folder:
- Coverage tracks: output/coverage/
- Gene counts: output/gene_quanti/
- DESeq2 results: output/deseq/
- Plots: output/viz_align/, output/viz_gene_quanti/, output/viz_deseq/
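After a run, a quick way to see whether every result folder listed above was produced is a small check like the following. This is a sketch that relies only on the folder names given in this README; check_outputs is a hypothetical helper, and the exact folder layout may differ between READemption versions.

```shell
#!/bin/sh
# Hypothetical helper: reports which expected result folders exist under
# <project>/output and succeeds only if all of them are present.
check_outputs() {
    project="$1"
    missing=0
    for d in coverage gene_quanti deseq viz_align viz_gene_quanti viz_deseq; do
        if [ -d "$project/output/$d" ]; then
            echo "found:   output/$d"
        else
            echo "missing: output/$d"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

check_outputs READemption_analysis || echo "some result folders are missing; check the pipeline log"
```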