7 changes: 7 additions & 0 deletions .gitignore
@@ -1,3 +1,10 @@
index_files/*.csv
index_files/*.xlsx
raw_data/*.gz
.nextflow/*
.nextflow.log*
work/*
results/*
bk/
.nextflow/
.nextflow.log*
51 changes: 0 additions & 51 deletions METHODS.md
@@ -24,8 +24,6 @@ The merged workflow supports two input modes:
- `src/rewrite_fastq_barcodes.cpp`: fast C++ implementation for barcode rewriting.
- `tools/build_rewrite_fastq_barcodes.sh`: build script for the C++ binary.
- `envs/cuttag-preprocess.yml`: Conda environment for all required tools.
- `containers/cuttag-preprocess.sif`: default Singularity/Apptainer image path used by the config.
- `containers/cuttag-preprocess.def`: definition file used to build the Singularity/Apptainer image.

## Required software

@@ -44,30 +42,6 @@ For faster barcode rewriting, build the compiled helper once:

The wrapper used by Nextflow prefers the compiled binary and falls back to Python only if the binary is unavailable.
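
The fallback behaviour described above can be sketched roughly as follows. This is an illustration only, not the contents of `bin/rewrite_fastq_barcodes`: the function names `select_rewriter` and `run_rewriter` are hypothetical, and the paths to the compiled binary and the Python script are passed in as arguments because the real locations are not shown in this diff.

```shell
# Hypothetical sketch of a "prefer the binary, fall back to Python" wrapper.
# Function names and argument layout are assumptions for illustration.

# select_rewriter BIN: print "binary" if BIN is an executable file, else "python".
select_rewriter() {
    if [ -f "$1" ] && [ -x "$1" ]; then
        echo binary
    else
        echo python
    fi
}

# run_rewriter BIN PY [ARGS...]: run the compiled helper when it is usable,
# otherwise run the Python script with the same arguments.
run_rewriter() {
    bin="$1"
    py="$2"
    shift 2
    case "$(select_rewriter "$bin")" in
        binary) "$bin" "$@" ;;
        python) python3 "$py" "$@" ;;
    esac
}
```

Keeping the selection in its own function makes the choice easy to check without running either implementation.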

If you use `singularity` or `apptainer`, the default image path is `containers/cuttag-preprocess.sif`.

You can build that image from the repository with:

```bash
containers/build_singularity_image.sh
```

or manually on a local Linux machine with root or sudo privileges:

```bash
sudo singularity build containers/cuttag-preprocess.sif containers/cuttag-preprocess.def
```

That image should contain:

- `python3`
- `cutadapt`
- `bowtie2`
- `samtools`
- `bedtools`
- `bgzip`
- `bedGraphToBigWig`

## Required inputs

- `--input_mode`: `paired_fastq` or `demux`.
@@ -184,29 +158,6 @@ nextflow run main.nf -profile slurm,conda \
--out_dir results
```

Run with Singularity or Apptainer:

```bash
nextflow run main.nf -profile singularity \
--input_dir /path/to/fastq \
--barcode_matrix /path/to/barcode_matrix.csv \
--ref /path/to/bowtie2/index_basename \
--chrom_sizes /path/to/genome.chrom.sizes \
--out_dir results
```

Override the default Singularity image location if needed:

```bash
nextflow run main.nf -profile singularity \
--singularity_image /path/to/container.sif \
--input_dir /path/to/fastq \
--barcode_matrix /path/to/barcode_matrix.csv \
--ref /path/to/bowtie2/index_basename \
--chrom_sizes /path/to/genome.chrom.sizes \
--out_dir results
```

## Notes

- The pipeline keeps the same adapter sequence and Bowtie2 arguments used in the existing shell script.
@@ -216,6 +167,4 @@ nextflow run main.nf -profile singularity \
- The barcode rewrite step is a required part of the workflow and prefers a compiled C++ implementation for speed, while preserving the original Python code as a fallback.
- The alignment step preserves the original high-memory setting (`256 GB`, `16 CPUs`, `18h`) but these can be changed in `nextflow.config`.
- Intermediate files are published into subdirectories under `--out_dir`.
- The Singularity and Apptainer profiles use `containers/cuttag-preprocess.sif` by default and can be overridden with `--singularity_image`.
- The Singularity and Apptainer profiles bind `./data` from the Nextflow launch directory by default. Override with `--container_bind_paths` if references or data are stored elsewhere.
- Sample-name filtering is configurable through `--enable_sample_filter` and `--skip_patterns`, but no samples are excluded unless patterns are provided explicitly.
80 changes: 1 addition & 79 deletions README.md
@@ -20,16 +20,13 @@ The pipeline covers:
## Main Files

- `main.nf`: main Nextflow workflow
- `nextflow.config`: runtime profiles and resource defaults
- `nextflow.config.temp`: template for runtime profiles and resource defaults; copy it to `nextflow.config` and edit as needed
- `bin/rewrite_fastq_barcodes`: wrapper that prefers the compiled barcode-rewrite binary
- `bin/rewrite_fastq_barcodes.py`: barcode rewrite helper extracted from the notebook
- `bin/modify_scict_header.sh`: generic header normalizer used for sciCT demultiplexing mode
- `src/rewrite_fastq_barcodes.cpp`: fast C++ implementation of barcode rewriting
- `tools/build_rewrite_fastq_barcodes.sh`: build script for the C++ binary
- `envs/cuttag-preprocess.yml`: Conda environment definition
- `envs/cuttag-preprocess-container.yml`: lighter Conda environment used inside the Singularity/Apptainer image
- `containers/cuttag-preprocess.def`: Singularity/Apptainer definition file
- `containers/cuttag-preprocess.sif`: default Singularity image path expected by the config
- `METHODS.md`: extended workflow notes and examples

## Faster Barcode Rewriting
@@ -183,79 +180,6 @@ nextflow run main.nf -profile slurm,conda \
--out_dir results
```

### Singularity / Apptainer

Build the image from the included definition file:

```bash
containers/build_singularity_image.sh
```

On a local Linux machine where you have sudo privileges, the helper runs:

```bash
sudo singularity build containers/cuttag-preprocess.sif containers/cuttag-preprocess.def
```

If you use Apptainer instead:

```bash
sudo apptainer build containers/cuttag-preprocess.sif containers/cuttag-preprocess.def
```

If you cannot build locally, use a remote builder or ask your HPC admins to build it:

```bash
singularity build --remote containers/cuttag-preprocess.sif containers/cuttag-preprocess.def
```

Equivalent helper command:

```bash
containers/build_singularity_image.sh remote
```

Then run:

```bash
nextflow run main.nf -profile singularity \
--input_dir /path/to/fastq \
--barcode_matrix /path/to/barcode_matrix.csv \
--ref /path/to/bowtie2/index_basename \
--chrom_sizes /path/to/genome.chrom.sizes \
--out_dir results
```

To use a different image location:

```bash
nextflow run main.nf -profile singularity \
--singularity_image /path/to/container.sif \
--input_dir /path/to/fastq \
--barcode_matrix /path/to/barcode_matrix.csv \
--ref /path/to/bowtie2/index_basename \
--chrom_sizes /path/to/genome.chrom.sizes \
--out_dir results
```

If your references or data are outside the working directory, bind those filesystem roots into the container. The default bind path is `./data` relative to the directory where you launch Nextflow.

```bash
nextflow run main.nf -profile singularity \
--container_bind_paths ./data \
--input_dir /path/to/fastq \
--barcode_matrix /path/to/barcode_matrix.csv \
--ref /varidata/research/projects/bbc/versioned_references/latest/data/hg38_gencode/indexes/bowtie2/hg38_gencode \
--chrom_sizes /varidata/research/projects/bbc/versioned_references/2024-10-31_10.56.03_v17/data/hg38_gencode/sequence/hg38_gencode.fa.fai \
--out_dir results
```

For multiple roots, provide a comma-separated list:

```bash
--container_bind_paths ./data,/scratch,/home
```

## Defaults

- sample-name filtering is available but no filename patterns are excluded unless `--skip_patterns` is provided
@@ -269,6 +193,4 @@ For multiple roots, provide a comma-separated list:

- `nextflow` must be installed on the host system.
- The Conda profile creates the software environment automatically from `envs/cuttag-preprocess.yml`.
- The Singularity and Apptainer profiles use `containers/cuttag-preprocess.sif` by default.
- Singularity and Apptainer bind `./data` from the launch directory by default; override with `--container_bind_paths` if needed.
- Sample filtering is controlled by `--enable_sample_filter` and `--skip_patterns`.
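
The Main Files list asks for `nextflow.config.temp` to be copied to `nextflow.config` before editing. A minimal sketch of that step is shown here, assuming it is run from the repository root; `init_config` is a hypothetical helper, not something shipped with the pipeline, and it deliberately refuses to overwrite an existing config so local edits are kept.

```shell
# Hypothetical helper: create nextflow.config from the template once.
# init_config [TEMPLATE] [TARGET]; both default to the file names used in this PR.
init_config() {
    template="${1:-nextflow.config.temp}"
    target="${2:-nextflow.config}"
    if [ ! -f "$template" ]; then
        echo "template not found: $template" >&2
        return 1
    fi
    if [ -e "$target" ]; then
        # Never overwrite an existing config, so local edits survive.
        echo "keeping existing $target" >&2
        return 0
    fi
    cp "$template" "$target"
}
```

Calling `init_config` with no arguments covers the default case; the explicit arguments are there mainly to make the helper easy to try in a scratch directory.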
51 changes: 0 additions & 51 deletions containers/README.md

This file was deleted.

40 changes: 0 additions & 40 deletions containers/build_singularity_image.sh

This file was deleted.

42 changes: 0 additions & 42 deletions containers/cuttag-preprocess.def

This file was deleted.

18 changes: 0 additions & 18 deletions method.txt

This file was deleted.

24 changes: 0 additions & 24 deletions nextflow.config → nextflow.config.temp
@@ -1,6 +1,5 @@
manifest {
name = 'cocnt-cuttag-preprocess'
author = 'OpenAI Codex'
homePage = '.'
description = 'Nextflow pipeline for CoCnT/CUT&Tag barcode rewriting, trimming, alignment, BED, and BigWig generation.'
version = '0.1.0'
@@ -30,10 +29,6 @@ params {
chrom_sizes = null
publish_mode = 'copy'
conda_env = "${projectDir}/envs/cuttag-preprocess.yml"
singularity_image = "${projectDir}/containers/cuttag-preprocess.sif"
// Bind the launch directory's data/ tree by default so local references,
// annotations, and test assets are visible inside Singularity/Apptainer.
container_bind_paths = "${launchDir}/data"
}

process {
@@ -69,30 +64,11 @@ profiles {

slurm {
process.executor = 'slurm'
singularity.enabled = false
conda.enabled = false
}

conda {
conda.enabled = true
process.conda = params.conda_env
}

singularity {
singularity.enabled = true
apptainer.enabled = false
singularity.autoMounts = true
// Additional roots can be supplied with --container_bind_paths.
singularity.runOptions = "--bind ${params.container_bind_paths}"
process.container = params.singularity_image
}

apptainer {
apptainer.enabled = true
singularity.enabled = false
apptainer.autoMounts = true
// Additional roots can be supplied with --container_bind_paths.
apptainer.runOptions = "--bind ${params.container_bind_paths}"
process.container = params.singularity_image
}
}
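
Since this change drops the `singularity` and `apptainer` profiles along with their parameters, one possible sanity check is to search the remaining files for leftover container references. The sketch below is a suggestion under that assumption; `has_container_refs` is a hypothetical helper name, and the file list in the usage note is only an example.

```shell
# Hypothetical check: succeed if FILE still mentions singularity or apptainer
# (case-insensitive), fail otherwise.
has_container_refs() {
    grep -qiE 'singularity|apptainer' "$1"
}
```

For example, `for f in nextflow.config.temp README.md METHODS.md; do has_container_refs "$f" && echo "leftover reference in $f"; done` would list any file that still needs cleanup.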