A CWL pipeline for processing CODEX image data, using Cytokit.
- Collect required parameters from metadata files.
- Perform illumination correction with Fiji plugin BaSiC
- Find sharpest z-plane for each channel, using variation of Laplacian
- Perform stitching of tiles using Fiji plugin BigStitcher
- Create Cytokit YAML config file containing parameters from input metadata
- Run Cytokit's processor command to perform tile pre-processing, and nucleus and cell segmentation.
- Run Cytokit's operator command to extract all antigen fluorescence images (discarding blanks and empty channels).
- Generate OME-TIFF versions of TIFFs created by Cytokit.
- Stitch tiles with segmentation masks
- Perform downstream analysis using SPRM.
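The sharpest-plane step can be illustrated with a short sketch. It assumes that "variation of Laplacian" refers to the commonly used variance-of-Laplacian focus measure. The function names `laplacian_variance` and `sharpest_plane` are hypothetical and are not taken from this repository; the pipeline's own implementation may differ.

```python
import numpy as np


def laplacian_variance(plane: np.ndarray) -> float:
    """Focus score: variance of the 4-neighbour discrete Laplacian.

    Hypothetical helper for illustration; higher values mean more
    fine detail, which usually indicates a sharper image.
    """
    img = plane.astype(np.float64)
    # Discrete Laplacian evaluated on interior pixels only (borders skipped).
    lap = (
        img[:-2, 1:-1] + img[2:, 1:-1]
        + img[1:-1, :-2] + img[1:-1, 2:]
        - 4.0 * img[1:-1, 1:-1]
    )
    return float(lap.var())


def sharpest_plane(stack: np.ndarray) -> int:
    """Return the 1-based index of the best-focused plane.

    `stack` is assumed to have shape (z, y, x) and hold one channel.
    """
    scores = [laplacian_variance(plane) for plane in stack]
    return int(np.argmax(scores)) + 1


if __name__ == "__main__":
    yy, xx = np.indices((16, 16))
    sharp = ((yy + xx) % 2).astype(float)  # checkerboard: lots of detail
    flat = np.full((16, 16), 0.5)          # featureless plane
    print(sharpest_plane(np.stack([flat, sharp, flat])))
```

The returned index is 1-based to match the convention that z-plane ids in this pipeline start from 1.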
Please use the HuBMAP Consortium fork of cwltool
to be able to run the pipeline with GPU in Docker and Singularity containers.
For the list of Python packages, check environment.yml.
cwltool pipeline.cwl subm.yaml
If you use Singularity containers, add --singularity. An example submission file, subm.yaml, is provided in the repo.
codex_dataset/
src_data OR raw
├── channelnames.txt
├── channelnames_report.csv
├── experiment.json
├── exposure_times.txt
├── segmentation.json
├── Cyc1_reg1 OR Cyc001_reg001
│ ├── 1_00001_Z001_CH1.tif
│ ├── 1_00001_Z001_CH2.tif
│ │ ...
│ └── 1_0000N_Z00N_CHN.tif
└── Cyc1_reg2 OR Cyc001_reg002
  ├── 2_00001_Z001_CH1.tif
  ├── 2_00001_Z001_CH2.tif
  │ ...
  └── 2_0000N_Z00N_CHN.tif
Images should be separated into directories by cycles and regions using the following pattern Cyc{cycle:d}_reg{region:d}.
The file names must contain region, tile, z-plane and channel ids starting from 1, and follow this pattern
{region:d}_{tile:05d}_Z{zplane:03d}_CH{channel:d}.tif.
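The two naming patterns above can be checked with regular expressions. The following is a minimal sketch, not code from this repository: `TileId` and `parse_tile_path` are hypothetical names, and the check that the region in the file name matches the region in the directory name is an assumption based on the example layout.

```python
import re
from typing import NamedTuple

# Cyc{cycle:d}_reg{region:d}; also accepts zero-padded forms such as Cyc001_reg001.
DIR_RE = re.compile(r"^Cyc(?P<cycle>\d+)_reg(?P<region>\d+)$")
# {region:d}_{tile:05d}_Z{zplane:03d}_CH{channel:d}.tif
FILE_RE = re.compile(
    r"^(?P<region>\d+)_(?P<tile>\d{5,})_Z(?P<zplane>\d{3,})_CH(?P<channel>\d+)\.tif$"
)


class TileId(NamedTuple):
    """Hypothetical container for the ids encoded in one image path."""
    cycle: int
    region: int
    tile: int
    zplane: int
    channel: int


def parse_tile_path(dir_name: str, file_name: str) -> TileId:
    """Extract cycle, region, tile, z-plane and channel ids from the names."""
    d = DIR_RE.match(dir_name)
    f = FILE_RE.match(file_name)
    if d is None:
        raise ValueError(f"unexpected directory name: {dir_name!r}")
    if f is None:
        raise ValueError(f"unexpected file name: {file_name!r}")
    # Assumption: the file's region id must agree with its directory.
    if int(d["region"]) != int(f["region"]):
        raise ValueError(f"region mismatch: {dir_name!r} vs {file_name!r}")
    ids = TileId(
        cycle=int(d["cycle"]),
        region=int(d["region"]),
        tile=int(f["tile"]),
        zplane=int(f["zplane"]),
        channel=int(f["channel"]),
    )
    # All ids start from 1, so zero is never valid.
    if min(ids) < 1:
        raise ValueError(f"ids must start from 1: {ids}")
    return ids


if __name__ == "__main__":
    print(parse_tile_path("Cyc1_reg1", "1_00001_Z001_CH2.tif"))
```

Running such a check over the input directory before submitting a job can catch misnamed files early, before any processing starts.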
Necessary metadata files that must be present in the input directory:
- experiment.json - acquisition parameters and data structure;
- segmentation.json - which channel from which cycle to use for segmentation;
- channelnames.txt - list of channel names, one per row;
- channelnames_report.csv - which channels to use, and which to exclude;
- exposure_times.txt - not used at the moment, but will be useful for background subtraction.
Examples of these files are present in the directory metadata_examples.
Note: all fields related to regions, cycles, channels, z-planes and tiles start from 1,
and xyResolution, zPitch are measured in nm.
pipeline_output/
├── expr
│ ├── reg001_expr.ome.tiff
│ └── reg002_expr.ome.tiff
└── mask
  ├── reg001_mask.ome.tiff
  └── reg002_mask.ome.tiff
The expr directory contains processed images, and the mask directory contains segmentation masks.
The output of SPRM will be different, see https://github.com/hubmapconsortium/sprm .
Code in this repository is formatted with black and isort, and this is checked via Travis CI.
A pre-commit hook configuration is provided, which runs black and isort before committing.
Run pre-commit install in each clone of this repository which you will use for development (after pip install pre-commit
into an appropriate Python environment, if necessary).
Two Dockerfiles are included in this repository. A docker_images.txt manifest is included, which is intended
for use in the build_docker_containers script provided by the
multi-docker-build Python package. This package can be installed
with
python -m pip install multi-docker-build
The master branch is intended to be production-ready at all times, and should always reference Docker containers
with the latest tag.
Publication of tagged "release" versions of the pipeline is handled with the
HuBMAP pipeline release management Python package. To
release a new pipeline version, ensure that the master branch contains all commits that you want to include in the release,
then run
tag_release_pipeline v0.whatever
See the pipeline release management script usage notes for additional options, such as GPG signing.