Merged
Changes from 19 commits
8 changes: 5 additions & 3 deletions CHANGELOG.md
@@ -9,14 +9,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

- [#117](https://github.com/nf-core/createtaxdb/pull/117) - Updated to nf-core/tools template 3.5.1 (by @jfy133)
- [#143](https://github.com/nf-core/createtaxdb/pull/143) - Documented how to resolve the KrakenUniq unbound variable jellyfish issue (❤️ to @flass for suggesting, added by @jfy133)
- [#140](https://github.com/nf-core/createtaxdb/pull/140) - Added MetaCache database building support (❤️ to @ChillarAnand for suggestion, added by @alxndrdiaz and @jfy133)

### `Fixed`

### `Dependencies`

| Tool      | Old Version | New Version |
| --------- | ----------- | ----------- |
| nf-core   | 3.4.1       | 3.5.1       |
| MetaCache |             | 2.5.0       |

### `Deprecated`

4 changes: 4 additions & 0 deletions CITATIONS.md
@@ -46,6 +46,10 @@

> Vågene, Å. J., Herbig, A., Campana, M. G., Robles García, N. M., Warinner, C., Sabin, S., Spyrou, M. A., Andrades Valtueña, A., Huson, D., Tuross, N., Bos, K. I., & Krause, J. (2018). Salmonella enterica genomes from victims of a major sixteenth-century epidemic in Mexico. Nature Ecology & Evolution, 2(3), 520–528. https://doi.org/10.1038/s41559-017-0446-6

- [MetaCache](https://doi.org/10.1093/bioinformatics/btx520)

> Müller, A., Hundt, C., Hildebrandt, A., Hankeln, T., & Schmidt, B. (2017). MetaCache: context-aware classification of metagenomic reads using minhashing. Bioinformatics (Oxford, England), 33(23), 3740–3748. https://doi.org/10.1093/bioinformatics/btx520

- [MultiQC](https://pubmed.ncbi.nlm.nih.gov/27312411/)

> Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.
2 changes: 2 additions & 0 deletions README.md
@@ -45,6 +45,7 @@ The pipeline is designed to be a companion pipeline to [nf-core/taxprofiler](htt
- [MALT](https://doi.org/10.1038/s41559-017-0446-6)
- [sourmash](https://doi.org/10.21105/joss.06830)
- [sylph](https://doi.org/10.1038/s41587-024-02412-y)
- [MetaCache](https://doi.org/10.1093/bioinformatics/btx520)

## Usage

@@ -81,6 +82,7 @@ nextflow run nf-core/createtaxdb \
--ganon_build_options='--kmer-size 45' \
--build_diamond \
--diamond_build_options='--no-parse-seqids' \
--build_metacache \
--outdir <OUTDIR>
```

Binary file modified assets/createtaxdb-metromap-diagram-dark.png
201 changes: 131 additions & 70 deletions assets/createtaxdb-metromap-diagram-dark.svg
Binary file modified assets/createtaxdb-metromap-diagram-light.png
201 changes: 131 additions & 70 deletions assets/createtaxdb-metromap-diagram-light.svg
5 changes: 5 additions & 0 deletions conf/modules.config
@@ -161,4 +161,9 @@ process {
ext.prefix = { "${meta.id}-sylph" }
ext.args = { "${params.sylph_build_options}" }
}

withName: METACACHE_BUILD {
ext.prefix = { "${meta.id}-metacache" }
ext.args = { "${params.metacache_build_options}" }
}
}
1 change: 1 addition & 0 deletions conf/test.config
@@ -40,6 +40,7 @@ params {
build_sourmash_dna = true
build_sourmash_protein = true
build_sylph = true
build_metacache = true

unzip_batch_size = 1

1 change: 1 addition & 0 deletions conf/test_alternatives.config
@@ -57,6 +57,7 @@ params {
malt_mapdb_format = 'mdb'

build_sylph = false
build_metacache = false

// General output options
generate_downstream_samplesheets = true
1 change: 1 addition & 0 deletions conf/test_full.config
@@ -41,6 +41,7 @@ params {
build_sourmash_protein = true
unzip_batch_size = 50
build_sylph = true
build_metacache = true

// General output options
generate_downstream_samplesheets = true
1 change: 1 addition & 0 deletions conf/test_minimal.config
@@ -32,5 +32,6 @@ params {
build_sourmash_dna = false
build_sourmash_protein = false
build_sylph = false
build_metacache = false
generate_downstream_samplesheets = false
}
16 changes: 16 additions & 0 deletions docs/output.md
@@ -23,6 +23,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
- [MALT](#malt) - Database files for MALT
- [sourmash](#sourmash) - Database files for sourmash
- [sylph](#sylph) - Database files for sylph
- [MetaCache](#metacache) - Database files for MetaCache

The pipeline can also generate downstream pipeline input samplesheets.
These are stored in `<outdir>/downstream_samplesheets`.
@@ -237,6 +238,21 @@ and the k-mer size for which the index was created.

The `<your_database>-sylph.syldb` file can be passed directly to sylph, e.g. `sylph profile <your_database>-sylph.syldb <...>`.

### metacache

[MetaCache](https://github.com/muellan/metacache) is a classification system for mapping genomic sequences (short reads, long reads, contigs, ...) from metagenomic samples to their most likely taxon of origin. It uses locality-sensitive hashing to quickly identify candidate regions within one or more reference genomes.

<details markdown="1">
<summary>Output files</summary>

- `metacache/`
  - `<your_database>.meta`: sequence signature database binary file
  - `<your_database>.cache0`: sequence signature database binary file

</details>

The `metacache/<your_database>.meta` file can be passed directly to MetaCache, e.g. `metacache query metacache/<your_database>.meta <...>`.
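
As a minimal sketch of a sanity check before querying, the hypothetical helper below (not part of the pipeline or of MetaCache) tests that both database files listed above are present. It assumes the two files share a single prefix, as in the output listing.

```shell
# Hypothetical helper (not part of the pipeline or of MetaCache):
# succeeds only if both database files exist for the given prefix.
# Assumes <prefix>.meta and <prefix>.cache0 sit side by side.
metacache_db_complete() {
    [ -f "$1.meta" ] && [ -f "$1.cache0" ]
}
```

For example, `metacache_db_complete metacache/<your_database> && metacache query metacache/<your_database>.meta <...>` would only start the query when both files are found.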

### Downstream samplesheets

The pipeline can also generate input files for the following downstream
12 changes: 7 additions & 5 deletions docs/usage/dev.md
@@ -12,10 +12,12 @@ Does not have to be in this precise order
- [ ] Added `--<profiler>_build_params`
- [ ] Added other profiler-specific parameters (e.g. additional taxonomy files)
- [ ] Format with VSCode Nextflow extension
- [ ] Update the `subworkflows/local/preprocessing/main.nf`
- [ ] Update the subworkflow's if/else (concatenation) sections for either DNA or AA FASTA preprocessing
- [ ] Format with VSCode Nextflow extension
- [ ] Added tool(s) to `workflow/createtaxdb.nf`
- [ ] Added relevant new input files to the `take:` block, and passed them from `main.nf` into the `NFCORE_CREATETAXDB` workflow
- [ ] Added relevant modules/subworkflows at the top using `include` statement
- [ ] Added the tool-specific if/else statement in the main `createtaxdb.nf`
- [ ] Version and MultiQC (if available) channels mixed
- [ ] Include output channel in workflow `emit` statement
@@ -27,12 +29,12 @@ Does not have to be in this precise order
- [ ] Format with VSCode Nextflow extension
- [ ] If necessary, added any profiler-specific parameter validation checks to `utils_nfcore_createtaxdb_pipeline` and possibly at the top of `createtaxdb.nf`
- [ ] Update tests
- [ ] Include the tool in the `test_minimal.config` (as false), `test.config` and `test_full.config` (as true), and `test_alternatives.config`, as required.
- [ ] Run a mini test of `test_minimal` to make sure it executes when it is the sole tool
- [ ] Format these files with VSCode Nextflow extension
- [ ] Include the output object in the `tests/test.nf.test` file
- [ ] Re-run nf-test to update snapshot: `nf-test test --tag test --profile +docker --update-snapshot` (tip: for assertions, borrow from the modules' assertions!)
- [ ] Update Documentation
- [ ] `nf-core pipelines schema build` has been run and updated
- [ ] All additional tool-specific pipeline parameters have an additional help entry with the `Modifies tool parameter(s)` quote block
- [ ] Added citation to `CITATIONS.md` (citation style: APA 7th edition)
5 changes: 5 additions & 0 deletions docs/usage/faq.md
@@ -36,6 +36,10 @@ We provide a list of required or recommended files, and which pipeline parameter
- a MEGAN 'mapDB' mapping file (`--malt_mapdb`)
- sourmash (no additional files required)
- sylph (no additional files required)
- metacache
- taxonomy name dump file (`--namesdmp`)
- taxonomy nodes dump file (`--nodesdmp`)
- custom seqid2taxid file (`--nucl2taxid`)

\* _will be automatically downloaded if not supplied. You must supply this to the pipeline if on an offline cluster._
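
On an offline cluster it may help to confirm that every file you intend to pass actually exists before launching the pipeline. A minimal sketch follows; `check_inputs` is a hypothetical helper, and the file names in the usage note are placeholders rather than names taken from the pipeline.

```shell
# Hypothetical pre-flight check (not part of the pipeline):
# reports each missing file on stderr and returns non-zero if any is missing.
check_inputs() {
    missing=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

`check_inputs names.dmp nodes.dmp nucl2taxid.map` could then be run before supplying the same paths to `--namesdmp`, `--nodesdmp` and `--nucl2taxid`.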

@@ -401,6 +405,7 @@ tar czvf kmcp-krakenuniq.tar.gz krakenuniq/database-kmcp-index/
tar czvf <dbname>-krakenuniq.tar.gz krakenuniq/<dbname>-krakenuniq/
tar czvf <dbname>-ganon.tar.gz ganon/
tar czvf <dbname>-malt.tar.gz malt/malt_index/
tar czvf <dbname>-metacache.tar.gz metacache/
```
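
Before uploading an archive it can be worth listing its contents to confirm the database files were packed. A minimal sketch, using a hypothetical `archive_and_list` helper and only the standard `tar` options `c` (create), `t` (list), `z` (gzip) and `f` (archive file):

```shell
# Hypothetical helper (not part of the pipeline): create a gzipped archive
# of a directory, then print the paths stored in it, one per line.
# Usage: archive_and_list <archive.tar.gz> <directory>
archive_and_list() {
    tar czf "$1" "$2" && tar tzf "$1"
}
```

For instance, `archive_and_list <dbname>-metacache.tar.gz metacache/` should list the `.meta` and `.cache0` files if the MetaCache build produced them.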

## I get an error about `ConcurrentModificationException`
5 changes: 5 additions & 0 deletions modules.json
@@ -71,6 +71,11 @@
"git_sha": "41dfa3f7c0ffabb96a6a813fe321c6d1cc5b6e46",
"installed_by": ["modules"]
},
"metacache/build": {
"branch": "master",
"git_sha": "e753770db613ce014b3c4bc94f6cba443427b726",
"installed_by": ["modules"]
},
"multiqc": {
"branch": "master",
"git_sha": "af27af1be706e6a2bb8fe454175b0cdf77f47b49",
7 changes: 7 additions & 0 deletions modules/nf-core/metacache/build/environment.yml

62 changes: 62 additions & 0 deletions modules/nf-core/metacache/build/main.nf

92 changes: 92 additions & 0 deletions modules/nf-core/metacache/build/meta.yml
