Dev => Master 2.3.0 #185

edmundmiller · 2024-12-23T01:47:47Z

~~Waiting for #184~~

PR checklist

github-actions · 2024-12-23T01:49:07Z

`nf-core pipelines lint` overall result: Passed ✅ ⚠️

Posted for pipeline commit 7d9ce8a

✅ 223 tests passed
❔ 4 tests were ignored
❗ 4 tests had warnings

❗ Test warnings:

pipeline_todos - TODO string in ro-crate-metadata.json: the "description" field embeds the full README.md text, including the template TODO "Describe the minimum required steps to execute the pipeline, e.g. how to prepare samplesheets."
pipeline_todos - TODO string in README.md: Describe the minimum required steps to execute the pipeline, e.g. how to prepare samplesheets.
pipeline_todos - TODO string in base.config: Check the defaults for all processes
pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline

❔ Tests ignored:

files_unchanged - File ignored due to lint config: LICENSE or LICENSE.md or LICENCE or LICENCE.md
files_unchanged - File ignored due to lint config: .github/workflows/linting.yml
files_unchanged - File ignored due to lint config: assets/email_template.html
actions_ci - actions_ci

✅ Tests passed:

files_exist - File found: .gitattributes
files_exist - File found: .gitignore
files_exist - File found: .nf-core.yml
files_exist - File found: .editorconfig
files_exist - File found: .prettierignore
files_exist - File found: .prettierrc.yml
files_exist - File found: CHANGELOG.md
files_exist - File found: CITATIONS.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: LICENSE or LICENSE.md or LICENCE or LICENCE.md
files_exist - File found: nextflow_schema.json
files_exist - File found: nextflow.config
files_exist - File found: README.md
files_exist - File found: .github/.dockstore.yml
files_exist - File found: .github/CONTRIBUTING.md
files_exist - File found: .github/ISSUE_TEMPLATE/bug_report.yml
files_exist - File found: .github/ISSUE_TEMPLATE/config.yml
files_exist - File found: .github/ISSUE_TEMPLATE/feature_request.yml
files_exist - File found: .github/PULL_REQUEST_TEMPLATE.md
files_exist - File found: .github/workflows/branch.yml
files_exist - File found: .github/workflows/ci.yml
files_exist - File found: .github/workflows/linting_comment.yml
files_exist - File found: .github/workflows/linting.yml
files_exist - File found: assets/email_template.html
files_exist - File found: assets/email_template.txt
files_exist - File found: assets/sendmail_template.txt
files_exist - File found: assets/nf-core-nascent_logo_light.png
files_exist - File found: conf/modules.config
files_exist - File found: conf/test.config
files_exist - File found: conf/test_full.config
files_exist - File found: docs/images/nf-core-nascent_logo_light.png
files_exist - File found: docs/images/nf-core-nascent_logo_dark.png
files_exist - File found: docs/output.md
files_exist - File found: docs/README.md
files_exist - File found: docs/README.md
files_exist - File found: docs/usage.md
files_exist - File found: main.nf
files_exist - File found: assets/multiqc_config.yml
files_exist - File found: conf/base.config
files_exist - File found: conf/igenomes.config
files_exist - File found: conf/igenomes_ignored.config
files_exist - File found: .github/workflows/awstest.yml
files_exist - File found: .github/workflows/awsfulltest.yml
files_exist - File found: modules.json
files_exist - File found: ro-crate-metadata.json
files_exist - File not found check: .github/ISSUE_TEMPLATE/bug_report.md
files_exist - File not found check: .github/ISSUE_TEMPLATE/feature_request.md
files_exist - File not found check: .github/workflows/push_dockerhub.yml
files_exist - File not found check: .markdownlint.yml
files_exist - File not found check: .nf-core.yaml
files_exist - File not found check: .yamllint.yml
files_exist - File not found check: bin/markdown_to_html.r
files_exist - File not found check: conf/aws.config
files_exist - File not found check: docs/images/nf-core-nascent_logo.png
files_exist - File not found check: lib/Checks.groovy
files_exist - File not found check: lib/Completion.groovy
files_exist - File not found check: lib/NfcoreTemplate.groovy
files_exist - File not found check: lib/Utils.groovy
files_exist - File not found check: lib/Workflow.groovy
files_exist - File not found check: lib/WorkflowMain.groovy
files_exist - File not found check: lib/WorkflowNascent.groovy
files_exist - File not found check: parameters.settings.json
files_exist - File not found check: pipeline_template.yml
files_exist - File not found check: Singularity
files_exist - File not found check: lib/nfcore_external_java_deps.jar
files_exist - File not found check: .travis.yml
nextflow_config - Found nf-schema plugin
nextflow_config - Config variable found: manifest.name
nextflow_config - Config variable found: manifest.nextflowVersion
nextflow_config - Config variable found: manifest.description
nextflow_config - Config variable found: manifest.version
nextflow_config - Config variable found: manifest.homePage
nextflow_config - Config variable found: timeline.enabled
nextflow_config - Config variable found: trace.enabled
nextflow_config - Config variable found: report.enabled
nextflow_config - Config variable found: dag.enabled
nextflow_config - Config variable found: process.cpus
nextflow_config - Config variable found: process.memory
nextflow_config - Config variable found: process.time
nextflow_config - Config variable found: params.outdir
nextflow_config - Config variable found: params.input
nextflow_config - Config variable found: validation.help.enabled
nextflow_config - Config variable found: manifest.mainScript
nextflow_config - Config variable found: timeline.file
nextflow_config - Config variable found: trace.file
nextflow_config - Config variable found: report.file
nextflow_config - Config variable found: dag.file
nextflow_config - Config variable found: validation.help.beforeText
nextflow_config - Config variable found: validation.help.afterText
nextflow_config - Config variable found: validation.help.command
nextflow_config - Config variable found: validation.summary.beforeText
nextflow_config - Config variable found: validation.summary.afterText
nextflow_config - Config variable (correctly) not found: params.nf_required_version
nextflow_config - Config variable (correctly) not found: params.container
nextflow_config - Config variable (correctly) not found: params.singleEnd
nextflow_config - Config variable (correctly) not found: params.igenomesIgnore
nextflow_config - Config variable (correctly) not found: params.name
nextflow_config - Config variable (correctly) not found: params.enable_conda
nextflow_config - Config variable (correctly) not found: params.max_cpus
nextflow_config - Config variable (correctly) not found: params.max_memory
nextflow_config - Config variable (correctly) not found: params.max_time
nextflow_config - Config variable (correctly) not found: params.validationFailUnrecognisedParams
nextflow_config - Config variable (correctly) not found: params.validationLenientMode
nextflow_config - Config variable (correctly) not found: params.validationSchemaIgnoreParams
nextflow_config - Config variable (correctly) not found: params.validationShowHiddenParams
nextflow_config - Config timeline.enabled had correct value: true
nextflow_config - Config report.enabled had correct value: true
nextflow_config - Config trace.enabled had correct value: true
nextflow_config - Config dag.enabled had correct value: true
nextflow_config - Config manifest.name began with nf-core/
nextflow_config - Config variable manifest.homePage began with https://github.com/nf-core/
nextflow_config - Config dag.file ended with .html
nextflow_config - Config variable manifest.nextflowVersion started with >= or !>=
nextflow_config - Config manifest.version does not contain dev for release: 2.3.0
nextflow_config - Config params.custom_config_version is set to master
nextflow_config - Config params.custom_config_base is set to https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Lines for loading custom profiles found
nextflow_config - nextflow.config contains configuration profile test
nextflow_config - Config default value correct: params.aligner= bwa
nextflow_config - Config default value correct: params.use_homer_uniqmap= false
nextflow_config - Config default value correct: params.skip_grohmm= false
nextflow_config - Config default value correct: params.grohmm_min_uts= 5
nextflow_config - Config default value correct: params.grohmm_max_uts= 45
nextflow_config - Config default value correct: params.grohmm_min_ltprobb= -100
nextflow_config - Config default value correct: params.grohmm_max_ltprobb= -400
nextflow_config - Config default value correct: params.igenomes_base= s3://ngi-igenomes/igenomes/
nextflow_config - Config default value correct: params.human_pangenomics_base= https://s3-us-west-2.amazonaws.com/human-pangenomics
nextflow_config - Config default value correct: params.custom_config_version= master
nextflow_config - Config default value correct: params.custom_config_base= https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Config default value correct: params.publish_dir_mode= copy
nextflow_config - Config default value correct: params.max_multiqc_email_size= 25.MB
nextflow_config - Config default value correct: params.validate_params= true
nextflow_config - Config default value correct: params.pipelines_testdata_base_path= https://raw.githubusercontent.com/nf-core/test-datasets/
files_unchanged - .gitattributes matches the template
files_unchanged - .prettierrc.yml matches the template
files_unchanged - CODE_OF_CONDUCT.md matches the template
files_unchanged - .github/.dockstore.yml matches the template
files_unchanged - .github/CONTRIBUTING.md matches the template
files_unchanged - .github/ISSUE_TEMPLATE/bug_report.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/config.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/feature_request.yml matches the template
files_unchanged - .github/PULL_REQUEST_TEMPLATE.md matches the template
files_unchanged - .github/workflows/branch.yml matches the template
files_unchanged - .github/workflows/linting_comment.yml matches the template
files_unchanged - assets/email_template.txt matches the template
files_unchanged - assets/sendmail_template.txt matches the template
files_unchanged - assets/nf-core-nascent_logo_light.png matches the template
files_unchanged - docs/images/nf-core-nascent_logo_light.png matches the template
files_unchanged - docs/images/nf-core-nascent_logo_dark.png matches the template
files_unchanged - docs/README.md matches the template
files_unchanged - .gitignore matches the template
files_unchanged - .prettierignore matches the template
actions_awstest - '.github/workflows/awstest.yml' is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml does not use -profile test
readme - README Nextflow minimum version badge matched config. Badge: 24.04.2, Config: 24.04.2
readme - README Zenodo placeholder was replaced with DOI.
plugin_includes - No wrong validation plugin imports have been found
pipeline_name_conventions - Name adheres to nf-core convention
template_strings - Did not find any Jinja template strings (0 files)
schema_lint - Schema lint passed
schema_lint - Schema title + description lint passed
schema_lint - Input mimetype lint passed: 'text/csv'
schema_params - Schema matched params returned from nextflow config
system_exit - No System.exit calls found
actions_schema_validation - Workflow validation passed: linting.yml
actions_schema_validation - Workflow validation passed: branch.yml
actions_schema_validation - Workflow validation passed: fix-linting.yml
actions_schema_validation - Workflow validation passed: release-announcements.yml
actions_schema_validation - Workflow validation passed: awsfulltest.yml
actions_schema_validation - Workflow validation passed: template_version_comment.yml
actions_schema_validation - Workflow validation passed: download_pipeline.yml
actions_schema_validation - Workflow validation passed: ci.yml
actions_schema_validation - Workflow validation passed: clean-up.yml
actions_schema_validation - Workflow validation passed: awstest.yml
actions_schema_validation - Workflow validation passed: linting_comment.yml
merge_markers - No merge markers found in pipeline files
modules_json - Only installed modules found in modules.json
multiqc_config - assets/multiqc_config.yml found and not ignored.
multiqc_config - assets/multiqc_config.yml contains report_section_order
multiqc_config - assets/multiqc_config.yml contains export_plots
multiqc_config - assets/multiqc_config.yml contains report_comment
multiqc_config - assets/multiqc_config.yml follows the ordering scheme of the minimally required plugins.
multiqc_config - assets/multiqc_config.yml contains a matching 'report_comment'.
multiqc_config - assets/multiqc_config.yml contains 'export_plots: true'.
modules_structure - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL'
base_config - conf/base.config found and not ignored.
modules_config - conf/modules.config found and not ignored.
modules_config - FASTQC found in conf/modules.config and Nextflow scripts.
modules_config - MULTIQC found in conf/modules.config and Nextflow scripts.
modules_config - GFFREAD found in conf/modules.config and Nextflow scripts.
modules_config - FASTP found in conf/modules.config and Nextflow scripts.
modules_config - BWA_MEM found in conf/modules.config and Nextflow scripts.
modules_config - BWAMEM2_MEM found in conf/modules.config and Nextflow scripts.
modules_config - BOWTIE2_ALIGN found in conf/modules.config and Nextflow scripts.
modules_config - DRAGMAP_ALIGN found in conf/modules.config and Nextflow scripts.
modules_config - HISAT2_ALIGN found in conf/modules.config and Nextflow scripts.
modules_config - STAR_ALIGN found in conf/modules.config and Nextflow scripts.
modules_config - PRESEQ_ found in conf/modules.config and Nextflow scripts.
modules_config - BBMAP_PILEUP found in conf/modules.config and Nextflow scripts.
modules_config - BEDTOOLS_GENOMECOV_PLUS found in conf/modules.config and Nextflow scripts.
modules_config - BEDTOOLS_GENOMECOV_MINUS found in conf/modules.config and Nextflow scripts.
modules_config - DEEPTOOLS_BAMCOVERAGE_PLUS found in conf/modules.config and Nextflow scripts.
modules_config - DEEPTOOLS_BAMCOVERAGE_MINUS found in conf/modules.config and Nextflow scripts.
modules_config - DREG_PREP found in conf/modules.config and Nextflow scripts.
modules_config - HOMER_ found in conf/modules.config and Nextflow scripts.
modules_config - HOMER_FINDPEAKS found in conf/modules.config and Nextflow scripts.
modules_config - HOMER_MAKETAGDIRECTORY found in conf/modules.config and Nextflow scripts.
modules_config - HOMER_MAKEUCSCFILE found in conf/modules.config and Nextflow scripts.
modules_config - PINTS_CALLER found in conf/modules.config and Nextflow scripts.
modules_config - GROHMM_PARAMETERTUNING found in conf/modules.config and Nextflow scripts.
modules_config - GROHMM_TRANSCRIPTCALLING found in conf/modules.config and Nextflow scripts.
modules_config - BEDTOOLS_SORT found in conf/modules.config and Nextflow scripts.
modules_config - BEDTOOLS_MERGE found in conf/modules.config and Nextflow scripts.
modules_config - BEDTOOLS_INTERSECT_FILTER found in conf/modules.config and Nextflow scripts.
modules_config - BEDTOOLS_INTERSECT found in conf/modules.config and Nextflow scripts.
modules_config - SUBREAD_FEATURECOUNTS_GENE found in conf/modules.config and Nextflow scripts.
modules_config - BED2SAF found in conf/modules.config and Nextflow scripts.
modules_config - SUBREAD_FEATURECOUNTS_PREDICTED found in conf/modules.config and Nextflow scripts.
nfcore_yml - Repository type in .nf-core.yml is valid: pipeline
nfcore_yml - nf-core version in .nf-core.yml is set to the latest version: 3.1.1
version_consistency - Version tags are numeric and consistent between container, release tag and config.
included_configs - Pipeline config includes custom configs.

Run details

nf-core/tools version 3.1.1
Run at 2024-12-29 04:09:25

Bump version for 2.3.0 release

sateeshperi

Those local modules and subworkflows will need nf-test next.

sateeshperi · 2025-01-08T07:56:12Z

docs/usage.md

+
+When running the pipeline with groHMM as a transcript identification method, the pipeline will automatically perform a parameter tuning process. This process is unique to the groHMM transcript identification method and is designed to select the optimal hold-out parameters for the groHMM algorithm. See [this issue](https://github.com/dankoc/groHMM/issues/4) for more information.
+
+In the groHMM vignette, the code is ran using a single mclapply call, which is a scatter gather approach. This is not ideal for large datasets, because it ends up being bottlenecked by the memory available on your local machine. To improve this, we have written a Nextflow script that runs the pipeline with a scatter gather approach. This is done by running the pipeline with a single hold-out parameter, and then the next parameter, and so on. This is more memory efficient and scales better to larger datasets. The results are then combined then combined in the end as intended and used in the transcript identification process.


Suggested change

In the groHMM vignette, the code is ran using a single mclapply call, which is a scatter gather approach. This is not ideal for large datasets, because it ends up being bottlenecked by the memory available on your local machine. To improve this, we have written a Nextflow script that runs the pipeline with a scatter gather approach. This is done by running the pipeline with a single hold-out parameter, and then the next parameter, and so on. This is more memory efficient and scales better to larger datasets. The results are then combined then combined in the end as intended and used in the transcript identification process.

In the groHMM vignette, the code is ran using a single mclapply call, which is a scatter gather approach. This is not ideal for large datasets, because it ends up being bottle-necked by the memory available on your local machine. To improve this, we have written a Nextflow script that runs the pipeline with a scatter gather approach. This is done by running the pipeline with a single hold-out parameter, and then the next parameter, and so on. This is more memory efficient and scales better to larger datasets. The results are then combined in the end as intended and used in the transcript identification process.
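
For anyone following this thread, the scatter-gather idea described in that paragraph can be sketched roughly as below. This is a Python illustration only, not the pipeline's Nextflow or R code. `evaluate_holdout` is a hypothetical stand-in for one groHMM run and returns a made-up score, and the step sizes are assumptions; only the range endpoints (5 to 45 and -100 to -400) are taken from the `nextflow.config` defaults in this PR.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def parameter_grid(min_uts, max_uts, min_ltprobb, max_ltprobb,
                   uts_step=5, ltprobb_step=50):
    """Build every (uts, ltprobb) pair to evaluate. Step sizes are assumptions."""
    uts_values = range(min_uts, max_uts + 1, uts_step)
    # ltprobb is negative and "min"/"max" are read by magnitude, so step downward
    ltprobb_values = range(min_ltprobb, max_ltprobb - 1, -ltprobb_step)
    return list(product(uts_values, ltprobb_values))


def evaluate_holdout(uts, ltprobb):
    """Hypothetical stand-in for one groHMM run; the error score is invented."""
    error = abs(uts - 20) + abs(ltprobb + 200) / 10
    return {"uts": uts, "ltprobb": ltprobb, "error": error}


def tune(grid, max_workers=4):
    """Scatter one task per parameter pair, then gather and keep the best."""
    # Scatter: every pair is an independent task
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda pair: evaluate_holdout(*pair), grid))
    # Gather: combine the per-pair results and pick the lowest error
    return min(results, key=lambda r: r["error"])


if __name__ == "__main__":
    grid = parameter_grid(5, 45, -100, -400)
    print(len(grid), "parameter pairs")
    print(tune(grid))
```

In the pipeline each pair would be its own Nextflow task rather than a thread, which is what lets the scheduler cap memory per task instead of per `mclapply` call.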

sateeshperi · 2025-01-08T07:58:09Z

docs/usage.md

+- Mouse: mm10
+- Fly: dm6
+
+**This setting is off by default**


Suggested change

**This setting is off by default**

:::info

**This setting is off by default**

:::

sateeshperi · 2025-01-08T08:01:16Z

main.nf

+params.fasta = getGenomeAttribute('fasta')
+params.gtf = getGenomeAttribute('gtf')
+params.gff = getGenomeAttribute('gff')
+params.gene_bed = getGenomeAttribute('bed12')
+params.bwa_index = getGenomeAttribute('bwa')
+params.bwamem2_index = getGenomeAttribute('bwamem2')
+params.dragmap = getGenomeAttribute('dragmap')
+params.bowtie2_index = getGenomeAttribute('bowtie2')
+params.hisat2_index = getGenomeAttribute('hisat2')
+params.star_index = null
+params.homer_uniqmap = getGenomeAttribute('uniqmap')


Suggested change

params.fasta = getGenomeAttribute('fasta')
params.gtf = getGenomeAttribute('gtf')
params.gff = getGenomeAttribute('gff')
params.gene_bed = getGenomeAttribute('bed12')
params.bwa_index = getGenomeAttribute('bwa')
params.bwamem2_index = getGenomeAttribute('bwamem2')
params.dragmap = getGenomeAttribute('dragmap')
params.bowtie2_index = getGenomeAttribute('bowtie2')
params.hisat2_index = getGenomeAttribute('hisat2')
params.star_index = null
params.homer_uniqmap = getGenomeAttribute('uniqmap')

params.fasta = getGenomeAttribute('fasta')
params.gtf = getGenomeAttribute('gtf')
params.gff = getGenomeAttribute('gff')
params.gene_bed = getGenomeAttribute('bed12')
params.bwa_index = getGenomeAttribute('bwa')
params.bwamem2_index = getGenomeAttribute('bwamem2')
params.dragmap = getGenomeAttribute('dragmap')
params.bowtie2_index = getGenomeAttribute('bowtie2')
params.hisat2_index = getGenomeAttribute('hisat2')
params.star_index = null
params.homer_uniqmap = getGenomeAttribute('uniqmap')

sateeshperi · 2025-01-08T08:03:29Z

nextflow.config

-    tuning_file                = null
+    grohmm_min_uts             = 5
+    grohmm_max_uts             = 45
+    // Depends on how you look at this one... But I figured most will ignore the negative


Suggested change

// Depends on how you look at this one... But I figured most will ignore the negative

// Depends on how you look at this one... But I figured most will ignore the negative

Could you clarify what you mean here?

vagkaratzas

Nice! The regular test runs smoothly in under 10 minutes and is easy to follow. I left a couple of comments after a first pass and will come back for more :D

vagkaratzas · 2025-01-10T15:02:40Z

conf/base.config

-    cpus   = { check_max( 1    * task.attempt, 'cpus'   ) }
-    memory = { check_max( 6.GB * task.attempt, 'memory' ) }
-    time   = { check_max( 4.h  * task.attempt, 'time'   ) }
+    // TODO nf-core: Check the defaults for all processes


All local modules seem to have labels, and the test runs fine. You can probably remove this TODO comment.

vagkaratzas · 2025-01-10T15:04:02Z

README.md

@@ -51,7 +53,7 @@ On release, automated continuous integration tests run the pipeline on a full-si
 ## Usage

 > [!NOTE]
-> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data.
+> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow.Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data.

 <!-- TODO nf-core: Describe the minimum required steps to execute the pipeline, e.g. how to prepare samplesheets.


I guess this is left for last. Just a reminder to add a minimal description of the pipeline steps plus a metro map.

vagkaratzas · 2025-01-10T15:06:32Z

modules/local/dreg_prep/environment.yml

+channels:
+  - conda-forge
+  - bioconda
+  - defaults


I think `- defaults` should now be removed from the `channels` list in every module's `environment.yml`.
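
If it helps, here is a rough way to do that in bulk. It is a minimal Python sketch under a few assumptions: the files are simple block-style YAML like the one quoted above, `strip_defaults_in_tree` and the `modules` root are illustrative names rather than anything from nf-core/tools, and the edit is text-based so comments and ordering are preserved.

```python
from pathlib import Path


def strip_defaults_channel(text):
    """Return environment.yml text without a `- defaults` entry under `channels:`."""
    kept = []
    in_channels = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not line[0].isspace() and not stripped.startswith("-"):
            # A new top-level key starts here; note whether it is the channels list
            in_channels = stripped.startswith("channels:")
        if in_channels and stripped == "- defaults":
            continue  # drop only the defaults channel entry
        kept.append(line)
    return "\n".join(kept) + "\n"


def strip_defaults_in_tree(root="modules"):
    """Rewrite every environment.yml under `root` (hypothetical helper)."""
    changed = []
    for path in sorted(Path(root).rglob("environment.yml")):
        original = path.read_text()
        updated = strip_defaults_channel(original)
        if updated != original:
            path.write_text(updated)
            changed.append(path)
    return changed


if __name__ == "__main__":
    for path in strip_defaults_in_tree():
        print("updated", path)
```

Reviewing the result with `git diff` before committing is worthwhile, since files with unusual layouts may not match the simple pattern used here.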

vagkaratzas · 2025-01-10T15:19:27Z

bin/grohmm_parametertuning.R

@@ -0,0 +1,170 @@
+#!/usr/bin/env Rscript


I found this: https://nf-co.re/docs/checklists/reviews/pipeline_release_pr#do-local-code-and-modules
which says that bin scripts should have the author and MIT license embedded. I don't have them in my pipeline either, and I'm not sure how new or mandatory this rule is; just leaving the link and comment here so we can figure it out.

assets/schema_input.json

vagkaratzas · 2025-01-10T15:49:34Z

docs/output.md

@@ -225,7 +244,7 @@ The [Preseq](http://smithlabresearch.org/software/preseq/) package is aimed at p
 <details markdown="1">
 <summary>Output files</summary>

- `bbmap/`
+- `quality_control/bbmap/`
  - `*.coverage.hist.txt`: Histogram of read coverage over each chromosome
  - `*.coverage.stats.txt`: Coverage stats broken down by chromosome including %GC, pos/neg read coverage, total coverage, etc.


Suggested change

- `*.coverage.stats.txt`: Coverage stats broken down by chromosome including %GC, pos/neg read coverage, total coverage, etc.

- `<samplename>.coverage.stats.txt`: Coverage stats broken down by chromosome including %GC, pos/neg read coverage, total coverage, etc.

Similarly, I would change the asterisk to `<samplename>` everywhere else in output.md (when viable).

vagkaratzas · 2025-01-10T16:14:01Z

docs/output.md

@@ -209,9 +228,9 @@ The majority of RSeQC scripts generate output files which can be plotted and sum
 <details markdown="1">
 <summary>Output files</summary>

- `<ALIGNER>/preseq/`
+- `quality_control/preseq/`
  - `*.lc_extrap.txt`: Preseq expected future yield file.


I also see a `<samplename>.c_curve.txt` file.

vagkaratzas · 2025-01-10T16:16:17Z

docs/output.md

@@ -240,7 +259,7 @@ The [Preseq](http://smithlabresearch.org/software/preseq/) package is aimed at p
 <details markdown="1">
 <summary>Output files</summary>

- `bedtools/`
+- `coverage_graphs/`
  - `*.minus.bedGraph`: Sample coverage file (negative strand only) in bedGraph format
  - `*.plus.bedGraph`: Sample coverage file (positive strand only) in bedGraph format


I also see a `<samplename>.dreg.bedGraph` file.

vagkaratzas · 2025-01-10T16:19:08Z

docs/output.md

 ### PINTS

 <details markdown="1">
 <summary>Output files</summary>

- `pints/`
+- `transcript_identification/pints/`
  - `*_bidirectional_peaks.bed`: Bidirectional TREs (divergent + convergent)
  - `*_divergent_peaks.bed`: Divergent TREs


Maybe add (optional) to the two files above, since you don't always receive those (at least not with the test.config).

vagkaratzas · 2025-01-10T16:20:27Z

docs/output.md

@@ -346,7 +371,7 @@ They've also created some bed files that might be useful for analysis.
 <details markdown="1">
 <summary>Output files</summary>

- `<ALIGNER>/featurecounts/`
+- `quantification/featurecounts/`


Under `quantification/`, I got two folders instead: `gene` and `nascent`.

edmundmiller and others added 30 commits March 20, 2024 12:10

build: Give up and use biocontainers

2c99fcc

style: Add \'s

1419c9e

fix(dreg_prep): Use task.cpus

cbcdf6e

refactor(dreg_prep): Use sort instead of sortBed

e6a297d

https://bedtools.readthedocs.io/en/latest/content/tools/sort.html#optional-sorting-behavior

fix(dreg_prep): Make it a process low

b110704

The only thing using more than one thread is samtools

feat: Start DREG_PREP Subworkflow

899ac94

build: nf-core subworkflows install fastq_align_star

43194a8

feat: Add STAR Alignment

7161b14

build: nf-core modules install star/genomegenerate

6d27d26

feat: Add basic STAR index generation

9916891

fix: Copy over config from rnaseq

df6c876

ci: Use detect-nf-test-changes

95187c4

https://github.com/nf-core/phageannotator/blob/f9da24f4beb5775735a28d98327bd92755517a8a/.github/workflows/ci.yml

test: Use testsDir "."

749d703

Not sure how that snuck back in there

test: Write basic STAR test

ca68e9b

fix: Make sure the gtf is unzipped

9940414

docs: Add info about the sample column concatinating group and replic…

3d4db1d

…ate columns

refactor: Pass bais to transcript identification for PINTS

c49e9d7

feat: Bump CHM13 fasta

2aeb303

fix: Update CHM13

63f5074

chore: Update CHANGELOG

4bd6a11

build: Add params.human-pandgenomics_base

c47db7e

Co-authored-by: maxulysse <[email protected]>

Merge pull request #142 from nf-core/STAR

9561941

Add STAR aligner

Template update for nf-core/tools version 2.14.0

25f8d5d

Template update for nf-core/tools version 2.14.1

5522aab

style: Try to fix linting

25d8803

build: Remove extra "groseq" workflow

0280009

chore: nf-core modules update

47ba2f4

fix: Catch up with module updates

64acae7

chore: nf-core modules update subworkflow

ed8510f

ci: Bump detect-nf-test-changes to 0.0.3

45455b6

edmundmiller and others added 15 commits December 21, 2024 07:27

Merge pull request #182 from nf-core/nft-utils

bb5a3a8

Adopt nft-utils

test(#57): Add uniqmap test

5b21392

chore: nf-core subworkflows update homer_groseq

96a9f4d

Merge pull request #177 from nf-core/homer-uniqmap

2d19d95

test(#57): Add uniqmap test

chore: Fix config selectors

c0d2709

fix: Remove Transcriptome quant from STAR

723e873

Merge pull request #173 from nf-core/fix-config-selectors

7425043

Template update for nf-core/tools version 3.1.0

bfcd97c

Template update for nf-core/tools version 3.1.1

7a4e356

chore: Fix pipeline => nascent

ba48eb2

style: First pass with lsp

e6b3ecd

Merge pull request #180 from nf-core/lsp-formatting

ae6518e

Run lsp formatting

chore: Add new contributors section

1bda499

chore: Capture Template updates

0cb4601

chore: Bump version to 2.3.0

d20685a

edmundmiller self-assigned this Dec 23, 2024

edmundmiller added this to the 2.3.0 milestone Dec 23, 2024

Merge pull request #184 from nf-core/2.3.0-release

7d33c65

Bump version for 2.3.0 release

edmundmiller marked this pull request as ready for review December 23, 2024 13:47

chore: Bump deeptools/bamcoverage

9fe1db2

edmundmiller force-pushed the dev branch from 1c21376 to 9fe1db2 Compare December 23, 2024 16:36

edmundmiller added 4 commits December 23, 2024 10:42

style: Load pipeline-specific nf-core configs

a9b12ce

build: Lock down python version in PINTS

83c2c8a

test: Ignore grohmm output

d6afd86

chore: Bump fastq_align_hisat2

7d9ce8a

edmundmiller force-pushed the dev branch from 128d094 to 7d9ce8a Compare December 29, 2024 04:08

sateeshperi approved these changes Jan 8, 2025

vagkaratzas requested changes Jan 10, 2025

edmundmiller mentioned this pull request Jan 13, 2025

Add license to bin scripts #188

Open



		When running the pipeline with groHMM as a transcript identification method, the pipeline will automatically perform a parameter tuning process. This process is unique to the groHMM transcript identification method and is designed to select the optimal hold-out parameters for the groHMM algorithm. See [this issue](https://github.com/dankoc/groHMM/issues/4) for more information.

		In the groHMM vignette, the tuning is run using a single `mclapply` call, which parallelises across the cores of one machine. This is not ideal for large datasets, because it ends up being bottlenecked by the memory available on that machine. To improve this, we have written a Nextflow workflow that uses a scatter-gather approach instead: each task runs groHMM with a single set of hold-out parameters, then the next set, and so on. This is more memory efficient and scales better to larger datasets. The results are then combined at the end, as intended, and used in the transcript identification process.
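As a rough illustration of the scatter-gather idea described above (not the pipeline's actual Nextflow code), here is a minimal Python sketch. The parameter names `lt_prob_b` and `uts`, the grid values, and the toy `score_params` error function are all placeholders chosen for the example; in the pipeline each task would be a separate groHMM run.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def score_params(lt_prob_b, uts):
    """Hypothetical stand-in for one tuning task; lower is better.

    A toy error function is used here so the sketch is runnable.
    It is NOT how groHMM evaluates hold-out parameters.
    """
    return abs(lt_prob_b + 200) + abs(uts - 15)


def scatter_gather(grid, max_workers=4):
    # Scatter: one independent task per parameter combination.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        errors = list(pool.map(lambda p: score_params(*p), grid))
    # Gather: collect every result, then pick the lowest error.
    results = list(zip(grid, errors))
    best = min(results, key=lambda r: r[1])
    return results, best


# Illustrative grid of hold-out parameter combinations (assumed values).
grid = list(product([-100, -200, -300], [5, 15, 25]))
results, best = scatter_gather(grid)
print(best)
```

Because each combination is evaluated in isolation, no single task needs to hold the whole grid in memory, which is the property the paragraph above relies on.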

	// Depends on how you look at this one... But I figured most will ignore the negative

	- `*.coverage.stats.txt`: Coverage stats broken down by chromosome including %GC, pos/neg read coverage, total coverage, etc.
	- `<samplename>.coverage.stats.txt`: Coverage stats broken down by chromosome including %GC, pos/neg read coverage, total coverage, etc.


github-actions bot commented Dec 23, 2024

`nf-core pipelines lint` overall result: Passed ✅ ⚠️