Improve lineage diagram #6052

Merged: pditommaso merged 4 commits into master from lineage-improve-render on May 8, 2025

bentsherman
Member

This PR makes the following improvements to the lineage diagram:

  • Move render code to separate class
  • Separate graph traversal from Mermaid rendering
  • Distinguish task nodes from file nodes
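
The split described in the list above can be illustrated with a minimal Python sketch. It is not the code from this PR (Nextflow itself is not written in Python). The `Node`, `Graph`, `traverse`, and `render_mermaid` names and the `records` dictionary layout are hypothetical, chosen only to show traversal producing a plain graph that a separate function turns into Mermaid text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Node:
    id: str
    label: str
    kind: str  # "file" or "task" (hypothetical tags, not Nextflow types)


@dataclass
class Graph:
    nodes: List[Node] = field(default_factory=list)
    edges: List[Tuple[str, str]] = field(default_factory=list)  # (source id, target id)


def traverse(records: Dict[str, dict], start: str) -> Graph:
    """Walk upstream from `start` and collect nodes and edges. No rendering here."""
    graph = Graph()
    seen = set()
    stack = [start]
    while stack:
        node_id = stack.pop()
        if node_id in seen:
            continue
        seen.add(node_id)
        record = records.get(node_id)
        if record is None:
            # Assumption: an id with no record is an external input file.
            graph.nodes.append(Node(node_id, node_id, "file"))
            continue
        graph.nodes.append(Node(node_id, record.get("name", node_id), record["kind"]))
        for source_id in record.get("inputs", []):
            graph.edges.append((source_id, node_id))
            stack.append(source_id)
    return graph


def render_mermaid(graph: Graph) -> str:
    """Turn a Graph into Mermaid flowchart text. No traversal here."""
    lines = ["flowchart TB"]
    for node in graph.nodes:
        if node.kind == "task":
            # Stadium shape for task nodes
            lines.append(f'    {node.id}(["{node.label}"])')
        else:
            # Rectangle for file nodes
            lines.append(f'    {node.id}["{node.label}"]')
    for source_id, target_id in graph.edges:
        lines.append(f"    {source_id} --> {target_id}")
    return "\n".join(lines)
```

The rectangle and stadium shapes are the same two Mermaid shapes that appear in the example graph in this PR. Because the renderer only sees a `Graph`, another output format could be added without touching the traversal.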

I also removed the rendering of WorkflowRun nodes, because workflow runs are more like boundaries around nodes than nodes themselves. We can try to add them later as subgraphs, but for now I don't think it's important. We do need to make sure that the lineage can still be traced across multiple workflow runs.

Example lineage graph for the multiqc report produced by rnaseq-nf:

%%{
  init: {
    'theme': 'base',
    'themeVariables': {
      'primaryColor': '#B6ECE2',
      'primaryTextColor': '#160F26',
      'primaryBorderColor': '#065647',
      'lineColor': '#545555',
      'clusterBkg': '#BABCBD22',
      'clusterBorder': '#DDDEDE',
      'fontFamily': 'arial'
    }
  }
}%%
flowchart TB
    lid://9dd03aa13c2f2ecea7879aa1e50dba6e/summary/multiqc_report.html["lid://9dd03aa13c2f2ecea7879aa1e50dba6e/summary/multiqc_report.html"]
    lid://ef897bdf3c292346f76f82b24f7f0385/multiqc_report.html["lid://ef897bdf3c292346f76f82b24f7f0385/multiqc_report.html"]
    lid://ef897bdf3c292346f76f82b24f7f0385(["MULTIQC"])
    https://github.com/nextflow-io/rnaseq-nf/tree/73329b11f2d848c37e2c1ccdd923b327dae99d5f/multiqc["https://github.com/nextflow-io/rnaseq-nf/tree/73329b11f2d848c37e2c1ccdd923b327dae99d5f/multiqc"]
    lid://82276016d8c7fbe8855dad0074bec31d/quant["lid://82276016d8c7fbe8855dad0074bec31d/quant"]
    lid://f2f2054d26f5a8457b49c47fa22f023f/fastqc["lid://f2f2054d26f5a8457b49c47fa22f023f/fastqc"]
    lid://82276016d8c7fbe8855dad0074bec31d(["RNASEQ:QUANT (gut)"])
    work/stage-21a4f7d6-11a8-4d56-b249-cc7a811eda6b/00/84ac54f3fe6eaad6bb24961e4f8d63/ggal_gut_1.fq["work/stage-21a4f7d6-11a8-4d56-b249-cc7a811eda6b/00/84ac54f3fe6eaad6bb24961e4f8d63/ggal_gut_1.fq"]
    work/stage-21a4f7d6-11a8-4d56-b249-cc7a811eda6b/43/c22c31954ad7b8f689adf7695c3e3a/ggal_gut_2.fq["work/stage-21a4f7d6-11a8-4d56-b249-cc7a811eda6b/43/c22c31954ad7b8f689adf7695c3e3a/ggal_gut_2.fq"]
    lid://f2f2054d26f5a8457b49c47fa22f023f(["RNASEQ:FASTQC (gut)"])
    work/stage-21a4f7d6-11a8-4d56-b249-cc7a811eda6b/00/84ac54f3fe6eaad6bb24961e4f8d63/ggal_gut_1.fq["work/stage-21a4f7d6-11a8-4d56-b249-cc7a811eda6b/00/84ac54f3fe6eaad6bb24961e4f8d63/ggal_gut_1.fq"]
    work/stage-21a4f7d6-11a8-4d56-b249-cc7a811eda6b/43/c22c31954ad7b8f689adf7695c3e3a/ggal_gut_2.fq["work/stage-21a4f7d6-11a8-4d56-b249-cc7a811eda6b/43/c22c31954ad7b8f689adf7695c3e3a/ggal_gut_2.fq"]
    lid://e93bfb52ddc455823173302ed4dff8d5/index["lid://e93bfb52ddc455823173302ed4dff8d5/index"]
    lid://e93bfb52ddc455823173302ed4dff8d5(["RNASEQ:INDEX (ggal_1_48850000_49020000)"])
    work/stage-21a4f7d6-11a8-4d56-b249-cc7a811eda6b/e7/181415b6783c942ef2d806c8023276/ggal_1_48850000_49020000.Ggal71.500bpflank.fa["work/stage-21a4f7d6-11a8-4d56-b249-cc7a811eda6b/e7/181415b6783c942ef2d806c8023276/ggal_1_48850000_49020000.Ggal71.500bpflank.fa"]
    lid://ef897bdf3c292346f76f82b24f7f0385/multiqc_report.html --> lid://9dd03aa13c2f2ecea7879aa1e50dba6e/summary/multiqc_report.html
    lid://ef897bdf3c292346f76f82b24f7f0385 --> lid://ef897bdf3c292346f76f82b24f7f0385/multiqc_report.html
    lid://82276016d8c7fbe8855dad0074bec31d/quant --> lid://ef897bdf3c292346f76f82b24f7f0385
    lid://f2f2054d26f5a8457b49c47fa22f023f/fastqc --> lid://ef897bdf3c292346f76f82b24f7f0385
    https://github.com/nextflow-io/rnaseq-nf/tree/73329b11f2d848c37e2c1ccdd923b327dae99d5f/multiqc --> lid://ef897bdf3c292346f76f82b24f7f0385
    lid://82276016d8c7fbe8855dad0074bec31d --> lid://82276016d8c7fbe8855dad0074bec31d/quant
    lid://f2f2054d26f5a8457b49c47fa22f023f --> lid://f2f2054d26f5a8457b49c47fa22f023f/fastqc
    lid://e93bfb52ddc455823173302ed4dff8d5/index --> lid://82276016d8c7fbe8855dad0074bec31d
    work/stage-21a4f7d6-11a8-4d56-b249-cc7a811eda6b/00/84ac54f3fe6eaad6bb24961e4f8d63/ggal_gut_1.fq --> lid://82276016d8c7fbe8855dad0074bec31d
    work/stage-21a4f7d6-11a8-4d56-b249-cc7a811eda6b/43/c22c31954ad7b8f689adf7695c3e3a/ggal_gut_2.fq --> lid://82276016d8c7fbe8855dad0074bec31d
    work/stage-21a4f7d6-11a8-4d56-b249-cc7a811eda6b/00/84ac54f3fe6eaad6bb24961e4f8d63/ggal_gut_1.fq --> lid://f2f2054d26f5a8457b49c47fa22f023f
    work/stage-21a4f7d6-11a8-4d56-b249-cc7a811eda6b/43/c22c31954ad7b8f689adf7695c3e3a/ggal_gut_2.fq --> lid://f2f2054d26f5a8457b49c47fa22f023f
    lid://e93bfb52ddc455823173302ed4dff8d5 --> lid://e93bfb52ddc455823173302ed4dff8d5/index
    work/stage-21a4f7d6-11a8-4d56-b249-cc7a811eda6b/e7/181415b6783c942ef2d806c8023276/ggal_1_48850000_49020000.Ggal71.500bpflank.fa --> lid://e93bfb52ddc455823173302ed4dff8d5

Signed-off-by: Ben Sherman <[email protected]>
@bentsherman bentsherman requested review from pditommaso and jorgee May 8, 2025 02:58

netlify bot commented May 8, 2025

Deploy Preview for nextflow-docs-staging ready!

Name Link
🔨 Latest commit 79fc4f2
🔍 Latest deploy log https://app.netlify.com/sites/nextflow-docs-staging/deploys/681cc933d0a2d800082fed16
😎 Deploy Preview https://deploy-preview-6052--nextflow-docs-staging.netlify.app

@bentsherman
Member Author

#5911 is needed to replace the work stage paths with the source URLs. I will try to get this resolved tomorrow.

@jorgee
Contributor

jorgee commented May 8, 2025

> I also removed the rendering of WorkflowRun nodes, because workflow runs are more like boundaries around nodes rather than nodes. We can try to add them later as subgraphs but for now I don't think it's important.

Not sure if we should remove this case. It was used to print the lineage when an output does not come from a task output (such as the index file). In this case, the source of the output is the WorkflowRun. The render command was printing a DAG with the data, the WorkflowRun, and its params. With this PR it returns an error:

$ nextflow li render lid://ca2192600870bee8b13582afeae3808f/samples.json
ERROR: rendering lineage graph - Cannot render lineage for type WorkflowRun -- must be a FileOutput or TaskRun

before:
image
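
As an illustration of the case described above, here is a minimal Python sketch of a per-type dispatch that also accepts a workflow run as a lineage source. It is not the code from this PR. The `upstream_ids` function, the record field names (`type`, `source`, `inputs`, `params`), and the exact error wording are assumptions, loosely modeled on the error message quoted in this comment.

```python
from typing import List


def upstream_ids(record: dict) -> List[str]:
    """Return the ids or values that a lineage record depends on."""
    record_type = record["type"]
    if record_type == "FileOutput":
        # A file output points at whatever produced it: a task run or a workflow run.
        return [record["source"]]
    if record_type == "TaskRun":
        return list(record["inputs"])
    if record_type == "WorkflowRun":
        # Assumption: the params of a workflow run are its only upstream values.
        return [str(param["value"]) for param in record["params"]]
    raise ValueError(
        f"Cannot render lineage for type {record_type} "
        "-- must be a FileOutput, TaskRun or WorkflowRun"
    )
```

With a branch like the `WorkflowRun` one, an output published directly by the workflow can still be traced back to the run and its params instead of raising an error.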

Signed-off-by: Ben Sherman <[email protected]>
@bentsherman
Member Author

Good point, I have updated accordingly.

I don't really like having non-file parameter values in the lineage graph, as they are meaningless without their name (e.g. a workflow run having "2.0" as an input). But right now there is no way to distinguish between string and file inputs on the workflow run, so I'll leave it for now and we can refine it when we have better typed inputs.

@bentsherman bentsherman added this to the 25.04 milestone May 8, 2025
@pditommaso pditommaso merged commit 923b842 into master May 8, 2025
21 of 22 checks passed
@pditommaso pditommaso deleted the lineage-improve-render branch May 8, 2025 16:07
3 participants