No objections to removing the latest data download. We decided to include the functionality during the pandemic because we thought it was too much of a burden for instance admins to keep up with the then-frequent upstream updates, and we did not want users to get outdated lineage classification results because of that. Interestingly, this might be one of those rare cases where not bumping the wrapper version might be good.
Oh, and one more thing. Please update the pangolin-data dependency to 1.36, which was the latest when this version of pangolin got released. Something to consider for the auto-updates in the future.
tools/pangolin/pangolin.xml
Outdated
Ah, this one here and the requirement got unlinked. So the current tool ships with 1.36, but reports 1.26 in the interface.
Same for @CONSTELLATIONS_VERSION@: the autoupdate bot just seems to overwrite ("bump") these tokens in the requirements with literal version numbers.
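A check along these lines could catch the unlinking before it ships. This is only a minimal sketch: the function name, the token pattern and the example XML are made up for illustration, and it is not part of planemo or the autoupdate bot.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: version tokens look like @SOME_NAME@, as in the pangolin wrapper.
TOKEN_RE = re.compile(r"^@[A-Z0-9_]+@$")


def untokenized_requirements(tool_xml, must_be_tokenized):
    """Return (package, version) pairs whose version is a literal, not a token."""
    root = ET.fromstring(tool_xml)
    problems = []
    for req in root.iter("requirement"):
        name = (req.text or "").strip()
        version = req.get("version", "")
        if name in must_be_tokenized and not TOKEN_RE.match(version):
            problems.append((name, version))
    return problems


# Hypothetical, stripped-down tool XML in which the bot has replaced one token.
EXAMPLE = """<tool id="pangolin_example">
  <requirements>
    <requirement type="package" version="1.36">pangolin-data</requirement>
    <requirement type="package" version="@CONSTELLATIONS_VERSION@">constellations</requirement>
  </requirements>
</tool>"""

print(untokenized_requirements(EXAMPLE, {"pangolin-data", "constellations"}))
```

For the example above this reports pangolin-data with the literal version 1.36 and leaves the still-tokenized constellations requirement alone.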
I think @wm75 is basically correct - the pace of database updates has slowed. That said, I'm trying to find this. Other groups have taken the approach of making a Docker container with pangolin and its latest data, and updating it from time to time. Since the most recent pangolin-data releases were on 14 January 2026, 10 November 2025, 11 September 2025 and 16 June 2025, it might now be reasonable to envision a "pangolin+pangolin-data" build recipe upstream. I'm not sure how to do that on the bioconda side, but the end result would be a tool that gets updated every few months and doesn't need data table maintenance or the download option. (P.S. if the download option is removed, it would require an update to the sars-cov-2-pe-illumina-artic-ivar-analysis workflow in IWC)
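For reference, the "every few months" cadence can be checked from the four release dates quoted above. A quick sketch (the function and variable names are mine):

```python
from datetime import date

# The four pangolin-data release dates mentioned in the comment above.
releases = [
    date(2025, 6, 16),
    date(2025, 9, 11),
    date(2025, 11, 10),
    date(2026, 1, 14),
]


def release_gaps(dates):
    """Days between consecutive releases, oldest first."""
    ordered = sorted(dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


gaps = release_gaps(releases)
print(gaps)                             # [87, 60, 65]
print(round(sum(gaps) / len(gaps), 1))  # 70.7
```

So the recent gaps are roughly two to three months each, about ten weeks on average.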
Distributing reference data via pip / conda is possible, but maybe not the best approach: it's only data and no dependencies. In the end the package only contains https://github.com/cov-lineages/pangolin-data, if I'm not wrong?
I would also prefer to keep the system modular. We've done all the hard work to make it modular, and tying data together with code when we don't have to seems wrong.
This I can do for you.
wm75
left a comment
Thanks a lot @bernt-matthias for addressing this - just minor improvements to comment lines.
| <requirement type="package" version="@PANGOLIN_DATA_VERSION@">pangolin-data</requirement> | ||
| <requirement type="package" version="@CONSTELLATIONS_VERSION@">constellations</requirement> |
| <requirement type="package" version="@PANGOLIN_DATA_VERSION@">pangolin-data</requirement> | |
| <requirement type="package" version="@CONSTELLATIONS_VERSION@">constellations</requirement> | |
| <!-- Important: keep the following two versions tokenized since the tokens are reused in the inputs section! --> | |
| <requirement type="package" version="@PANGOLIN_DATA_VERSION@">pangolin-data</requirement> | |
| <requirement type="package" version="@CONSTELLATIONS_VERSION@">constellations</requirement> |
| #end if | ||
| ## Handle data components to be taken from data tables | ||
| ## The folder structure pointed to by the data tables can be used | ||
| ## as is except that cannot symlink the folders themselves since |
| ## as is except that cannot symlink the folders themselves since | |
| ## as is except that we cannot symlink the folders themselves since |
@wm75 @pvanheus One of the pangolin tests is failing in the weekly tests. Problem is that the download of up-to-date data fails (in conda) with:
In order to fix this we would need to patch pangolin to add --no-build-isolation to the pip call.
IMO it might be better to remove the download functionality completely.
What do you think? Would this be acceptable?
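To make the first option concrete, here is a rough sketch of what the patched call would amount to. The helper and the package spec are hypothetical placeholders, not pangolin's actual update code; only the --no-build-isolation flag itself is a real pip option.

```python
import sys


def pip_install_command(package_spec, no_build_isolation=True):
    """Build the argv for a pip upgrade; hypothetical helper for illustration."""
    cmd = [sys.executable, "-m", "pip", "install", "--upgrade"]
    if no_build_isolation:
        # Build in the current (conda) environment instead of letting pip
        # create an isolated build environment.
        cmd.append("--no-build-isolation")
    cmd.append(package_spec)
    return cmd


# Placeholder spec: the exact URL or ref pangolin installs from is an assumption.
spec = "git+https://github.com/cov-lineages/pangolin-data.git"
print(" ".join(pip_install_command(spec)))
```

Carrying a patch like this in the wrapper or in the conda recipe is extra maintenance, which is part of why removing the download option looks simpler to me.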
FOR CONTRIBUTOR:
There are two labels that allow ignoring specific (false positive) tool linter errors:
skip-version-check: Use it if only a subset of the tools has been updated in a suite.
skip-url-check: Use it if GitHub CI sees 403 errors, but the URLs work.