
Data availability per tag #155

Closed
wants to merge 31 commits into from

Conversation

GiovanniVolta
Copy link

No description provided.

@yuema137 self-requested a review on February 26, 2025 22:36
Member

@cfuselli left a comment


Hey @GiovanniVolta !
Thanks for this PR. I think a tool like this could be useful, but I still have some doubts about parts of the implementation.

  • Isn't this tool very similar to xefind? What is the use case for this versus xefind? I know that xefind checks data availability in a different way and relies on the runDB (which needs to be updated by the people who do reprocessing; this is usually automatic, so it works). I would rather stress that the DB must be kept up to date than propose another tool. Also, the good thing about xefind relying on the DB is that nothing can go wrong with environment settings, locally installed packages, etc.: you simply know the data should be there, and if you don't see it, you start asking yourself what you are doing wrong. With this tool I am afraid that if people do something wrong (which is very easy: just imagine you have straxen installed locally and this tool fails), they will straight away believe the data is not even supposed to be there.
  • If we still want to propose this tool, I think some changes would be necessary. Mainly, the Python script that does the is_stored check should be run in a submitted job. Why? Because submitting a job uses all the tools we already have set up, which take care of providing the right context, the right software via Singularity, the right cutax, and the right storage paths. I am referring especially to the treatment of cutax and the paths; none of that is needed if the check is submitted as a Singularity job. Having it here as well can be dangerous: it is one more thing to keep up to date, and it will easily give the message that data is not there when it should be (a rough sketch of the kind of check meant here follows below).
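
A minimal sketch of the kind of is_stored check discussed above, assuming the straxen xenonnt_online context and strax's Context.is_stored; the run IDs, target list, and context choice are placeholders, and the PR's actual script may differ:

```python
# Minimal data-availability check, assuming a straxen context is available.
# In practice this would run inside a submitted job so that the container,
# context, cutax, and storage paths are handled by the existing tooling.
import straxen

st = straxen.contexts.xenonnt_online()   # context choice is an assumption

run_ids = ["047493", "047494"]            # placeholder run list
targets = ["peak_basics", "event_info"]   # placeholder data types

for run_id in run_ids:
    missing = [t for t in targets if not st.is_stored(run_id, t)]
    if missing:
        print(f"{run_id}: missing {missing}")
    else:
        print(f"{run_id}: all targets stored")
```

Wrapping a script like this in a batch job, rather than running it in a user's local environment, is exactly what the point above argues for.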

Let me know what you think about these points! It might be that I am wrong, and if you still think you want to push this, that's okay with me.

@GiovanniVolta
Author

Thank you Carlo, your points are very valid, and indeed this script is better kept somewhere other than utilix. Just for the record, I made this script to keep the data availability page up to date; I could not do it with xefind since the tags there are essentially hard-coded.
You are right that this tool is not very solid and can create confusion for someone who does not know how to manage local installations well (and I count myself on that list). I will move the file to a more suitable place. Thank you for taking the time to review it.
