Missing file from cellpainting-gallery causes errors within tests #500

Open
d33bs opened this issue Jan 28, 2025 · 6 comments
Labels
bug Something isn't working

Comments

d33bs commented Jan 28, 2025

I noticed that there are tests which rely on a file from the cellpainting-gallery which now appears to be missing: s3://cellpainting-gallery/cpg0016-jump/source_4/workspace/load_data_csv/2021_08_23_Batch12/BR00126114/load_data_with_illum.parquet. When looking in the related object path I don't see that there's a Parquet file anymore at that location. This change must have been recent (within the last week or so) as tests were passing before now (to my knowledge).

d33bs added the bug label on Jan 28, 2025
d33bs commented Jan 29, 2025

Hi @shntnu - we noticed there might have been a change to the JUMP data within the Cell Painting Gallery on S3. The change pertains to a missing Parquet file which is a dependency for a Pycytominer test. Would you have any guidance on how we should proceed?

shntnu commented Jan 29, 2025

Sorry for the annoyance. We recently restructured load_data_csv to include brightfield images and moved the old files (without brightfield) to load_data_csv_orig, so you'd need to update the URL to s3://cellpainting-gallery/cpg0016-jump/source_4/workspace/load_data_csv_orig/2021_08_23_Batch12/BR00126114/load_data_with_illum.parquet

I wish we could guarantee a permalink, but that's tricky. Instead, we try to avoid breaking changes, but this one was unavoidable :D

Note that CSV files are available at the original location (s3://cellpainting-gallery/cpg0016-jump/source_4/workspace/load_data_csv/2021_08_23_Batch12/BR00126114/load_data_with_illum.csv), just not Parquet files.

cc @ErinWeisbart
Internal issue: https://github.com/jump-cellpainting/datasets-private/issues/31#issuecomment-2620232263

d33bs commented Jan 30, 2025

Thanks @shntnu and no worries at all! I plan to circle back to the Cell Locations functionality in Pycytominer and enable CSV compatibility with the CSV file you referenced (please correct me if I'm wrong on the assumptions here).

shntnu commented Jan 30, 2025

> Thanks @shntnu and no worries at all! I'll plan to circle to the Cell Locations functionality in Pycytominer and enable CSV compatibility with the CSV file you referenced (please correct me if I'm wrong on the assumptions here).

Note that the CSV file does not have cell location information. The simplest fix is to replace load_data_csv with load_data_csv_orig in your paths, and everything should work.
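For anyone applying the same path change in several places, here is a minimal sketch of the substitution in Python. The function name `use_orig_load_data_dir` and the constants are hypothetical, written for this illustration only, and are not part of Pycytominer or the Cell Painting Gallery tooling. It only rewrites the string and does not check that the object exists on S3.

```python
# Hypothetical helper (not part of Pycytominer): swap the restructured
# `load_data_csv` directory for the preserved `load_data_csv_orig` one.
OLD_DIR = "load_data_csv"
NEW_DIR = "load_data_csv_orig"


def use_orig_load_data_dir(s3_url: str) -> str:
    """Return s3_url with the `load_data_csv` path segment replaced.

    Matching whole path segments (rather than using str.replace) keeps
    the function idempotent: a URL that already points at
    `load_data_csv_orig` is returned unchanged.
    """
    segments = s3_url.split("/")
    return "/".join(NEW_DIR if seg == OLD_DIR else seg for seg in segments)


old_url = (
    "s3://cellpainting-gallery/cpg0016-jump/source_4/workspace/"
    "load_data_csv/2021_08_23_Batch12/BR00126114/load_data_with_illum.parquet"
)
new_url = use_orig_load_data_dir(old_url)
print(new_url)
```

Applied to the Parquet path from the original report, this yields the load_data_csv_orig URL given earlier in the thread. A plain `str.replace` would also work for a one-off fix, but it would turn an already-updated path into `load_data_csv_orig_orig` if run twice.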

d33bs commented Jan 31, 2025

Thanks so much @shntnu I'll give that a try!

d33bs added a commit to d33bs/pycytominer that referenced this issue Jan 31, 2025