Enable dataset cache requests based on source bag #111

Merged
RayPlante merged 7 commits into integration from feature/cache-by-bag on Sep 10, 2024

Conversation

RayPlante
Collaborator

The cache management API allows an admin to request that either specific data files or whole datasets be cached. This proved limiting for very large datasets: if any large part of the caching failed, the only practical remedy was to try caching the whole dataset again. This PR addresses the limitation by allowing one to request caching of all files from a particular bag. A bag-level request is more granular than a whole-dataset request, yet more efficient than requesting individual files.

This change was implemented in support of mds2-2775, a dataset of 10^4 files.
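
To illustrate the granularity trade-off described above, here is a minimal sketch that compares the three retry strategies after a partial caching failure. It does not use the actual cache management API: the function `plan_retry`, the file paths, and the bag names are all hypothetical and exist only for this illustration.

```python
def plan_retry(file_to_bag, failed_files):
    """Compare retry strategies after a partially failed caching run.

    file_to_bag  -- dict mapping each file path to the name of the bag holding it
    failed_files -- iterable of file paths whose caching failed
    Returns a dict: strategy -> (number of requests, number of files re-cached).
    NOTE: hypothetical helper for illustration; not part of the real API.
    """
    failed = set(failed_files)
    # Bags that contain at least one failed file
    bags = {file_to_bag[f] for f in failed}
    # A bag-level request re-caches every file in each affected bag
    files_in_bags = [f for f, b in file_to_bag.items() if b in bags]
    return {
        "dataset": (1, len(file_to_bag)),        # one request, everything again
        "bag": (len(bags), len(files_in_bags)),  # one request per affected bag
        "file": (len(failed), len(failed)),      # one request per failed file
    }


# Hypothetical dataset: 3 bags holding 4 files each
file_to_bag = {f"data/f{i:02d}.dat": f"bag-{i // 4 + 1}" for i in range(12)}
# Assume every file in bag-2 failed to cache
failed = [f for f, b in file_to_bag.items() if b == "bag-2"]

plan = plan_retry(file_to_bag, failed)
for strategy, (requests, files) in plan.items():
    print(f"{strategy}: {requests} request(s), {files} file(s) re-cached")
```

In this toy case a single bag-level request re-caches only the 4 affected files, whereas the dataset-level retry re-caches all 12 and the file-level retry needs 4 separate requests. At the scale of 10^4 files the difference in request count would be correspondingly larger.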

@RayPlante
Collaborator Author

Tested in integrated system.

@RayPlante RayPlante merged commit f4756cd into integration Sep 10, 2024
2 checks passed
@RayPlante RayPlante deleted the feature/cache-by-bag branch September 10, 2024 19:57