Scaling validation to large datasets #436

Open
jn2clark opened this issue Feb 19, 2023 · 2 comments

Comments

@jn2clark

Hi, I have a quick question about validation. I am running into memory issues with my validation set. After checking the code, this seems to be expected (https://github.com/mlfoundations/open_clip/blob/main/src/training/train.py#L251). Has any work been done on this before I try to implement something? What would the ideal approach be? I was thinking of batching the validation data and reporting the metrics from each batch.
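A minimal sketch of the batching idea, assuming the memory cost comes from building the full N x N image-text similarity matrix over the whole validation set. It uses NumPy for brevity rather than the PyTorch tensors in the training code, and the function names `chunked_retrieval_ranks` and `retrieval_metrics` are hypothetical, not part of open_clip. Each chunk of queries is scored against all candidates, so only a (chunk, N) block is held in memory at a time while the ranks are still taken over the whole set.

```python
import numpy as np


def chunked_retrieval_ranks(query_feats, gallery_feats, chunk_size=1024):
    """Rank of the matching gallery item for every query, one chunk at a time.

    Assumes query i is paired with gallery item i (e.g. image i and text i).
    Rank 0 means the true match has the highest score. Ties are resolved
    optimistically, which may differ from an argsort-based ranking.
    """
    n = query_feats.shape[0]
    ranks = np.empty(n, dtype=np.int64)
    for start in range(0, n, chunk_size):
        end = min(start + chunk_size, n)
        # (chunk, N) block of the similarity matrix; the (N, N) matrix is never built.
        sims = query_feats[start:end] @ gallery_feats.T
        # Score of the ground-truth pair for each query in this chunk.
        true_scores = sims[np.arange(end - start), np.arange(start, end)]
        # Rank = number of gallery items scored strictly higher than the true match.
        ranks[start:end] = (sims > true_scores[:, None]).sum(axis=1)
    return ranks


def retrieval_metrics(ranks, ks=(1, 5, 10)):
    """Mean rank (1-based) and recall@k computed from 0-based ranks."""
    metrics = {"mean_rank": float(ranks.mean()) + 1.0}
    for k in ks:
        metrics[f"R@{k}"] = float((ranks < k).mean())
    return metrics
```

Calling `retrieval_metrics(chunked_retrieval_ranks(image_feats, text_feats))` would give image-to-text numbers, and swapping the two arguments gives text-to-image. Metrics computed independently inside each batch are a different quantity: every query then competes against fewer candidates, so recall tends to look better and depends on the batch size.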

@mehdidc
Contributor

mehdidc commented Feb 19, 2023

@jn2clark there is a PR on distributed validation (#176) that could help, but it is not finished yet.

@jn2clark
Author

Thanks @mehdidc! I will take a look.
