Restrict saving of validation .csv's to wandb #280

Open
AUdaltsova opened this issue Nov 28, 2024 · 9 comments

Comments

@AUdaltsova
Contributor

We currently create validation .csvs and save them to wandb, and we are quickly running out of storage space there: the files are saved every epoch for every run, and they grow large as the horizon and the number of quantiles increase. We should make saving optional and cut down on the size and number of files saved:

  • only save best epoch
  • only save the same 4(?) batches we save val plots for
  • any other way we can trim this?
@AUdaltsova
Contributor Author

@felix-se-cat, tagging you here as this is now the main issue. I think we are set on making it optional and defaulting to not saving; other things are still a bit up in the air, and I'll try to get them clarified soon. We would really appreciate help on this! Word of warning though: this might get trickier than it seems, so let me know if you still want to be assigned.

@felix-se-cat

Happy to stay assigned to it! It will be great as a driver to refresh a couple of things, if nothing else :)
If you could let me know once you've got a clearer idea of how to go forward, that'd be great!

@AUdaltsova
Contributor Author

@felix-se-cat ok great, will assign you now and get back to you once we're sure what we think is best here! Thanks so much for jumping in

@AUdaltsova
Contributor Author

Hi @felix-se-cat! We've had a discussion and we think it should default to not save and when turned on only save the best epoch.
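
A minimal sketch of that behaviour, assuming nothing about PVNet's actual code: the class name `BestEpochCsvSaver`, the `enabled` flag, the `val_loss` argument, and the output file name are all hypothetical, and the step that would upload the file to wandb is left out. Saving is off by default; when it is on, an epoch's CSV replaces the previous one only if the validation loss improved, so at most one file exists at the end of a run.

```python
import csv
from pathlib import Path


class BestEpochCsvSaver:
    """Hypothetical helper: keeps the validation CSV of the best epoch only."""

    def __init__(self, out_dir, enabled=False):
        self.enabled = enabled  # default: save nothing
        self.path = Path(out_dir) / "validation_results.csv"  # assumed file name
        self.best_loss = float("inf")
        self.best_epoch = None

    def on_validation_end(self, epoch, val_loss, rows):
        """Write `rows` (a list of dicts) if this epoch is the best so far.

        Returns True if the file was (re)written, False otherwise.
        """
        if not self.enabled or not rows or val_loss >= self.best_loss:
            return False
        self.best_loss = val_loss
        self.best_epoch = epoch
        self.path.parent.mkdir(parents=True, exist_ok=True)
        # Opening in "w" mode overwrites the previous best epoch's file.
        with self.path.open("w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
        return True
```

Overwriting a single file, rather than writing one file per epoch and deleting the rest afterwards, keeps local disk use bounded as well; whether that fits how the files are currently produced is something the real code would have to confirm.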

@felix-se-cat

Cool! I'll have a read through the code and will let you know if there are any questions.

@AUdaltsova
Contributor Author

@felix-se-cat hi again. Sorry, I've realized it's going to be really, really hard for you to check that your solution works, because we don't have any tests around this. That means we'd need to put your solution in, run a model, and check that nothing breaks, and I don't think that's a feasible way to do this. Unfortunately, this is just a really community-unfriendly issue. I'll un-assign you now. Thanks so much for volunteering, though! Hopefully there's a better issue that piques your interest.
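
One way the save decision could be checked without training a model is to test it against a stand-in logger. The sketch below is an illustration under stated assumptions, not PVNet code: `FakeLogger`, `log_validation_csv`, and the `save_csv` flag are hypothetical names, and the fake only records what would have been uploaded.

```python
class FakeLogger:
    """Stand-in for an experiment logger; records uploads instead of sending them."""

    def __init__(self):
        self.uploads = []

    def save(self, name, payload):
        self.uploads.append((name, payload))


def log_validation_csv(logger, epoch_results, save_csv=False):
    """Upload only the best epoch's CSV text, and only if `save_csv` is set.

    `epoch_results` is a list of (val_loss, csv_text) tuples, one per epoch.
    Returns the index of the uploaded epoch, or None if nothing was uploaded.
    """
    if not save_csv or not epoch_results:
        return None
    # Pick the epoch with the lowest validation loss.
    best = min(range(len(epoch_results)), key=lambda i: epoch_results[i][0])
    logger.save(f"validation_epoch_{best}.csv", epoch_results[best][1])
    return best


def test_default_saves_nothing():
    logger = FakeLogger()
    assert log_validation_csv(logger, [(0.9, "a"), (0.4, "b")]) is None
    assert logger.uploads == []


def test_only_best_epoch_is_saved():
    logger = FakeLogger()
    results = [(0.9, "a"), (0.4, "b"), (0.6, "c")]
    assert log_validation_csv(logger, results, save_csv=True) == 1
    assert logger.uploads == [("validation_epoch_1.csv", "b")]
```

A test like this covers only the decision logic; it does not show that the real training loop calls it correctly, which is the part that still needs a model run.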

@felix-se-cat

Would there be a way for me to run it locally, maybe?
If not, no problem either way!

@AUdaltsova
Contributor Author

@felix-se-cat you can try running PVNet on your own if you really want to. The open data pvnet project we have could be helpful, since the models we produce use data we can't share and so cannot be run out of the box; but it is also very much a work in progress, and I won't be able to tell you how ready it is. It will have to be able to run at least some batches and be hooked up to wandb. The setup is all there, but you'll need to be registered and have some amount of storage, because it's going to save some medium to large csvs each epoch (the same thing this issue is fixing). In short, if you really, really want to do it, it's probably possible, but I wouldn't recommend it!

@felix-se-cat

Ah cool! I'll have a play around with it if I can make the time!
