Use more sophisticated checkpoint naming scheme #3

@pgagarinov

Description

The default checkpoint file naming scheme uses only the epoch number and step number as keys to distinguish checkpoint file names across epochs and steps. This scheme is not sufficient when many checkpoints are created within the same notebook for different models (or for the same model with different hyperparameters). We should use an adaptive naming scheme that accounts for:

  • (optionally) Jupyter notebook name
  • Model class name or/and experiment id
  • run id (needed when the same model is trained multiple times)
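
A minimal sketch of what such a naming scheme could look like, using only the Python standard library. The function name `checkpoint_name`, its parameters, the `-` separator, and the `.ckpt` extension are all assumptions made for illustration; none of them come from the existing code base.

```python
import re
from typing import Optional


def _slug(text: str) -> str:
    """Replace characters that are awkward in file names with underscores."""
    return re.sub(r"[^A-Za-z0-9_.-]+", "_", text).strip("_")


def checkpoint_name(
    model_name: str,
    run_id: str,
    epoch: int,
    step: int,
    notebook_name: Optional[str] = None,   # optional, as proposed in the issue
    experiment_id: Optional[str] = None,   # hypothetical extra key
    ext: str = ".ckpt",                    # assumed extension
) -> str:
    """Build a checkpoint file name from the keys listed in the issue."""
    parts = []
    if notebook_name:
        parts.append(_slug(notebook_name))
    parts.append(_slug(model_name))
    if experiment_id:
        parts.append(_slug(experiment_id))
    parts.append(f"run={_slug(run_id)}")
    # Zero-padding keeps lexicographic order equal to training order.
    parts.append(f"epoch={epoch:03d}")
    parts.append(f"step={step:06d}")
    return "-".join(parts) + ext


if __name__ == "__main__":
    print(checkpoint_name("ResNet", "r1", epoch=3, step=120))
```

The model class name could be passed as `type(model).__name__`, and the run id could be a timestamp or a counter; which of these fits best depends on how runs are tracked in the project.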

We should also incorporate a warning mechanism that alerts the user when the checkpoint directory grows too large because it contains too many outdated checkpoints.
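
One possible shape for such a warning, sketched with the standard library only. The function name `warn_if_checkpoint_dir_large`, the thresholds, and the `*.ckpt` pattern are hypothetical defaults chosen for illustration, not existing settings.

```python
import warnings
from pathlib import Path


def warn_if_checkpoint_dir_large(
    ckpt_dir: str,
    max_files: int = 50,              # assumed threshold on file count
    max_bytes: int = 5 * 1024**3,     # assumed threshold on total size (5 GiB)
    pattern: str = "*.ckpt",          # assumed checkpoint file pattern
) -> bool:
    """Emit a UserWarning if the directory exceeds either threshold.

    Returns True when a warning was issued, False otherwise.
    """
    files = [p for p in Path(ckpt_dir).glob(pattern) if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    if len(files) > max_files or total_bytes > max_bytes:
        warnings.warn(
            f"Checkpoint directory '{ckpt_dir}' holds {len(files)} files "
            f"({total_bytes} bytes); consider deleting outdated checkpoints.",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False
```

A check like this could run each time a new checkpoint is saved, so the user is notified at the moment the directory crosses a threshold rather than after the fact.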
