Replies: 2 comments 10 replies
We have an internal issue tracker, and the oldest issue I opened on it that is still open (from before we even opened the source code of refiners) is: "Find a way to avoid changing the state dict when we add blocks without weights" :) So yes, this is something we have discussed a lot internally. We don't have a perfect solution for this right now, and there are always workarounds:
But if you have ideas, that would be great! Ideally I'd like the solution to also work when we, e.g., insert a chain in a model for clarity or easier targeting, and to keep somewhat semantic keys (i.e. not just use, for instance, ordered keys named 0001, 0002, etc., which works but had other issues).
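To make the "blocks without weights" idea concrete, here is a minimal sketch in plain Python. It is not the refiners or PyTorch API: `Block`, its `transparent` flag, and the key format are all hypothetical stand-ins. It only shows one way a structural wrapper could be skipped when building keys, so that wrapping a layer leaves the state dict unchanged.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Toy stand-in for a module: a name, its own weights, and child blocks."""

    name: str
    weights: dict[str, float] = field(default_factory=dict)
    children: list["Block"] = field(default_factory=list)
    # Hypothetical flag: a transparent block contributes no segment to the keys.
    transparent: bool = False

    def state_dict(self, prefix: str = "") -> dict[str, float]:
        # A transparent block reuses its parent's prefix instead of adding its name.
        path = prefix if self.transparent else f"{prefix}{self.name}."
        out = {f"{path}{key}": value for key, value in self.weights.items()}
        for child in self.children:
            for key, value in child.state_dict(path).items():
                # Skipping segments can make two keys collide, so fail loudly.
                if key in out:
                    raise KeyError(f"duplicate state dict key: {key}")
                out[key] = value
        return out


linear = Block("Linear", {"weight": 1.0})
plain = Block("Model", children=[linear])
wrapped = Block(
    "Model",
    children=[Block("Wrapper", children=[linear, Block("Dropout")], transparent=True)],
)

plain_keys = list(plain.state_dict())
wrapped_keys = list(wrapped.state_dict())
```

In this sketch both models produce the same keys, and the keys stay semantic. The cost is that two transparent wrappers holding children with the same name would collide, which is why the sketch raises on duplicates.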
The big fundamental question is: should the keys of the state_dict be human-readable?

Hello refiners,
I'm experimenting with the trainer, and I'm running into a problem loading and saving model weights.
The sequence of the trainer is:
1. `trainer.prepare_models` loads the checkpoint on a non-injected model
2. `on_train_begin` injects the dropout_adapter
3. `on_checkpoint_save` saves the checkpoint (using `model.state_dict()`)

The names of the Dropout-impacted layers are changed in step 2.
As a result, the checkpoint saved in `on_checkpoint_save` is not compatible with the loading in `trainer.prepare_models`, and I cannot smoothly save/load the model.

Toy example
The injection of the dropout adapter changes the keys of the weights in `state_dict()`, which outputs:
What I'm not clear about is the target behavior:
A. Should `.inject(parent)` change the names of the weights, in which case we should fix the save/load sequence in the trainer?
B. Should `.inject(parent)` not change the names of the weights in `state_dict()` when the adapter is not injecting new weights?

I can help on this if needed.
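For reference, here is a rough sketch of the mismatch and of an option-A style workaround, written in plain Python rather than against the actual refiners API. `Node`, `inject_dropout`, `strip_segment`, and the key names are all made up for illustration. The idea is that injection adds a wrapper segment to the keys, and the save hook removes that segment so the checkpoint matches what the non-injected model expects.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy stand-in for a module: a name, its own weights, and child nodes."""

    name: str
    weights: dict[str, float] = field(default_factory=dict)
    children: list["Node"] = field(default_factory=list)

    def state_dict(self, prefix: str = "") -> dict[str, float]:
        # Keys are dotted paths from the root down to each weight.
        path = f"{prefix}{self.name}."
        out = {f"{path}{key}": value for key, value in self.weights.items()}
        for child in self.children:
            out.update(child.state_dict(path))
        return out


def inject_dropout(parent: Node, target_name: str) -> None:
    """Wrap the named child in a weightless adapter that also holds a Dropout node."""
    for index, child in enumerate(parent.children):
        if child.name == target_name:
            wrapper = Node("DropoutAdapter", children=[child, Node("Dropout")])
            parent.children[index] = wrapper
            return
    raise KeyError(target_name)


def strip_segment(state: dict[str, float], segment: str) -> dict[str, float]:
    """Remove every occurrence of `segment` from the dotted keys before saving."""
    return {
        ".".join(part for part in key.split(".") if part != segment): value
        for key, value in state.items()
    }


model = Node("Chain", children=[Node("Linear", {"weight": 1.0, "bias": 0.0})])
before = model.state_dict()  # keys as seen by the loading step
inject_dropout(model, "Linear")
after = model.state_dict()  # keys as seen by the saving step
saved = strip_segment(after, "DropoutAdapter")
```

Here `before` and `after` have different keys, while `saved` matches `before` again. Option B would instead make the wrapper invisible inside `state_dict()` itself, so no remapping would be needed at save time.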