Thank you for your contributions to pretraining. You trained the encoder for 12.5K steps per domain in the pretraining phase before applying it to supervised downstream tasks. Is it possible that the checkpoints best suited to downstream tasks appear in the middle of pretraining rather than at the end? This phenomenon is common in many real applications. If so, it may not be a good idea to simply train the model to the maximum step on the full corpus and take the final checkpoint. Do you have any suggestions on this?
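For concreteness, the alternative the question points at is downstream-based checkpoint selection: save encoder checkpoints at regular intervals during pretraining, fine-tune each on a small downstream dev set, and keep the one with the best score instead of the last one. Below is a minimal sketch, assuming checkpoints saved as `step_*.pt` files; `load_encoder_checkpoint` and `finetune_and_evaluate` are hypothetical placeholders for whatever loading and fine-tuning routines the codebase actually provides.

```python
from pathlib import Path

# Hypothetical placeholders: in practice these would wrap the project's
# own checkpoint-loading and downstream fine-tuning code.
def load_encoder_checkpoint(path: Path):
    ...

def finetune_and_evaluate(encoder, split: str) -> float:
    ...

def select_best_checkpoint(checkpoint_dir: str):
    """Fine-tune every saved pretraining checkpoint on a small downstream
    dev set and return the path of the one with the highest dev score."""
    best_score, best_path = float("-inf"), None
    for ckpt_path in sorted(Path(checkpoint_dir).glob("step_*.pt")):
        encoder = load_encoder_checkpoint(ckpt_path)
        dev_score = finetune_and_evaluate(encoder, "dev")
        if dev_score > best_score:
            best_score, best_path = dev_score, ckpt_path
    return best_path
```

The trade-off is extra compute: each candidate checkpoint needs its own fine-tuning run, so in practice one usually evaluates a coarse grid of checkpoints rather than every saved step.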