Gradient accumulation and total number of training steps #13457
Answered by rohitgr7
konstantinjdobler asked this question in DDP / multi-GPU / multi-node
When using
|
Answered by rohitgr7 on Jul 1, 2022
Answer selected by konstantinjdobler
if you mean max_steps=n, then 1.
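The options the question listed are cut off in this thread, so what "1" refers to is not visible here. As a rough sketch only, the plain-Python loop below shows how batches and optimizer steps relate under gradient accumulation, assuming that max_steps counts optimizer steps (this matches my understanding of how Lightning documents global_step, but it is an assumption, not something stated in this thread). The function name simulate_training is hypothetical and is not part of any library.

```python
from typing import Tuple


def simulate_training(max_steps: int, accumulate_grad_batches: int) -> Tuple[int, int]:
    """Count batches consumed until `max_steps` optimizer steps have run.

    Assumption (not confirmed by the thread): max_steps is compared against
    the number of optimizer steps, not the number of batches.
    """
    optimizer_steps = 0
    batches_seen = 0
    while optimizer_steps < max_steps:
        batches_seen += 1  # one forward/backward pass on one batch
        # Gradients are accumulated; the optimizer only steps every k-th batch.
        if batches_seen % accumulate_grad_batches == 0:
            optimizer_steps += 1
    return batches_seen, optimizer_steps


batches, steps = simulate_training(max_steps=100, accumulate_grad_batches=4)
print(batches, steps)  # 400 100
```

Under that assumption, max_steps=n with accumulation factor k consumes n * k batches. If max_steps counted batches instead, the same run would perform only n / k optimizer steps, so it is worth checking trainer.global_step in your own run.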