How to properly log in training_step or on_train_batch_end in DDP #20098
Unanswered
huangfu170 asked this question in DDP / multi-GPU / multi-node
I use 4 GPUs on 1 node for an NLP task, and I want to know how to log the training loss at the step level to TensorBoard. I used the following code, but it didn't work and only output the loss at the end of each epoch:

```python
def training_step(self, batch, batch_idx):
    input_ids, attention_mask, label, label_input_ids, label_attention_mask, \
        edge_index, cp_input_ids, cp_attention_mask = self.unzip_batch(batch)
    # remaining forward outputs are unused
    sim, outputs, *_ = self(input_ids, attention_mask, label_input_ids,
                            label_attention_mask, edge_index,
                            cp_input_ids, cp_attention_mask)
    loss_sim = loss_function(sim, label)
    loss_output = loss_function(outputs, label)
    loss = loss_sim + loss_output
    sim = (torch.sigmoid(sim[:, 1:]) >= 0.8)
    outputs = (torch.sigmoid(outputs[:, 1:]) >= 0.8)
```
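For reference, here is a minimal sketch of what step-level logging typically looks like in a `LightningModule` under DDP. It uses Lightning's `self.log` with `on_step=True` and `sync_dist=True`; the module name, the toy `Linear` model, and the `"train_loss"` metric key are illustrative, not taken from the original post:

```python
import torch
import lightning.pytorch as pl


class StepLoggingModule(pl.LightningModule):
    """Toy module illustrating per-step loss logging under DDP."""

    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(8, 2)

    def training_step(self, batch, batch_idx):
        inputs, labels = batch
        logits = self.layer(inputs)
        loss = torch.nn.functional.cross_entropy(logits, labels)

        # on_step=True writes the per-step value to the logger (this is the
        # default inside training_step); on_epoch=True additionally
        # accumulates an epoch-level mean. sync_dist=True reduces the value
        # across the DDP ranks before it is logged, so TensorBoard shows one
        # averaged number per step instead of rank 0's local value.
        self.log("train_loss", loss,
                 on_step=True, on_epoch=True,
                 prog_bar=True, sync_dist=True)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)
```

Note also that the `Trainer` only flushes logged scalars to the logger every `log_every_n_steps` batches (default 50). With a small dataset, an epoch can finish before that threshold is reached, which makes it look as if only epoch-level values are written; passing e.g. `Trainer(log_every_n_steps=1)` makes each step's point visible in TensorBoard.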