
Question regarding the calculation of the actor gradient. #2

Closed
Tomakko opened this issue May 30, 2016 · 5 comments
Tomakko commented May 30, 2016

Hi,

Why did you include the minus sign in the grad_ys argument of the call below?

self.parameters_gradients = tf.gradients(self.action_output,self.parameters,-self.q_gradient_input/BATCH_SIZE)

As far as I understand, grad_ys weights the gradient of each actor output by the corresponding value (coming from the critic in your case).
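To make my reading concrete, here is a minimal toy sketch of how I understand grad_ys (the variables are made up, not from your repo). It makes me suspect the minus sign is there because the optimizer minimizes, so ascending on Q requires the negated gradient, but I would like to confirm.

```python
import tensorflow as tf

# tf.gradients(ys, xs, grad_ys) returns d(sum_i grad_ys[i] * ys[i]) / d(xs),
# i.e. grad_ys weights each output's gradient before they are summed.
x = tf.Variable([1.0, 2.0])
y = x * x                          # two "actor outputs"
w = tf.constant([3.0, -1.0])       # the weighting, like -dQ/da in your code
g = tf.gradients(y, x, grad_ys=w)  # -> [2 * x[0] * 3, 2 * x[1] * (-1)]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(g))  # [array([ 6., -4.], dtype=float32)]
```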

Thanks!

Tomakko commented Jun 3, 2016

Note that you are dividing by the batch size twice when computing the update for the actor weights:

1. `q_gradient_batch = self.critic_network.gradients(state_batch,action_batch_for_gradients)/BATCH_SIZE` in ddpg.py
2. `self.parameters_gradients = tf.gradients(self.action_output,self.parameters,-self.q_gradient_input/BATCH_SIZE)` in actor.py
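If the intention is to average over the batch once, a minimal sketch of the fix (keeping your names, so treat this as a guess at the intended code) would be to drop one of the two divisions, e.g.:

```python
# ddpg.py -- pass the raw critic gradient through, no division here
q_gradient_batch = self.critic_network.gradients(state_batch, action_batch_for_gradients)

# actor.py -- average over the batch exactly once
self.parameters_gradients = tf.gradients(
    self.action_output, self.parameters, -self.q_gradient_input / BATCH_SIZE)
```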

This might be the cause of your bad performance.

GeremWD commented Jun 5, 2016

Hi,
Did you manage to improve performance by correcting this? Even though I do not see any other error in the code, it still does not work.

Tomakko commented Jun 5, 2016

I am still embedding this into a bigger project; I can probably give you feedback on whether it works in a week or so.
Could you elaborate on what is not working? Do the networks converge?
By the way, note that in the original paper the critic has a learning rate of 0.001, while the author is using 0.0001.
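For reference, a sketch of the optimizer settings reported in the paper (the variable names here are mine, not the repo's):

```python
import tensorflow as tf

# Hyperparameters from Lillicrap et al. (2015), experiment details.
ACTOR_LR = 1e-4   # actor learning rate (same as in this repo)
CRITIC_LR = 1e-3  # critic learning rate -- the repo uses 1e-4 instead

actor_optimizer = tf.train.AdamOptimizer(ACTOR_LR)
critic_optimizer = tf.train.AdamOptimizer(CRITIC_LR)
# The paper additionally applies an L2 weight decay of 1e-2 to the critic.
```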

doomie commented Jun 7, 2016

The paper suggests using batch norm: "We also report results with components of our algorithm (i.e. the target network or batch normalization) removed. In order to perform well across all tasks, both of these additions are necessary." See Figure 2 in https://arxiv.org/pdf/1509.02971.pdf
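A minimal sketch of one way to add it in TF (a hypothetical helper, not this repo's code; a full version would also track running moments for use at evaluation time):

```python
import tensorflow as tf

def batch_norm_layer(x, scope):
    # Normalize a [batch, units] activation with its batch statistics,
    # using learned offset (beta) and scale (gamma) parameters.
    with tf.variable_scope(scope):
        mean, variance = tf.nn.moments(x, axes=[0])
        beta = tf.get_variable('beta', x.get_shape()[1:],
                               initializer=tf.constant_initializer(0.0))
        gamma = tf.get_variable('gamma', x.get_shape()[1:],
                                initializer=tf.constant_initializer(1.0))
        return tf.nn.batch_normalization(x, mean, variance, beta, gamma, 1e-5)
```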

floodsung (Owner) commented
I have fixed the bugs and added batch norm to the actor network!
