Question regarding the calculation of the actor gradient. #2
Note that you are dividing twice by the batch size when computing the update for the actor weights. This might be the cause of your poor performance.
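To see why dividing twice matters: `tf.gradients` already sums the per-sample contributions over the batch, so dividing `grad_ys` by `BATCH_SIZE` once yields the mean gradient; a second division (e.g. in the train op) shrinks the update by an extra factor of `BATCH_SIZE`. A minimal numpy sketch with a hypothetical scalar linear actor and made-up numbers:

```python
import numpy as np

BATCH_SIZE = 4

# Hypothetical linear actor: a_i = theta * s_i (scalar states and actions).
theta = 2.0
states = np.array([1.0, 2.0, 3.0, 4.0])

# Made-up critic action gradients dQ/da_i for each sample.
dq_da = np.array([0.5, 0.5, 0.5, 0.5])

# tf.gradients(action_output, theta, grad_ys) returns the SUM over the
# batch of grad_ys[i] * d a_i / d theta.  Here d a_i / d theta = s_i.
summed = np.sum(dq_da * states)                      # grad_ys = dq_da
mean_grad = np.sum((dq_da / BATCH_SIZE) * states)    # grad_ys = dq_da / BATCH_SIZE

# One division gives the batch mean:
assert np.isclose(mean_grad, summed / BATCH_SIZE)

# Dividing a second time scales the update down by another BATCH_SIZE,
# silently shrinking the effective learning rate:
double_divided = mean_grad / BATCH_SIZE
print(summed, mean_grad, double_divided)  # 5.0 1.25 0.3125
```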
Hi,
I'm still embedding this into a bigger project. I can probably give you feedback on whether it's working in a week or so.
The paper suggests using batch norm ("We also report results with components of our algorithm (i.e. the target network or batch normalization) removed. In order to perform …").
Have fixed the bugs and added batch norm on the actor network!
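For reference, training-mode batch normalization normalizes each feature with the batch's own statistics, then applies a learned scale and shift. A minimal numpy sketch (function name and values are illustrative, not from this repo):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # Training-mode batch norm: normalize each feature over the batch,
    # then scale by gamma and shift by beta.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
out = batch_norm(x, gamma=np.ones(2), beta=np.zeros(2))

# Each feature column now has approximately zero mean and unit variance.
print(out.mean(axis=0), out.var(axis=0))
```

At inference time a real implementation would use running averages of the batch statistics instead of the per-batch ones.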
Hi,
Why did you include the minus sign in the `grad_ys` argument of the function below?

`self.parameters_gradients = tf.gradients(self.action_output, self.parameters, -self.q_gradient_input/BATCH_SIZE)`

As far as I understand, `grad_ys` weights the gradient of each of the actor outputs by the corresponding value (coming from the critic, in your case). Thanks!
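The likely reason for the minus sign: `tf.gradients(ys, xs, grad_ys)` computes a vector-Jacobian product, and the DDPG actor wants to *ascend* Q, while stock optimizers *descend*. Feeding `-dQ/da` as `grad_ys` makes the optimizer's descent step equivalent to ascent on Q. A numpy sketch of the chain rule with a hypothetical one-layer linear actor and made-up values:

```python
import numpy as np

# Hypothetical setup: actions a = W @ s for a single state s, and the
# critic supplies dQ/da.  By the chain rule the actor gradient is
#   dQ/dW = outer(dQ/da, s)   (a vector-Jacobian product, which is what
#   tf.gradients(a, W, grad_ys=dQ/da) computes for a linear layer).
s = np.array([1.0, -2.0])
W = np.array([[0.5, 0.0],
              [0.3, 0.1]])
dq_da = np.array([1.0, 2.0])        # made-up critic gradient

ascent_grad = np.outer(dq_da, s)    # gradient of Q w.r.t. W

# Optimizers apply W -= lr * grad, i.e. they MINIMIZE.  Passing
# grad_ys = -dq_da flips the sign so the same descent step climbs Q:
descent_grad = np.outer(-dq_da, s)

lr = 0.01
W_updated = W - lr * descent_grad   # identical to W + lr * ascent_grad
assert np.allclose(W_updated, W + lr * ascent_grad)
print(W_updated)
```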