Understanding the cost calculation for the quadrotor #203
Hello, I am currently trying to understand exactly how the cost is calculated for the quadrotor. I am unable to reproduce the calculation from the trajectory data I get from the executed experiment. I ran the 3D Quadrotor PID trajectory tracking example with the following code added at the end of the run function:

    traj = trajs_data["obs"][0]
    goal = env.X_GOAL
    act = trajs_data["action"][0]
    error = 0
    for i in range(len(traj)):
        x = traj[i]
        x_ref = goal[i]
        error -= env.Q[0,0] * (0.5 * ((x - x_ref).T @ (x - x_ref)))
    for i in range(len(act)):
        u = act[i]
        u = np.clip(u, env.physical_action_bounds[0], env.physical_action_bounds[1])
        error -= env.R[0,0] * (0.5 * ((u - env.U_GOAL).T @ (u - env.U_GOAL)))
    print(f"Self-calculated error {error} vs. avg. return {metrics['average_return']}")

The self-calculated value does not match the reported average return, so I seem to be missing something. I have no experience with CasADi, so I am struggling to work out from the code where the difference comes from 😅 Could you give me a pointer?
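For comparison, here is a minimal NumPy sketch of the same sum written with full weight matrices instead of the scalars `Q[0,0]` and `R[0,0]`. It assumes the cost has the usual quadratic form `0.5 * e.T @ W @ e` per time step. The function name `quadratic_return` and the toy data are hypothetical, and this is not the gym's own implementation. The scalar version in the snippet above gives the same number only when `Q` and `R` are multiples of the identity, which may or may not be what causes the mismatch here.

```python
import numpy as np


def quadratic_return(obs, obs_ref, act, act_ref, Q, R):
    """Negative sum of 0.5 * e^T W e over all states and actions.

    Assumption: a generic quadratic tracking cost, not the gym's own code.
    """
    obs_err = np.asarray(obs, dtype=float) - np.asarray(obs_ref, dtype=float)
    act_err = np.asarray(act, dtype=float) - np.asarray(act_ref, dtype=float)
    # einsum evaluates e_t^T W e_t for every time step t and sums the results.
    state_cost = 0.5 * np.einsum("ti,ij,tj->", obs_err, Q, obs_err)
    action_cost = 0.5 * np.einsum("ti,ij,tj->", act_err, R, act_err)
    return -float(state_cost + action_cost)


# Toy data (hypothetical): two time steps, two states, one input.
obs = np.array([[1.0, 0.0], [0.0, 2.0]])
obs_ref = np.zeros((2, 2))
act = np.array([[1.0], [3.0]])
act_ref = np.array([1.0])
Q = np.diag([1.0, 2.0])  # unequal diagonal weights
R = np.array([[0.5]])

print(quadratic_return(obs, obs_ref, act, act_ref, Q, R))  # -5.5
```

With these toy weights, scaling the squared error norm by `Q[0,0]` alone would give -3.5 instead of -5.5, because the second state is weighted twice as heavily as the first.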
Replies: 1 comment 1 reply
Hi @Kilidsch, thank you for using our gym! It seems you have found a major error, and I am surprised we never caught it. Perhaps something changed in one of our dependencies. Regardless, I have opened a PR to fix it: #204. Thank you so much for finding this and letting us know! We will try to merge this PR ASAP.