Hello,
How should I plot the score curves shown in the PPO paper? Specifically, how should I handle the case where the sample pool is full but the episode is not over yet? If I end the episode at that point, I cannot measure the score on tasks where the agent needs more than Horizon (T) interactions with the environment to finish an episode, such as Walker2d-v1. But if I let the episode continue after the sample pool is full, the number of samples may exceed Horizon (T).
I am not sure how to resolve this. How did you handle it in the PPO experiments? This matters a lot to me. Thanks for your help.
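To make the question concrete, here is a minimal sketch of one option I can think of: keep each batch at exactly Horizon (T) samples, but do not reset the environment at the batch boundary, and track episode scores separately from the batches. `ToyEnv` and `collect_segments` are hypothetical names made up for this illustration (a toy stand-in, not Walker2d-v1), and I do not know whether this matches what was done in the paper's experiments.

```python
class ToyEnv:
    """Hypothetical toy environment: fixed-length episodes, reward 1.0 per step."""

    def __init__(self, episode_len):
        self.episode_len = episode_len
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0  # initial observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_len
        return float(self.t), 1.0, done  # (observation, reward, done)


def collect_segments(env, horizon, num_segments):
    """Collect batches of exactly `horizon` samples each.

    The environment is NOT reset when a batch fills up, so an episode
    can span several batches. Episode scores are accumulated separately
    and recorded only when an episode actually ends.
    """
    obs = env.reset()
    ep_return = 0.0
    finished_returns = []  # scores of completed episodes (for the curve)
    segments = []
    for _ in range(num_segments):
        batch = []
        for _ in range(horizon):
            action = 0  # placeholder for the policy's action
            next_obs, reward, done = env.step(action)
            batch.append((obs, action, reward, done))
            ep_return += reward
            obs = next_obs
            if done:
                finished_returns.append(ep_return)
                ep_return = 0.0
                obs = env.reset()
        segments.append(batch)
    return segments, finished_returns


# Episodes last 5 steps, but each batch holds only 3 samples.
segments, returns = collect_segments(ToyEnv(episode_len=5), horizon=3, num_segments=4)
print([len(s) for s in segments], returns)
```

In this sketch every batch has exactly T samples, and the scores in `returns` come from full episodes even though those episodes are longer than T. What it leaves out is how to compute advantages for the last sample of a batch that is cut mid-episode; presumably that needs a value estimate for the final state.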