import gym

class RewardScaler(gym.RewardWrapper):
    """
    Bring rewards to a reasonable scale for PPO. This is incredibly important
    and affects performance a lot.
    """
    def reward(self, reward):
        # Scale every reward by a constant factor before the agent sees it.
        return reward * 0.01
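For reference, this is applied like any other gym wrapper. A minimal usage sketch, assuming a standard gym environment and the classic gym step API (the env ID here is just a stand-in, not from this issue):

import gym

env = gym.make("CartPole-v1")   # any environment; CartPole is a stand-in
env = RewardScaler(env)         # rewards the agent sees are now 100x smaller

obs = env.reset()
obs, reward, done, info = env.step(env.action_space.sample())
print(reward)                   # scaled reward, e.g. 0.01 instead of 1.0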
Is there a good explanation for this? I would have thought it should converge regardless. Why does it affect convergence speed so much, and how did you even find this to be a problem? How can I tell whether my rewards (on a different game) are too large and need to be scaled? Are there good indicators for that besides performance, which could be affected by any number of parameters?
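One empirical check (my own sketch, not something from this thread): sample a few random rollouts and look at the raw reward statistics before training at all. If per-step rewards or episode returns are far from roughly unit scale, scaling or normalization is worth trying. A minimal sketch, assuming a gym env with the classic step API:

import gym
import numpy as np

def reward_scale_report(env, episodes=10):
    """Collect rewards from random rollouts and report their magnitude."""
    rewards = []
    for _ in range(episodes):
        env.reset()
        done = False
        while not done:
            _, r, done, _ = env.step(env.action_space.sample())
            rewards.append(r)
    rewards = np.asarray(rewards)
    # Rough heuristic (an assumption, not a hard rule): PPO-style setups
    # tend to behave best when per-step rewards are on the order of 1 or
    # less; a large mean/std here suggests scaling is worth trying.
    print(f"mean={rewards.mean():.3f} std={rewards.std():.3f} "
          f"max|r|={np.abs(rewards).max():.3f}")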