The random seed doesn't work #33

xffxff · 2018-11-13T04:39:13Z

Even when I set the same random seed, the results differ between runs; you can reproduce this with ddpg. I think tf.set_random_seed(seed) doesn't work, but I don't know how to fix it.

machinaut · 2018-11-13T19:05:41Z

Can you give an example with what you expected to happen and what actually happened?

If TensorFlow is failing to set a seed, maybe you should file a bug on TensorFlow.

Gym environments and NumPy (for example) are seeded separately, and you should not expect seeding TensorFlow to affect either of them.
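
To illustrate the point about independent seeding, here is a minimal sketch using only the standard library. The two `random.Random` instances are stand-ins (an assumption made for illustration, not the actual TensorFlow, NumPy, or Gym generators) for separate RNG sources; seeding one leaves the other untouched.

```python
import random

def draw(rng, n=3):
    """Return n floats from the given generator."""
    return [rng.random() for _ in range(n)]

def run(seed_a, seed_b=None):
    # Two independent generators: hypothetical stand-ins for, e.g.,
    # the TensorFlow RNG (a) and the NumPy/Gym RNG (b).
    rng_a = random.Random(seed_a)
    rng_b = random.Random(seed_b)  # seed_b=None -> seeded from OS entropy
    return draw(rng_a), draw(rng_b)

# Only source "a" is seeded.
a1, b1 = run(seed_a=2)
a2, b2 = run(seed_a=2)
print(a1 == a2)  # the seeded source repeats
print(b1 == b2)  # the unseeded source almost certainly does not

# Both sources are seeded.
a3, b3 = run(seed_a=2, seed_b=2)
a4, b4 = run(seed_a=2, seed_b=2)
print((a3, b3) == (a4, b4))  # the whole run repeats
```

A run is only reproducible once every source of randomness it touches has been seeded, which is why a single call to tf.set_random_seed(seed) is not enough on its own.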

jachiam · 2018-11-13T19:25:18Z

Ooh, I may not have seeded Gym envs. My bad. Will look into getting this working---it's possible that Gym isn't the only other source of nondeterminism (they can be hard to track down).

xffxff · 2018-11-14T00:06:03Z

@machinaut @jachiam, I ran the command python ddpg.py -s 2 twice. The first time I got AverageTestEpRet=-594 in epoch one, but the second time I got AverageTestEpRet=-314. I also tried setting

    env.seed(seed)
    test_env.seed(seed)

but the AverageTestEpRet of epoch one was still different.

jachiam · 2018-11-15T09:01:05Z

@xffxff, hmm, that's a bit unfortunate. I appreciate that you tried this out.

I am not fully sure what could be going wrong here. My suspicion is that it might involve the Python hash seed used to prevent dict collision attacks. See here for an explanation of the issue, and see here for more info. Can you try export PYTHONHASHSEED=0 and then try running your experiment again?
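
For anyone who wants to see the effect of the hash seed in isolation, here is a minimal sketch using only the standard library. The helper name `str_hash_in_child` and the test string are made up for illustration. It starts fresh interpreters because PYTHONHASHSEED is read at interpreter startup; setting it in os.environ from inside an already running script does not change that process's hashing.

```python
import os
import subprocess
import sys

def str_hash_in_child(hash_seed):
    """Run a fresh interpreter and return hash('spinup') as seen there."""
    env = dict(os.environ)
    if hash_seed is None:
        env.pop("PYTHONHASHSEED", None)  # default: randomized per process
    else:
        env["PYTHONHASHSEED"] = str(hash_seed)
    out = subprocess.run(
        [sys.executable, "-c", "print(hash('spinup'))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(out.stdout.strip())

# With a fixed hash seed, every process computes the same string hash.
fixed = [str_hash_in_child(0) for _ in range(2)]
print(fixed[0] == fixed[1])

# Without it, the hash usually changes from process to process.
randomized = [str_hash_in_child(None) for _ in range(2)]
print(randomized[0] == randomized[1])
```

String hashes affect the iteration order of sets of strings, so any code that iterates over such a set can behave differently between runs unless the hash seed is fixed.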

jachiam · 2018-11-19T03:51:29Z

@xffxff I tried out export PYTHONHASHSEED=0, in addition to setting env seed and test_env seed, and did two runs of DDPG with python -m spinup.run ddpg --hid [32] --env HalfCheetah-v2 --steps_per_epoch 1000.

Results from first run:

---------------------------------------
|             Epoch |               1 |
|      AverageEpRet |            -260 |
|          StdEpRet |               0 |
|          MaxEpRet |            -260 |
|          MinEpRet |            -260 |
|  AverageTestEpRet |            -533 |
|      StdTestEpRet |            3.39 |
|      MaxTestEpRet |            -525 |
|      MinTestEpRet |            -536 |
|             EpLen |           1e+03 |
|         TestEpLen |           1e+03 |
| TotalEnvInteracts |           1e+03 |
|      AverageQVals |           0.508 |
|          StdQVals |            1.17 |
|          MaxQVals |            5.88 |
|          MinQVals |           -6.88 |
|            LossPi |           -1.39 |
|             LossQ |           0.964 |
|              Time |            6.86 |
---------------------------------------

Results from second run:

---------------------------------------
|             Epoch |               1 |
|      AverageEpRet |            -260 |
|          StdEpRet |               0 |
|          MaxEpRet |            -260 |
|          MinEpRet |            -260 |
|  AverageTestEpRet |            -533 |
|      StdTestEpRet |            4.89 |
|      MaxTestEpRet |            -526 |
|      MinTestEpRet |            -541 |
|             EpLen |           1e+03 |
|         TestEpLen |           1e+03 |
| TotalEnvInteracts |           1e+03 |
|      AverageQVals |           0.508 |
|          StdQVals |            1.17 |
|          MaxQVals |            5.88 |
|          MinQVals |           -6.88 |
|            LossPi |           -1.39 |
|             LossQ |           0.964 |
|              Time |              16 |
---------------------------------------

Looks like this solves the issue. I'm going to mark this as closed.

jachiam · 2018-11-19T11:38:27Z

Scratch that, double-checking and it looks like things diverge after Epoch 1. Don't know where this nondeterminism is coming from.
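
One way to pin down where two runs part ways is to diff the logged metrics epoch by epoch while ignoring wall-clock Time. The sketch below is only an illustration: `diff_metrics` is a hypothetical helper, and the dictionaries hold a subset of the values copied from the two Epoch 1 tables above.

```python
# Values copied from the two Epoch 1 tables above (subset of the rows).
run1 = {"AverageEpRet": -260, "AverageTestEpRet": -533, "StdTestEpRet": 3.39,
        "MaxTestEpRet": -525, "MinTestEpRet": -536, "LossPi": -1.39,
        "LossQ": 0.964, "Time": 6.86}
run2 = {"AverageEpRet": -260, "AverageTestEpRet": -533, "StdTestEpRet": 4.89,
        "MaxTestEpRet": -526, "MinTestEpRet": -541, "LossPi": -1.39,
        "LossQ": 0.964, "Time": 16}

def diff_metrics(a, b, ignore=("Time",)):
    """Return {key: (a_val, b_val)} for every metric that differs."""
    keys = [k for k in a if k not in ignore]
    return {k: (a[k], b.get(k)) for k in keys if a[k] != b.get(k)}

for key, (v1, v2) in diff_metrics(run1, run2).items():
    print(f"{key}: {v1} vs {v2}")
```

On the Epoch 1 tables this reports StdTestEpRet, MaxTestEpRet and MinTestEpRet, so the test-episode spread already differs in the first epoch even though the averages match. The logged values are rounded, so a check like this is coarse and can miss small differences.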

jachiam · 2018-11-19T12:14:13Z

With env seed setting and export PYTHONHASHSEED=0, TRPO/PPO/VPG are deterministic through at least the first three epochs. I have no idea why DDPG would be different.

Skalwalker · 2020-07-01T16:00:46Z

Is this issue happening in both TensorFlow and PyTorch? Have the operation-level seeds been properly set?

Jianengzhang · 2020-11-12T08:13:16Z

With the following env seed settings

    env.seed(seed)
    env.action_space.seed(seed)
    test_env.seed(seed)
    test_env.action_space.seed(seed)

the results are the same across runs that use the same seed.
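
The four calls above can be wrapped in a small helper. This is a sketch, not code from the repository: `seed_envs`, `FakeEnv` and `FakeSpace` are hypothetical names, and the fake classes exist only so the example runs without Gym installed. It assumes the older Gym interface in which both the environment and its action space expose a seed(seed) method, as in the snippet above.

```python
import random

def seed_envs(seed, *envs):
    """Seed each env and its action space, mirroring the calls above."""
    for env in envs:
        env.seed(seed)
        env.action_space.seed(seed)

# Hypothetical stand-ins for a Gym env and its action space.
class FakeSpace:
    def __init__(self):
        self._rng = random.Random()  # the space's own RNG

    def seed(self, seed):
        self._rng.seed(seed)

    def sample(self):
        return self._rng.uniform(-1.0, 1.0)

class FakeEnv:
    def __init__(self):
        self.action_space = FakeSpace()
        self._rng = random.Random()  # separate from the space's RNG

    def seed(self, seed):
        self._rng.seed(seed)  # does not touch action_space

env, test_env = FakeEnv(), FakeEnv()
seed_envs(2, env, test_env)
first = [env.action_space.sample() for _ in range(3)]

seed_envs(2, env, test_env)
second = [env.action_space.sample() for _ in range(3)]
print(first == second)  # sampled actions repeat once the space is seeded
```

A plausible reason the extra calls matter (an assumption, not confirmed in this thread) is that actions drawn with action_space.sample() come from the space's own generator, which env.seed(seed) alone may not reset.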

jachiam closed this as completed Nov 19, 2018

jachiam reopened this Nov 19, 2018

jachiam mentioned this issue Dec 13, 2018

'--cpu' flag causes IndexError: list index out of range #68

Open
