'--cpu' flag causes IndexError: list index out of range #68
Comments
I experienced a similar problem: when every other parameter has only one setting, the number of CPUs seems to determine the number of variants. With cpu=4, 4 variants are run, but each has the same hyperparameters. This looks like a bug.
Hi @sjbaines. This error occurs because of a particular interaction between batch size, episode length, and number of parallel processes. Each process tries to separately collect [...] I am currently planning a patch to deal with a backlog of issues, and I will try to include a fix for this in the patch.
Nondeterminism has two causes: the Python hash seed (which you need to set with an environment variable) and an error on my part in not setting env seeds. See #33 for details. A fix for this will also be included in the coming patch.
@watchernyu I am not sure what you mean by running 4 different variants with the same hyperparameters for cpu=4. Can you clarify, and/or open a separate issue with minimal code to replicate?
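The hash-seed point can be checked in isolation. The sketch below is only an illustration, not part of Spinning Up: `hash_in_subprocess` is a hypothetical helper that starts a fresh interpreter with the standard `PYTHONHASHSEED` environment variable set and returns the hash of a string. It does not cover env seeding, which is the second cause mentioned above.

```python
import os
import subprocess
import sys


def hash_in_subprocess(text, hash_seed):
    """Hypothetical helper: hash `text` in a fresh interpreter.

    PYTHONHASHSEED must be set before the interpreter starts, so the
    value is passed through the child's environment.
    """
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    # With a fixed hash seed, two separate interpreter runs agree.
    first = hash_in_subprocess("spinup", 0)
    second = hash_in_subprocess("spinup", 0)
    print(first == second)
```

In a shell, the equivalent is to export `PYTHONHASHSEED` before launching the training command.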
Sorry for the vague description. I think it might be a different issue, so let me open a new one.
I was running into this issue with PPO. Basically, the "steps_per_epoch" parameter must be at least as big as the max episode length of the environment. For CartPole-v1, this is 500, so "steps_per_epoch" must be at least 500. The default setting in test_ppo.py is 100, which is not enough.
I just stumbled on this same problem and can confirm the reason is what @jachiam guessed above. I think, however, that the workaround suggested by @richardrl (thanks!) is ONLY correct for non-parallel training. The actual number of steps in the main loop is [...] So what has worked for me (so far) is to set [...]
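The reasoning in the two comments above can be sketched as a small calculation. This assumes that each process collects "steps_per_epoch" divided by the number of processes, which is my reading of the thread rather than something the comments state in full. Both function names are hypothetical and are not part of spinup.

```python
def local_steps_per_epoch(steps_per_epoch: int, num_cpu: int) -> int:
    """Steps each process collects per epoch (assumed to be an even split)."""
    return steps_per_epoch // num_cpu


def min_steps_per_epoch(max_ep_len: int, num_cpu: int) -> int:
    """Smallest steps_per_epoch that gives every process at least max_ep_len steps."""
    return max_ep_len * num_cpu


if __name__ == "__main__":
    max_ep_len = 500  # CartPole-v1, as quoted in the comment above
    num_cpu = 4

    needed = min_steps_per_epoch(max_ep_len, num_cpu)
    print(needed)                                  # 2000
    print(local_steps_per_epoch(needed, num_cpu))  # 500

    # The test_ppo.py default quoted above, split over 4 processes,
    # leaves each process far short of one full episode.
    print(local_steps_per_epoch(100, num_cpu))     # 25
```

Under this assumption, a setting that is safe with --cpu 1 can still be too small once the steps are split across several processes.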
Run:
python -m spinup.run ppo --hid [32,32] --env LunarLander-v2 --exp_name installtest --gamma 0.999 --cpu 12 --seed 42
After a random number of epochs, 'IndexError: list index out of range' occurs:
File "/home/steve/spinningup/spinup/utils/logx.py", line 321, in log_tabular
vals = np.concatenate(v) if isinstance(v[0], np.ndarray) and len(v[0].shape)>0 else v
IndexError: list index out of range
Despite passing --seed, this is not deterministic, but always seems to happen within the first ~20 epochs.
The problem appears to be that v is [], hence the attempt to access v[0] fails.
--cpu auto has the same problem.
Only --cpu 1 seems to be safe.
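The empty-list failure described above can be reproduced outside the logger. The sketch below wraps the expression from the traceback in a hypothetical helper, `flatten_logged_values`, with a guard for the empty case. It is an illustration of the failure mode only, not the fix that went into logx.py, and returning an empty array for an empty input is an assumption about what a caller would want.

```python
import numpy as np


def flatten_logged_values(v):
    """Hypothetical guarded version of the expression in the traceback.

    Returns an empty array when nothing was logged, instead of
    raising IndexError on v[0].
    """
    if len(v) == 0:
        return np.array([])
    if isinstance(v[0], np.ndarray) and len(v[0].shape) > 0:
        return np.concatenate(v)
    return v


if __name__ == "__main__":
    # The unguarded access fails exactly as in the report.
    try:
        [][0]
    except IndexError as err:
        print(err)  # list index out of range

    print(len(flatten_logged_values([])))                               # 0
    print(flatten_logged_values([np.array([1, 2]), np.array([3])]))     # [1 2 3]
```

A guard like this only hides the symptom. If no values were stored for a key during an epoch, any statistics computed from the result would still be undefined.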