
'--cpu' flag causes IndexError: list index out of range #68

Open
sjbaines opened this issue Dec 7, 2018 · 6 comments
Comments

@sjbaines

sjbaines commented Dec 7, 2018

Run:
python -m spinup.run ppo --hid [32,32] --env LunarLander-v2 --exp_name installtest --gamma 0.999 --cpu 12 --seed 42

After a random number of epochs, 'IndexError: list index out of range' occurs:
File "/home/steve/spinningup/spinup/utils/logx.py", line 321, in log_tabular
vals = np.concatenate(v) if isinstance(v[0], np.ndarray) and len(v[0].shape)>0 else v
IndexError: list index out of range

Despite passing --seed, this is not deterministic, but always seems to happen within the first ~20 epochs.
The problem appears to be that v is [], hence the attempt to access v[0] fails.

--cpu auto also has the problem
Only --cpu 1 seems to be safe.
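The empty-list failure is easy to reproduce in isolation. Below is a minimal sketch: `aggregate` mirrors the failing line from `logx.py` quoted above, and `aggregate_safe` is a hypothetical guarded variant (my own illustration, not the upstream fix):

```python
import numpy as np

# Mirrors the failing line in spinup/utils/logx.py log_tabular:
# v[0] is evaluated before anything can short-circuit, so an empty
# per-process stats list raises IndexError.
def aggregate(v):
    return np.concatenate(v) if isinstance(v[0], np.ndarray) and len(v[0].shape) > 0 else v

# Hypothetical defensive variant: return the empty list untouched.
def aggregate_safe(v):
    if len(v) == 0:
        return v
    return np.concatenate(v) if isinstance(v[0], np.ndarray) and len(v[0].shape) > 0 else v

try:
    aggregate([])            # empty EpRet/EpLen from a process
except IndexError as e:
    print("IndexError:", e)  # reproduces the reported crash

print(aggregate_safe([]))    # -> []
```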

@watchernyu

I also experienced a similar problem: even when every other parameter setting has only one variant, the number of CPUs seems to determine the number of variants. So cpu=4 runs 4 different variants, each with the same hyperparameters. This looks like a bug.

@jachiam
Contributor

jachiam commented Dec 13, 2018

Hi @sjbaines. This error occurs because of a particular interaction between batch size, episode length, and number of parallel processes.

Each process separately collects int(batch_size / n_cpu) steps of interaction with the environment. If this is shorter than a single episode, the logger gets an empty list for EpRet or EpLen in that process, and then fails when it tries to aggregate stats across processes. That is the nature of the error. (It doesn't happen in the first few epochs because the RL agent plays so badly that its episodes terminate early and never get that long.) As long as you pick n_cpu and batch_size so that the batch size per process is always at least the maximum episode length, you will not get this error.
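The arithmetic in this explanation can be sketched directly. Assuming spinup's default steps_per_epoch of 4000 for PPO and LunarLander-v2's 1000-step time limit (both values are my recollection of the defaults, not taken from this thread):

```python
# Per-process batch size, as described above.
def local_steps(steps_per_epoch, n_cpu):
    return int(steps_per_epoch / n_cpu)

per_proc = local_steps(4000, 12)  # the reporter used --cpu 12
max_ep_len = 1000                 # assumed LunarLander-v2 time limit

# If a single episode can outlast a process's whole local batch, that
# process may finish an epoch with zero completed episodes, leaving
# EpRet/EpLen empty and triggering the IndexError.
print(per_proc)                   # 333
print(per_proc < max_ep_len)      # True -> crash is possible
```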

I am currently planning a patch for dealing with a backlog of issues, and I will try to include a fix for this in the patch.

Nondeterminism is due to two causes: Python hash seed (which you need to set with an environment variable) and an error on my part for not setting env seeds. See #33 for details. A fix for this will also be included in the coming patch.

@watchernyu I am not sure what you mean by running 4 different variants with the same hyperparameters for cpu=4. Can you clarify, and/or open a separate issue with minimal code to replicate?

@watchernyu

Sorry for the vague description. I think it might be a different issue; let me open a new one.

@menip

menip commented Dec 19, 2018

python -m spinup.run ppo --env Breakout-ram-v0 --exp_name test --num_cpu auto
Gives the same IndexError as described above.

@richardrl

richardrl commented Apr 9, 2021

I was running into this issue with PPO. Basically, the steps_per_epoch parameter must be at least as large as the max episode length of the environment. For CartPole-v1 that is 500, so steps_per_epoch must be at least 500.

The default setting in test_ppo.py is 100, which is not enough.

@Alberto-Hache

Alberto-Hache commented Apr 5, 2022

I just stumbled on this same problem and can confirm the reason is what @jachiam guessed above.

I think, however, that the workaround suggested by @richardrl (thanks!) is ONLY correct for non-parallel training. The actual number of steps in each process's main loop is not steps_per_epoch but local_steps_per_epoch, calculated as local_steps_per_epoch = int(steps_per_epoch / num_procs()). If you set steps_per_epoch to 4000 and `--num_cpu` is, say, 4, each of the four processes will run for only 1000 steps per epoch.

So what has worked for me (so far) is to set steps_per_epoch to at least max_ep_len times num_procs(), e.g.:
max_ep_len = 2000
--cpu = 4
steps_per_epoch >= 8000
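The sizing rule above can be checked with a couple of lines (a sketch of the rule itself, not spinup code; `min_steps_per_epoch` is my own helper name):

```python
# Each process's local batch must fit at least one full episode.
def min_steps_per_epoch(max_ep_len, n_cpu):
    return max_ep_len * n_cpu

max_ep_len = 2000
n_cpu = 4
required = min_steps_per_epoch(max_ep_len, n_cpu)
print(required)  # 8000

# Verify the chosen value satisfies the per-process condition.
steps_per_epoch = 8000
assert int(steps_per_epoch / n_cpu) >= max_ep_len
```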
