Run Discrete Benchmark

Step 1: Set up MuJoCo (skip this step if you have already completed the virtual environment tutorial)

To run the virtual environment, you need to set up MuJoCo.

Download the MuJoCo version 2.1 binaries for Linux or OSX.
Extract the downloaded mujoco210 directory into ~/.mujoco/mujoco210.
Install and use mujoco-py.

pip install -U 'mujoco-py<2.2,>=2.1'
pip install -e ./mujuco_environment
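
Before testing mujoco-py itself, it can help to confirm that the binaries were extracted to the location named above. The following is a minimal sketch, not part of the repository; the helper name `find_mujoco_dir` and its `home` parameter are hypothetical and exist only for illustration.

```python
from pathlib import Path


def find_mujoco_dir(home=None):
    """Return the expected MuJoCo 2.1 directory if it exists, else None.

    `home` is a hypothetical override for the home directory (illustration only).
    """
    base = Path(home) if home is not None else Path.home()
    candidate = base / ".mujoco" / "mujoco210"
    return candidate if candidate.is_dir() else None


if __name__ == "__main__":
    found = find_mujoco_dir()
    print(found if found else "mujoco210 not found under ~/.mujoco")
```

If the function returns None, re-check the extraction step before debugging mujoco-py.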

We highly recommend verifying that MuJoCo works by running the test examples in mujoco-py. In most cases, it is enough to run:

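import mujoco_py
import os

# Locate the MuJoCo installation (normally ~/.mujoco/mujoco210).
mj_path = mujoco_py.utils.discover_mujoco()
xml_path = os.path.join(mj_path, 'model', 'humanoid.xml')
model = mujoco_py.load_model_from_path(xml_path)
# Building the simulation and stepping it once confirms the bindings work.
sim = mujoco_py.MjSim(model)
sim.step()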

Step 2 (optional): Train expert agents.

Note that we have a total of 4 different settings.

# change into the directory containing the "main" files.
cd ./interface/

# run PPO-Lag knowing the ground-truth
python train_policy.py ../config/mujoco_WGW-v0/train_ppo_lag_WGW-v0-setting1.yaml -n 5 -s 123
python train_policy.py ../config/mujoco_WGW-v0/train_ppo_lag_WGW-v0-setting2.yaml -n 5 -s 123
python train_policy.py ../config/mujoco_WGW-v0/train_ppo_lag_WGW-v0-setting3.yaml -n 5 -s 123
python train_policy.py ../config/mujoco_WGW-v0/train_ppo_lag_WGW-v0-setting4.yaml -n 5 -s 123
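
The four commands above differ only in the setting number, so they can be generated and launched in sequence. The sketch below is an illustration rather than a script shipped with the repository: `expert_commands` and `run_all` are hypothetical helpers, the flags `-n 5 -s 123` are copied verbatim from the commands above, and the script is assumed to be started from ./interface/.

```python
import subprocess

SETTINGS = (1, 2, 3, 4)


def expert_commands(settings=SETTINGS):
    """Build the PPO-Lag training commands listed above, one per setting."""
    return [
        ["python", "train_policy.py",
         f"../config/mujoco_WGW-v0/train_ppo_lag_WGW-v0-setting{i}.yaml",
         "-n", "5", "-s", "123"]
        for i in settings
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure


if __name__ == "__main__":
    run_all(expert_commands())  # dry run by default: prints, does not train
```

Calling `run_all(expert_commands(), dry_run=False)` would start the four training runs one after another.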

Step 3: Run the ICRL algorithms with a neural network approximating the constraint function.

Note that:

This step reproduces the results in Section 6.2 of our paper.
We have a total of 4 different settings.
Random seeds are not required since the environments and models are deterministic.

# change into the directory containing the "main" files.
cd ./interface/

# run GACL
python train_gail.py ../config/mujoco_WGW-v0/train_GAIL_WGW-v0-setting1.yaml -n 5
python train_gail.py ../config/mujoco_WGW-v0/train_GAIL_WGW-v0-setting2.yaml -n 5
python train_gail.py ../config/mujoco_WGW-v0/train_GAIL_WGW-v0-setting3.yaml -n 5
python train_gail.py ../config/mujoco_WGW-v0/train_GAIL_WGW-v0-setting4.yaml -n 5

# run BC2L
python train_icrl.py ../config/mujoco_WGW-v0/train_Binary_WGW-v0-setting1.yaml -n 5
python train_icrl.py ../config/mujoco_WGW-v0/train_Binary_WGW-v0-setting2.yaml -n 5
python train_icrl.py ../config/mujoco_WGW-v0/train_Binary_WGW-v0-setting3.yaml -n 5 
python train_icrl.py ../config/mujoco_WGW-v0/train_Binary_WGW-v0-setting4.yaml -n 5

# run MECL
python train_icrl.py ../config/mujoco_WGW-v0/train_ICRL_WGW-v0-setting1.yaml -n 5
python train_icrl.py ../config/mujoco_WGW-v0/train_ICRL_WGW-v0-setting2.yaml -n 5
python train_icrl.py ../config/mujoco_WGW-v0/train_ICRL_WGW-v0-setting3.yaml -n 5
python train_icrl.py ../config/mujoco_WGW-v0/train_ICRL_WGW-v0-setting4.yaml -n 5

# run VICRL
python train_icrl.py ../config/mujoco_WGW-v0/train_VICRL_WGW-v0-setting1.yaml -n 5
python train_icrl.py ../config/mujoco_WGW-v0/train_VICRL_WGW-v0-setting2.yaml -n 5
python train_icrl.py ../config/mujoco_WGW-v0/train_VICRL_WGW-v0-setting3.yaml -n 5
python train_icrl.py ../config/mujoco_WGW-v0/train_VICRL_WGW-v0-setting4.yaml -n 5
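
The sixteen commands above follow one pattern: each method name maps to a training script and a tag inside the config file name. A minimal sketch of that mapping follows; the `METHODS` table and `benchmark_command` are hypothetical names introduced here, while the script names, config tags, and `-n 5` flag are copied from the commands above.

```python
# method name -> (training script, config-file tag), copied from the commands above
METHODS = {
    "GACL": ("train_gail.py", "GAIL"),
    "BC2L": ("train_icrl.py", "Binary"),
    "MECL": ("train_icrl.py", "ICRL"),
    "VICRL": ("train_icrl.py", "VICRL"),
}


def benchmark_command(method, setting):
    """Return the command line for one method and one setting (1 to 4)."""
    script, tag = METHODS[method]
    config = f"../config/mujoco_WGW-v0/train_{tag}_WGW-v0-setting{setting}.yaml"
    return f"python {script} {config} -n 5"


if __name__ == "__main__":
    # Print every method/setting combination, in the order used above.
    for method in METHODS:
        for setting in range(1, 5):
            print(benchmark_command(method, setting))
```

Note that only GACL uses train_gail.py; the other three methods share train_icrl.py and differ in the config file alone.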

Step 3.5 (optional): Run the ICRL algorithms with the constraint set.

python train_icrl.py ../config/mujoco_WGW-discrete-v0/train_ICRL_discrete_WGW-v0-setting1.yaml
