iam-reproduce

This repo reproduces the Influence-Aware Memory (IAM) architecture (https://arxiv.org/abs/1911.07643). It is based on the PyTorch structure of https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail and the paper's source repo, https://github.com/INFLUENCEorg/influence-aware-memory

Run

To run the different scenarios, use the following commands:

cd YOUR_PATH/IAM-Reproduce

Warehouse and Traffic Control

Run FNN8 (32)

python main.py --env-name warehouse --num-steps 8 --recurrent-policy --log-dir ./log_w/
python main.py --env-name traffic --num-steps 32 --recurrent-policy --num-env-steps 2000000 --num-processes 1 --log-dir ./log_tc/

Run FNN1 (10)

python main.py --env-name warehouse --num-steps 1 --recurrent-policy --log-dir ./log_w/
python main.py --env-name traffic --num-steps 10 --recurrent-policy --num-env-steps 2000000 --num-processes 1 --log-dir ./log_tc/

Run GRU only

python main.py --env-name warehouse --num-steps 8 --log-dir ./log_w/
python main.py --env-name traffic --num-steps 32 --num-env-steps 2000000 --num-processes 1 --log-dir ./log_tc/

Run IAM

python main.py --env-name warehouse --num-steps 8 --IAM --log-dir ./log_w/
python main.py --env-name traffic --num-steps 32 --IAM --num-env-steps 2000000 --num-processes 1 --log-dir ./log_tc/

NOTE:

To render the warehouse dynamics, set the variable render_bool to True in warehouse.py and run with a single process (recommended, because every process opens its own render window).
The log_xxx folder stores the monitor files of all processes and a manually written file, mean_rewards_xxx.txt, that records the mean rewards.

The results are saved in ./log (warehouse) and ./log_t (traffic), respectively. To visualize the results, run the following command. An exponentially weighted moving average (EWMA) is used to smooth the collected data.

python plot_results.py
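For readers unfamiliar with EWMA smoothing, here is a minimal sketch of the idea. It is not taken from plot_results.py: the function name `ewma` and the smoothing factor `alpha` are illustrative assumptions, and the script may use a different formulation or parameter.

```python
def ewma(values, alpha=0.1):
    """Exponentially weighted moving average (illustrative, not the repo's code).

    Each smoothed point is alpha * current value + (1 - alpha) * previous
    smoothed point. The first point is used as the initial value.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for v in values:
        if not smoothed:
            # Seed the average with the first observation.
            smoothed.append(float(v))
        else:
            smoothed.append(alpha * v + (1.0 - alpha) * smoothed[-1])
    return smoothed


if __name__ == "__main__":
    mean_rewards = [0.0, 10.0, 10.0, 10.0]  # made-up numbers for illustration
    print(ewma(mean_rewards, alpha=0.5))  # [0.0, 5.0, 7.5, 8.75]
```

A smaller `alpha` gives a smoother curve that reacts more slowly to changes in the mean reward.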

Currently, the mean-reward results look like this:

Atari

To run Atari 'BreakoutNoFrameskip-v4' without flickering, use:

python main.py --env-name BreakoutNoFrameskip-v4 --num-env-steps 4000000 --num-steps 8 --lr 0.00025 --log-dir ./log_fa/ --IAM

The result:

To run Atari 'BreakoutNoFrameskip-v4' with flickering, use:

python main.py --env-name BreakoutNoFrameskip-v4 --num-env-steps 4000000 --num-steps 32 --lr 0.00025 --log-dir ./log_fa/ --IAM --flicker

Results with flickering are still being tested.
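This README does not describe how the --flicker option alters observations. As a rough sketch of the general idea behind flickering Atari, the snippet below blanks a frame with some probability, so the agent has to rely on memory. Everything here is an assumption for illustration: the helper name `flicker`, the default probability of 0.5, and the use of a flat list as the frame. The repo's own implementation may differ.

```python
import random


def flicker(observation, rng, blank_prob=0.5):
    """Return a copy of the frame, or an all-zero frame with probability blank_prob.

    Hypothetical helper: `observation` is a flat list of pixel values and
    `rng` is a random.Random instance, so runs can be made reproducible.
    """
    if rng.random() < blank_prob:
        # The frame is hidden from the agent for this step.
        return [0 for _ in observation]
    return list(observation)


if __name__ == "__main__":
    rng = random.Random(0)
    frame = [12, 200, 34, 7]  # made-up pixel values
    for step in range(5):
        print(step, flicker(frame, rng))
```

Passing the random generator in explicitly keeps the sketch deterministic under a fixed seed, which makes flickering runs easier to compare.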

Work

The custom work in this repo:

Embed three environments (warehouse, traffic control with SUMO, and Atari from Gym) in the PyTorch structure
Design IAMModel.py for the Influence-Aware Memory model, used with A2C
Visualize the results and compare them with the original paper

