
Commit

Merge latest commit
notadamking committed Jul 6, 2019
2 parents 80a55cf + c77f0c5 commit f06dd8f
Showing 19 changed files with 263 additions and 86 deletions.
3 changes: 2 additions & 1 deletion .dockerignore
@@ -5,4 +5,5 @@ research
tensorboard
agents
data/tensorboard
data/agents
data/agents
data/postgres
1 change: 1 addition & 0 deletions .gitignore
@@ -2,6 +2,7 @@
**/__pycache__
data/tensorboard/*
data/agents/*
data/postgres/*
data/log/*
*.pkl
*.db
116 changes: 105 additions & 11 deletions README.md
@@ -21,28 +21,123 @@ https://towardsdatascience.com/using-reinforcement-learning-to-trade-bitcoin-for

# Getting Started

The first thing you will need to do to get started is install the requirements in `requirements.txt`.
### How to find out if you have an nVIDIA GPU

Linux:
```bash
sudo lspci | grep -i --color 'vga\|3d\|2d' | grep -i nvidia
```
If this returns anything, then you should have an nVIDIA card.

### Basic usage

The first thing you will need to do to get started is install the requirements. If your system has an nVIDIA GPU, you should start by using:

```bash
cd "path-of-your-cloned-rl-trader-dir"
pip install -r requirements.txt
```
More information on how to take advantage of your GPU while using Docker: https://github.com/NVIDIA/nvidia-docker


If you have another type of GPU or you simply want to use your CPU, use:

```bash
pip install -r requirements.no-gpu.txt
```

Update the static files that are used by default:
```bash
python ./cli.py update-static
```

Afterwards, you can see the currently available options:

```bash
python ./cli.py --help
```

or simply run the project with default options:

```bash
python ./cli.py opt-train-test
```

If you have a standard set of configs you want to run the trader against, you can specify a config file to load configuration from. Rename `config/config.ini.dist` to `config/config.ini` and run:

```bash
python ./cli.py --from-config config/config.ini opt-train-test
```
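The `--from-config` flow can be sketched with the stdlib `configparser` and `argparse` modules: values from the `[Defaults]` section (e.g. `mini-batches=11`, as in `config/config.ini.dist`) become argparse defaults, so explicit command-line flags still win. This is a minimal sketch of the pattern, not the project's exact implementation, and `--mini-batches` here merely stands in for whichever options the real CLI exposes:

```python
import argparse
import configparser

def parse_with_config(argv):
    """Use [Defaults] from an ini file as argparse defaults; CLI flags still win."""
    pre = argparse.ArgumentParser(add_help=False)
    pre.add_argument('--from-config', default=None)
    known, remaining = pre.parse_known_args(argv)

    defaults = {}
    if known.from_config:
        cfg = configparser.ConfigParser()
        cfg.read(known.from_config)
        if cfg.has_section('Defaults'):
            # ini keys like "mini-batches" map onto argparse dests like "mini_batches";
            # convert numeric strings so typed defaults behave like CLI-parsed values
            defaults = {k.replace('-', '_'): int(v) if v.isdigit() else v
                        for k, v in cfg.items('Defaults')}

    parser = argparse.ArgumentParser(parents=[pre])
    parser.add_argument('--mini-batches', type=int, default=8)
    parser.set_defaults(**defaults)
    return parser.parse_args(remaining)
```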

### Testing with vagrant

Start the Vagrant box using:
```bash
vagrant up
```

The code will be located at `/vagrant`. Play and/or test with whatever package you wish.
Note: with Vagrant you cannot take full advantage of your GPU, so this is mainly for testing purposes.


### Testing with docker

If you want to run everything within a docker container, then just use:
```bash
./run-with-docker (cpu|gpu) (yes|no) opt-train-test
```
- cpu - start the container using CPU requirements
- gpu - start the container using GPU requirements
- yes | no - whether to start a local postgres container

Note: if you use `yes` as the second argument, use

```bash
python ./cli.py --params-db-path "postgres://rl_trader:rl_trader@localhost" opt-train-test
```

The database and its data are persisted under `data/postgres` locally.
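The `--params-db-path` value is a standard database connection URL; its pieces can be inspected with the stdlib `urllib.parse` module (a small illustrative helper, not part of the project):

```python
from urllib.parse import urlparse

def describe_dsn(dsn):
    """Break a database connection URL into its components."""
    parts = urlparse(dsn)
    return {
        'scheme': parts.scheme,              # e.g. "postgres"
        'user': parts.username,
        'password': parts.password,
        'host': parts.hostname,
        'port': parts.port,                  # None when omitted, as here
        'database': parts.path.lstrip('/') or None,
    }

print(describe_dsn('postgres://rl_trader:rl_trader@localhost'))
```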

If you want to spin up a docker test environment:
```bash
./run-with-docker (cpu|gpu) (yes|no)
```

If you want to run existing tests, then just use:
```bash
./run-tests-with-docker
```

# Fire up a local docker dev environment
```bash
./dev-with-docker
```

The requirements include the `tensorflow-gpu` library, though if you do not have access to a GPU, you should replace this requirement with `tensorflow`.

# Optimizing, Training, and Testing

While you could just let the agent train and run with the default PPO2 hyper-parameters, your agent would likely not be very profitable. The `stable-baselines` library provides a great set of default parameters that work for most problem domains, but we need to do better.

To do this, you will need to run `optimize.py`.
To do this, you will need to run `cli.py`.

```bash
python ./optimize.py
python ./cli.py opt-train-test
```

This can take a while (hours to days depending on your hardware setup), but over time it will print to the console as trials are completed. Once a trial is completed, it will be stored in `./data/params.db`, an SQLite database, from which we can pull hyper-parameters to train our agent.
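Pulling the best hyper-parameters back out of a SQLite store can be illustrated with the stdlib `sqlite3` module. The schema below is a toy stand-in (Optuna's real storage tables look different); it only shows the idea of selecting the best completed trial — since the objective returns `-1 * reward`, the trial with the lowest value wins:

```python
import json
import sqlite3

# Toy schema for illustration only -- Optuna's actual storage tables differ.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE trials (id INTEGER PRIMARY KEY, value REAL, params TEXT)')
conn.executemany(
    'INSERT INTO trials (value, params) VALUES (?, ?)',
    [
        (1.2, json.dumps({'learning_rate': 0.01})),
        (0.4, json.dumps({'learning_rate': 0.0005})),  # best: lowest objective value
        (0.9, json.dumps({'learning_rate': 0.001})),
    ],
)
# The study minimizes -1 * reward, so the best trial has the smallest value.
row = conn.execute('SELECT params FROM trials ORDER BY value ASC LIMIT 1').fetchone()
best_params = json.loads(row[0])
print(best_params)  # {'learning_rate': 0.0005}
```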

From there, agents will be trained using the best set of hyper-parameters, and later tested on completely new data to verify the generalization of the algorithm.

# Common troubleshooting

##### The specified module could not be found.
Normally this is caused by a missing MPI module. You should install it according to your platform.
- Windows: https://docs.microsoft.com/en-us/message-passing-interface/microsoft-mpi
- Linux/macOS: https://www.mpich.org/downloads/

# Project Roadmap

If you would like to contribute, here is the roadmap for the future of this project. To assign yourself to an item, please create an Issue/PR titled with the item from below and I will add your name to the list.
@@ -59,14 +154,10 @@ If you would like to contribute, here is the roadmap for the future of this proj
- ~Optionally replace SQLite db with Postgres to enable multi-processed Optuna training~
- This is enabled through Docker, though support for Postgres still needs to be improved
- ~Replace `DummyVecEnv` with `SubProcVecEnv` everywhere throughout the code~ **[@archenroot, @arunavo4, @notadamking]**
- Find sources of CPU bottlenecks to improve GPU utilization
- Replace pandas or improve speed of pandas methods by taking advantage of GPU
- Find source of possible memory leak (in `RLTrader.optimize`) and squash it

- Allow features to be added/removed at runtime
- Create simple API for turning off default features (e.g. prediction, indicators, etc.)
- Create simple API for adding new features to observation space
## Stage 1:
- Allow features to be added/removed at runtime
- Create simple API for turning off default features (e.g. prediction, indicators, etc.)
- Create simple API for adding new features to observation space
- Add more optional features to the feature space
- Other exchange pair data (e.g. LTC/USD, ETH/USD, EOS/BTC, etc.)
- Twitter sentiment analysis
@@ -90,6 +181,9 @@ If you would like to contribute, here is the roadmap for the future of this proj
- Experiment with Auto-decoders to remove noise from the observation space
- Implement self-play in a multi-process environment to improve model exploration
- Experiment with dueling actors vs tournament of dueling agents
- Find sources of CPU bottlenecks to improve GPU utilization
- Replace pandas or improve speed of pandas methods by taking advantage of GPU
- Find source of possible memory leak (in `RLTrader.optimize`) and squash it

# Contributing

2 changes: 1 addition & 1 deletion Vagrantfile
@@ -52,7 +52,7 @@ Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
vm_config.vm.synced_folder '.', '/vagrant', disabled: false
vm_config.vm.provision "default setup", type: "shell", inline: <<SCRIPT
apt update
apt install mpich
apt install mpich libpq-dev
DEBIAN_FRONTEND=noninteractive apt install python3-pip
pip3 install -r /vagrant/requirements.no-gpu.txt
SCRIPT
21 changes: 17 additions & 4 deletions cli.py
@@ -1,23 +1,36 @@
import numpy as np

from deco import concurrent
from lib.RLTrader import RLTrader
from lib.cli.RLTraderCLI import RLTraderCLI
from lib.util.logger import init_logger
from update_data import download_async

np.warnings.filterwarnings('ignore')
trader_cli = RLTraderCLI()
args = trader_cli.get_args()


@concurrent(processes=args.proc_number)
def run_concurrent_optimize(trader: RLTrader, args):
trader.optimize(args.trials, args.trials, args.parallel_jobs)


if __name__ == '__main__':
logger = init_logger(__name__, show_debug=args.debug)
trader = RLTrader(**vars(args), logger=logger)

if args.command == 'optimize':
trader.optimize(n_trials=args.trials, n_parallel_jobs=args.parallel_jobs)
run_concurrent_optimize(trader, args)
elif args.command == 'train':
trader.train(n_epochs=args.epochs)
elif args.command == 'test':
trader.test(model_epoch=args.model_epoch, should_render=args.no_render)
elif args.command == 'opt-train-test':
trader.optimize(args.trials, args.parallel_jobs)
trader.train(n_epochs=args.train_epochs, test_trained_model=args.no_test, render_trained_model=args.no_render)
run_concurrent_optimize(trader, args)
trader.train(
n_epochs=args.train_epochs,
test_trained_model=args.no_test,
render_trained_model=args.no_render
)
elif args.command == 'update-static':
download_async()
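In the `cli.py` diff above, `deco`'s `@concurrent` decorator pushes `run_concurrent_optimize` into worker processes. The same pattern can be sketched with the stdlib `multiprocessing.Pool` (a simplified stand-in with a stub optimizer, not `deco` or the real `RLTrader.optimize`):

```python
from multiprocessing import Pool

def optimize_stub(n_trials):
    # Stand-in for RLTrader.optimize: pretend each trial yields its index.
    return list(range(n_trials))

def run_optimize_in_workers(proc_number, n_trials):
    """Run one optimize call per worker process, like deco's @concurrent."""
    with Pool(processes=proc_number) as pool:
        handles = [pool.apply_async(optimize_stub, (n_trials,))
                   for _ in range(proc_number)]
        # Collect each worker's result (blocks until all finish).
        return [h.get() for h in handles]

if __name__ == '__main__':
    print(run_optimize_in_workers(2, 3))  # [[0, 1, 2], [0, 1, 2]]
```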
2 changes: 2 additions & 0 deletions config/config.ini.dist
@@ -0,0 +1,2 @@
[Defaults]
mini-batches=11
68 changes: 68 additions & 0 deletions dev-with-docker
@@ -0,0 +1,68 @@
#!/usr/bin/env bash

set -e

SCRIPT_DIR=$(dirname "${BASH_SOURCE[0]}")
CWD=$(realpath "${SCRIPT_DIR}")

if [[ -z $1 ]]; then
echo "Should have 1 argument: cpu or gpu"
exit
fi

TYPE=$1
shift;

if [[ -n $2 ]]; then
docker build \
--tag 'trader-rl-postgres' \
--build-arg ID=$(id -u) \
--build-arg GI=$(id -g) \
-f "$CWD/docker/Dockerfile.backend" "$CWD"

mkdir -p "$CWD/data/postgres"
docker run \
--detach \
--publish 5432:5432 \
--tty \
--user "$(id -u):$(id -g)" \
--volume "$CWD/data/postgres":"/var/lib/postgresql/data/trader-data" \
trader-rl-postgres-dev
shift
fi

if [[ $TYPE == 'gpu' ]]; then
GPU=1
else
GPU=0
fi

MEM=$(cat /proc/meminfo | grep 'MemTotal:' | awk '{ print $2 }')
CPUS=$(cat /proc/cpuinfo | grep -P 'processor.+[0-7]+' | wc -l)

MEM_LIMIT=$((MEM/4*3))
CPU_LIMIT=$((CPUS/4*3))

if [ $CPU_LIMIT == 0 ];then
CPU_LIMIT=1
fi

if [ $GPU == 0 ]; then
N="trader-rl-cpu-dev"
docker build --tag $N -f "$CWD/docker/Dockerfile.cpu" "$CWD"
else
N="trader-rl-gpu-dev"
docker build --tag $N -f "$CWD/docker/Dockerfile.gpu" "$CWD"
fi

docker rm -fv rl_trader_dev || true
docker run \
--name 'rl_trader_dev' \
--user $(id -u):$(id -g) \
--entrypoint 'bash' \
--interactive \
--memory "${MEM_LIMIT}b" \
--cpus "${CPU_LIMIT}" \
--tty \
--volume "${CWD}":/code \
"$N"
7 changes: 4 additions & 3 deletions docker/Dockerfile.backend
@@ -3,8 +3,9 @@ FROM postgres:11-alpine
ARG ID=1000
ARG GI=1000

ENV POSTGRES_PASSWORD=rl-trader
ENV POSTGRES_DB='rl-trader'
ENV POSTGRES_USER=rl_trader
ENV POSTGRES_PASSWORD=rl_trader
ENV POSTGRES_DB='rl_trader'
ENV PGDATA=/var/lib/postgresql/data/trader-data

RUN adduser -D -u $ID btct
RUN adduser -D -u $ID rl_trader
3 changes: 2 additions & 1 deletion docker/Dockerfile.cpu
@@ -1,11 +1,12 @@
FROM python:3.6.8-jessie

ADD ./requirements.base.txt /code/
ADD ./requirements.no-gpu.txt /code/requirements.txt

WORKDIR /code

RUN apt-get update \
&& apt-get install -y build-essential mpich
&& apt-get install -y build-essential mpich libpq-dev

# should merge to top RUN to avoid extra layers - for debug only :/
RUN pip install -r requirements.txt
3 changes: 2 additions & 1 deletion docker/Dockerfile.gpu
@@ -1,9 +1,10 @@
FROM python:3.6.8-jessie

ADD ./requirements.base.txt /code/
ADD ./requirements.txt /code/

WORKDIR /code

RUN apt-get update \
&& apt-get install -y build-essential mpich \
&& apt-get install -y build-essential mpich libpq-dev \
&& pip install -r requirements.txt
3 changes: 2 additions & 1 deletion docker/Dockerfile.tests
@@ -1,10 +1,11 @@
FROM python:3.6.8-jessie

ADD ./requirements.base.txt /code/
ADD ./requirements.no-gpu.txt /code/
ADD ./requirements.tests.txt /code/requirements.txt

WORKDIR /code

RUN apt-get update \
&& apt-get install -y build-essential mpich \
&& apt-get install -y build-essential mpich libpq-dev \
&& pip install --progress-bar off --requirement requirements.txt
3 changes: 3 additions & 0 deletions lib/RLTrader.py
@@ -4,6 +4,8 @@

from os import path
from typing import Dict

from deco import concurrent
from stable_baselines.common.base_class import BaseRLModel
from stable_baselines.common.policies import BasePolicy, MlpPolicy
from stable_baselines.common.vec_env import DummyVecEnv, SubprocVecEnv
@@ -160,6 +162,7 @@ def optimize_params(self, trial, n_prune_evals_per_trial: int = 2, n_tests_per_e

return -1 * last_reward

@concurrent
def optimize(self, n_trials: int = 100, n_parallel_jobs: int = 1, *optimize_params):
try:
self.optuna_study.optimize(
