Merge branch 'Superalgos:develop' into develop
quantum8 authored Sep 16, 2022
2 parents cd0f3b0 + 8ca2f54 commit 9c167da
Showing 12 changed files with 1,536 additions and 575 deletions.
801 changes: 801 additions & 0 deletions Bitcoin-Factory/Forecast-Client/notebooks/Bitcoin_Factory_RL.py

Large diffs are not rendered by default.

126 changes: 126 additions & 0 deletions Bitcoin-Factory/ReadMeReinforcementLearning.md
@@ -0,0 +1,126 @@
# Reinforcement Learning
## 💫 1. Introduction
[Reinforcement Learning](https://en.wikipedia.org/wiki/Reinforcement_learning) (RL) is a machine learning approach in which an agent learns from the consequences of its own actions. The typical framing of an RL scenario: an agent takes actions in an environment; each action is interpreted into a reward and a representation of the state, and both are fed back to the agent.

![Learning framework](https://upload.wikimedia.org/wikipedia/commons/1/1b/Reinforcement_learning_diagram.svg "RL Framework")

In our use case, the environment is a stock trading environment and the possible actions are buy, sell, or hold. The reward is our gain or loss. Based on this reward, the agent learns how to trade better. Learning is done with [Proximal Policy Optimization (PPO)](https://en.wikipedia.org/wiki/Proximal_Policy_Optimization).
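
The loop described above can be sketched in a few lines of Python. This is only an illustration: `ToyTradingEnv`, `run_episode`, and the random policy are hypothetical names, not taken from `Bitcoin_Factory_RL.py`, and the reward here is assumed to be the change in net worth per step. A real PPO agent would replace the random policy.

```python
import random

BUY, SELL, HOLD = 0, 1, 2  # action ids as listed in this README


class ToyTradingEnv:
    """Hypothetical, heavily simplified environment (not the one used by the script)."""

    def __init__(self, prices, quote=1000.0):
        self.prices = prices
        self.initial_quote = quote
        self.reset()

    def reset(self):
        self.t = 0
        self.quote = self.initial_quote  # e.g. USDT
        self.base = 0.0                  # e.g. BTC
        return self.prices[self.t]

    def net_worth(self):
        return self.quote + self.base * self.prices[self.t]

    def step(self, action):
        before = self.net_worth()
        price = self.prices[self.t]
        if action == BUY and self.quote > 0:
            self.base, self.quote = self.quote / price, 0.0
        elif action == SELL and self.base > 0:
            self.quote, self.base = self.base * price, 0.0
        self.t += 1
        reward = self.net_worth() - before  # assumed reward: gain or loss of this step
        done = self.t == len(self.prices) - 1
        return self.prices[self.t], reward, done


def run_episode(env, policy):
    """Feed state to the policy, apply its action, and sum the rewards."""
    state = env.reset()
    total, done = 0.0, False
    while not done:
        action = policy(state)
        state, reward, done = env.step(action)
        total += reward
    return total


prices = [100.0, 110.0, 105.0, 120.0]
total = run_episode(ToyTradingEnv(prices), lambda state: random.choice([BUY, SELL, HOLD]))
print(total)
```

The toy version ignores trading fees and always trades the full balance; both simplifications are choices made for this sketch.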

At the end, the agent provides an action for the current candle. The possible actions at the moment are:
* 0 -> buy long
* 1 -> sell
* 2 -> hold

For buy and sell signals, an additional percentage is provided.
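
As a small illustration, a consumer of the forecast could translate the numeric action into a readable signal as follows. The function name `describe_signal` and the way the percentage is passed are assumptions made for this sketch; the README does not state how the percentage is delivered or what exactly it refers to.

```python
# Action ids as listed above.
ACTION_NAMES = {0: "buy long", 1: "sell", 2: "hold"}


def describe_signal(action, percentage=None):
    """Return a readable label; the percentage only applies to buy and sell."""
    name = ACTION_NAMES[action]
    if action in (0, 1) and percentage is not None:
        return f"{name} ({percentage:g}%)"
    return name


print(describe_signal(0, 25))
print(describe_signal(2))
```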

## 📒 2. Configuration
The basic configuration has to be done as described in the [Bitcoin Factory ReadMe](./README.md). The differences for RL are shown below.
### 2.1 Testserver config
To run a Testserver for RL, you need to change the configuration of the Testserver node in SA. First, define the Python script that should be used for the Docker sessions on the clients.
Second, define the ranges of the parameters to be tested, for example the learning rate.
```js
{
    ...
    "pythonScriptName": "Bitcoin_Factory_RL.py",
    ...
    "parametersRanges": {
        "LIST_OF_ASSETS": [
            [
                "BTC"
            ]
        ],
        "LIST_OF_TIMEFRAMES": [
            [
                "01-hs"
            ],
            [
                "02-hs"
            ]
        ],
        "NUMBER_OF_LAG_TIMESTEPS": [
            10
        ],
        "PERCENTAGE_OF_DATASET_FOR_TRAINING": [
            80
        ],
        "NUMBER_OF_EPOCHS": [
            750
        ],
        "NUMBER_OF_LSTM_NEURONS": [
            50
        ],
        "TIMESTEPS_TO_TRAIN": [
            1e7
        ],
        "OBSERVATION_WINDOW_SIZE": [
            24,
            48
        ],
        "INITIAL_QUOTE_ASSET": [
            1000
        ],
        "INITIAL_BASE_ASSET": [
            0
        ],
        "TRADING_FEE": [
            0.01
        ],
        "ENV_NAME": [
            "SimpleTrading"
        ],
        "ENV_VERSION": [
            1
        ],
        "REWARD_FUNCTION": [
            "unused"
        ],
        "EXPLORE_ON_EVAL": [
            "unused"
        ],
        "ALGORITHM": [
            "PPO"
        ],
        "ROLLOUT_FRAGMENT_LENGTH": [
            200
        ],
        "TRAIN_BATCH_SIZE": [
            2048
        ],
        "SGD_MINIBATCH_SIZE": [
            64
        ],
        "BATCH_MODE": [
            "complete_episodes"
        ],
        "FC_SIZE": [
            256
        ],
        "LEARNING_RATE": [
            0.00001
        ],
        "GAMMA": [
            0.95
        ]
    }
}
```
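
To get a feeling for how many test cases a set of ranges can produce, the following sketch expands a subset of `parametersRanges` into every possible combination. That the Testserver enumerates the full Cartesian product is an assumption made here for illustration, and `expand_ranges` is a hypothetical helper, not part of the Testserver.

```python
from itertools import product


def expand_ranges(ranges):
    """Return one dict per combination of the listed parameter values."""
    keys = list(ranges)
    return [dict(zip(keys, combo)) for combo in product(*(ranges[k] for k in keys))]


# Subset of the configuration above.
ranges = {
    "LIST_OF_TIMEFRAMES": [["01-hs"], ["02-hs"]],
    "OBSERVATION_WINDOW_SIZE": [24, 48],
    "LEARNING_RATE": [0.00001],
}

cases = expand_ranges(ranges)
print(len(cases))
for case in cases:
    print(case)
```

Under this assumption, the two timeframes and the two observation window sizes from the example configuration would yield four combinations, so adding values to several ranges at once multiplies the number of test cases.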
### 2.2 Testclient config
No special config is needed.
However, run only one client per machine (the Python script takes care of parallel execution on its own).

## 💡 3. Results
> __Note__
> The processing of one test case on the client takes roughly 2h-6h on a recent system.

The provided time series values are divided into 3 parts (train, test, validate). The first part (train) is used to train the network. The second part (test) is used by the PPO agent to evaluate the current network during the learning process. The third part (validate) is never seen by the agent; it is used to check whether the trained model is able to trade profitably on unseen data.
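
A chronological three-way split of this kind can be sketched as follows. The split proportions are assumptions: `train_pct` mirrors `PERCENTAGE_OF_DATASET_FOR_TRAINING` from the example configuration, and the remainder is simply halved between test and validate, which may differ from what `Bitcoin_Factory_RL.py` does. `split_series` is a hypothetical helper.

```python
def split_series(values, train_pct=80):
    """Split a time-ordered sequence into train, test, and validate parts.

    No shuffling: the validate part is always the most recent data.
    """
    n = len(values)
    train_end = n * train_pct // 100
    test_end = train_end + (n - train_end) // 2  # assumed: remainder split evenly
    return values[:train_end], values[train_end:test_end], values[test_end:]


candles = list(range(100))  # stand-in for 100 candles in time order
train, test, validate = split_series(candles)
print(len(train), len(test), len(validate))
```

Keeping the parts in time order means the validate part lies entirely after the data used for training and evaluation, matching the idea of checking the model on unseen data.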

The Python script produces 3 charts to visualize the results. The following 3 examples are preliminary; they were produced by a poorly trained agent.
![Example Train Results](docs/BTC_train.png "BTC train")
![Example Test Results](docs/BTC_test.png "BTC test")
![Example Validate Results](docs/BTC_validate.png "BTC validate")

## 🤝 4. Support

Contributions, issues, and feature requests are welcome!

Give a ⭐️ if you like this project or even better become a part of the Superalgos community!