About: Using Exp3 and its variations to select coins to invest in.
The goal is to use Exp3 and its variations to select the coins that gain the most reward and incur the least regret over a certain timeframe. The basic algorithm is based on the Exp3 algorithm from Bandit Algorithms, chapter III, and the expansions are based on the paper Evaluation and Analysis of the Performance of the EXP3 Algorithm in Stochastic Environments.
The database link: https://www.kaggle.com/sudalairajkumar/cryptocurrencypricehistory
Exp3 is a simple algorithm for adversarial bandits. An adversarial bandit is a "game" in which the player has to choose one action in each round, and the reward of that action is sampled from a certain distribution. Our particular implementation is in the context of historical cryptocurrency prices. The goal is to choose the most lucrative coin over time.
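To make the idea concrete, here is a minimal sketch of Exp3 with a constant learning rate. It is not the notebook's code: the function name `exp3`, the toy reward table, and the three hypothetical coins are assumptions for illustration, and rewards are assumed to be already scaled to [0, 1]. The update follows the loss-based estimator commonly used to present Exp3.

```python
import math
import random


def exp3(rewards, eta, rng=None):
    """Run Exp3 on a table rewards[t][i] in [0, 1].

    Returns the list of chosen arms and the total reward collected.
    """
    rng = rng or random.Random(0)
    k = len(rewards[0])
    s_hat = [0.0] * k  # importance-weighted cumulative reward estimates
    choices, total = [], 0.0
    for row in rewards:
        # Exponential weights; subtracting the max keeps exp() stable.
        m = max(s_hat)
        weights = [math.exp(eta * (s - m)) for s in s_hat]
        z = sum(weights)
        probs = [w / z for w in weights]
        arm = rng.choices(range(k), weights=probs)[0]
        x = row[arm]  # only the chosen arm's reward is observed
        # Every arm's estimate grows by 1; the played arm is corrected
        # by its importance-weighted loss (1 - x) / p.
        for i in range(k):
            s_hat[i] += 1.0
        s_hat[arm] -= (1.0 - x) / probs[arm]
        choices.append(arm)
        total += x
    return choices, total


# Toy data (assumption): 3 hypothetical coins with 0/1 rewards.
data_rng = random.Random(1)
n, k = 2000, 3
rewards = [[float(data_rng.random() < p) for p in (0.3, 0.4, 0.7)]
           for _ in range(n)]
eta = math.sqrt(math.log(k) / (n * k))  # constant rate tuned to horizon n
choices, total = exp3(rewards, eta)
print(total, [choices.count(i) for i in range(k)])
```

With real price data, each row would instead hold one reward per coin for one time step, for example a normalized daily return.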
The first part contains functions specific to the dataset. The second part contains the implementations of the algorithms:
- Basis - The implementation with a constant learning rate
- Expansion 1 - Implementation with a decreasing learning rate
- Expansion 2 - Uses a confidence parameter δ to build a lower bound on the regret in each round, and eliminates every action whose regret may exceed that bound
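The difference between the basis and Expansion 1 comes down to how the learning rate is chosen. The sketch below shows one common pair of choices; the exact constants used in the notebook may differ, and the function names `constant_eta` and `decreasing_eta` are hypothetical.

```python
import math


def constant_eta(n, k):
    """Fixed rate tuned for a known horizon n (basic variant)."""
    return math.sqrt(math.log(k) / (n * k))


def decreasing_eta(t, k):
    """Rate for round t >= 1 that shrinks like 1/sqrt(t); needs no horizon."""
    return math.sqrt(math.log(k) / (t * k))


k, n = 5, 1000
schedule = [decreasing_eta(t, k) for t in range(1, n + 1)]
print(constant_eta(n, k), schedule[0], schedule[-1])
```

Under these assumed formulas the decreasing schedule starts out more aggressive and meets the constant rate exactly at round n, so it does not require knowing the horizon in advance.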
Afterwards we compared the regret to the bound 2√(2nK log(K)), which is proven in the paper. The second part was a comparison between the regrets of the different algorithms.
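A minimal sketch of this comparison, assuming regret is measured the usual way for adversarial bandits: the total reward of the best single coin in hindsight minus the total reward the algorithm actually collected. The function names and the tiny reward table are hypothetical; the bound is the one quoted above.

```python
import math


def empirical_regret(rewards, choices):
    """Best fixed arm's total reward in hindsight minus the reward collected."""
    k = len(rewards[0])
    best_fixed = max(sum(row[i] for row in rewards) for i in range(k))
    collected = sum(row[a] for row, a in zip(rewards, choices))
    return best_fixed - collected


def regret_bound(n, k):
    """The bound quoted in the text: 2 * sqrt(2 n K log K)."""
    return 2.0 * math.sqrt(2.0 * n * k * math.log(k))


# Tiny hand-made example (assumption): 4 rounds, 2 coins.
rewards = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
choices = [0, 0, 1, 0]
regret = empirical_regret(rewards, choices)
print(regret, regret_bound(len(rewards), 2))
```

The same two functions can be applied to the output of each algorithm variant to compare their regrets against one another.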
I will use Google Colab as an example, but a similar process can be performed in other notebook editors.