This repository provides the implementation for the paper *Generative AI for End-to-End Limit Order Book Modelling*. The preprint is available here.
The repository is a fork of the original S5 repository.
Developing a generative model of realistic order flow in financial markets is a challenging open problem, with numerous applications for market participants. Addressing this, we propose the first end-to-end autoregressive generative model that generates tokenized limit order book (LOB) messages. These messages are interpreted by a JAX-LOB simulator, which updates the LOB state. To handle long sequences efficiently, the model employs *simplified structured state-space layers* to process sequences of order book states and tokenized messages. Using LOBSTER data of NASDAQ equity LOBs, we develop a custom tokenizer for message data, converting groups of successive digits to tokens, similar to tokenization in large language models.
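As a rough illustration of the digit-group tokenization described above, the sketch below splits a zero-padded integer field into fixed-size digit groups, each of which becomes one token. This is a minimal sketch under assumed parameters (group size, padding width); the actual scheme is implemented in `lob/encoding.py` and may differ, e.g. in sign handling and special tokens.

```python
# Minimal sketch of digit-group tokenization (hypothetical parameters;
# the real encoder in lob/encoding.py may differ in group size, padding,
# sign handling, and special tokens).
GROUP = 3      # digits per token (assumption)
N_DIGITS = 9   # zero-padded field width (assumption)

def encode_field(value: int, n_digits: int = N_DIGITS, group: int = GROUP) -> list[int]:
    """Zero-pad an integer field and map each digit group to a token id in a 10**group vocabulary."""
    s = str(value).zfill(n_digits)
    return [int(s[i:i + group]) for i in range(0, n_digits, group)]

def decode_field(tokens: list[int], group: int = GROUP) -> int:
    """Invert encode_field by concatenating the zero-padded digit groups."""
    return int("".join(str(t).zfill(group) for t in tokens))

# e.g. a price of 1234567 (in ticks) becomes three tokens: [1, 234, 567]
assert encode_field(1234567) == [1, 234, 567]
assert decode_field(encode_field(1234567)) == 1234567
```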
To install the required packages, run `pip install -r requirements.txt`.
The GPU installation of JAX can cause problems; further instructions are available here.
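For example, on a machine with a recent CUDA toolkit, a GPU-enabled install typically looks like the following (an illustrative command, not specific to this repo; the exact extra, e.g. `cuda12`, depends on your CUDA driver and JAX version, so check the JAX installation guide):

pip install -U "jax[cuda12]"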
The data used is NASDAQ LOB data from LOBSTER. After downloading and unpacking the data files, pre-process them for model training. An example command for GOOG is as follows:
python lob/preproc.py --data_dir /path/to/LOBS5/data/GOOG/ --save_dir /path/to/LOBS5/data/GOOG/ --n_tick_range 500 --use_raw_book_repr
Directories and files that ship with the GitHub repo:
- `lob/` Source code for LOB models, datasets, etc.
  - `ar_pred.ipynb` Test-set evaluations and plotting for trained models
  - `dataloading.py` Data-loading functions
  - `encoding.py` Message tokenization: encoding and decoding
  - `evaluation.py` Model evaluation logic
  - `inference.py` Logic for the model inference loop
  - `init_train.py` Train-state initialisation
  - `lob_seq_model.py` Defines the LOB deep sequence model, consisting of stacks of S5 layers
  - `lobster_dataloader.py` Defines the dataset and dataloading
  - `preproc.py` Pre-processes LOBSTER data for the model
  - `run_eval.py` Script to run inference with a trained model
  - `sweep.py` Hyperparameter sweep (Weights & Biases)
  - `train_helpers.py` Functions for optimisation, training, and evaluation steps
  - `train.py` Training loop code
  - `validation_helpers.py` Helper functions for model validation
- `s5/` Original S5 code
- `bin/` Shell scripts for downloading data and running experiments
- `requirements.txt` Package requirements
- `run_train.py` Training-loop entrypoint
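Training and inference are launched via `run_train.py` and `lob/run_eval.py` respectively. The available command-line options are defined in each script's argument parser; assuming argparse is used, they can be listed with:

python run_train.py --help
python lob/run_eval.py --help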
Paper results and plots are computed in `lob/ar_pred.ipynb`.
Please use the following when citing our work:
@article{nagy2023generative,
  doi = {},
  url = {https://arxiv.org/abs/xxxx.xxxxx},
  author = {Nagy, Peer and Frey, Sascha and Sapora, Silvia and Li, Kang and Calinescu, Anisoara and Zohren, Stefan and Foerster, Jakob},
  keywords = {generative AI, structured state space models, limit order books, ML},
  title = {Generative AI for End-to-End Limit Order Book Modelling: A Token-Level Autoregressive Generative Model of Message Flow Using a Deep State Space Network},
  publisher = {arXiv},
  year = {2023},
  copyright = {Creative Commons Attribution 4.0 International}
}
Please reach out if you have any questions.