Base research repository for the Hazan Lab @ Princeton for experimenting with the Spectral State Space Model on linear dynamical systems.
This is a research repository, not a polished library. Expect to see magic numbers, hard-coded paths, etc.
Note: Please use uv. You'll have more energy. Your skin will be clearer. Your eyesight will improve.
Create a virtual environment with one of the following options:
Conda:
conda create -n stu pytorch pytorch-cuda=12.4 uv -c pytorch -c nvidia -c conda-forge -y
uv:
uv venv --prompt stu .venv
Python/pip:
python3 -m venv --prompt stu .venv
Don't forget to activate your environment once you've created it.
Note: If you want to use Flash FFT and/or Flash Attention, you will need to have a CUDA-enabled device. Please see their repositories for further instructions on installation.
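Before installing the optional CUDA-only packages, it can help to confirm that PyTorch actually sees a GPU. The following is a minimal sketch and not part of this repository: the helper name `cuda_ready` is hypothetical, and it only reports whether `torch` is importable and `torch.cuda.is_available()` returns true. It does not tell you whether Flash FFT or Flash Attention will build for your particular GPU.

```python
import importlib.util


def cuda_ready() -> bool:
    """Return True if PyTorch is installed and reports a usable CUDA device."""
    # find_spec avoids an ImportError when torch has not been installed yet.
    if importlib.util.find_spec("torch") is None:
        return False
    import torch  # imported lazily so the check also runs without torch

    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    print("CUDA available:", cuda_ready())
```

If this prints `False`, the base models may still be usable, but the Flash FFT and Flash Attention paths should be assumed unavailable.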
In the root folder of the project, where pyproject.toml resides, you can install the required packages with:
uv:
uv pip install -e .
Python/pip:
pip install -e .
To install FlashFFTConv, you can run the following commands:
module load gcc-toolset/13 # This is Della-specific; make sure you have a valid C/C++ compiler
pip install git+https://github.com/HazyResearch/flash-fft-conv.git#subdirectory=csrc/flashfftconv
pip install git+https://github.com/HazyResearch/flash-fft-conv.git
First, make sure you cd into the spectral_ssm folder.
To train the STU model, run
python train_stu.py
To train the Transformer model, run
python train_transformer.py
You can adjust the training configurations for the models in their respective config.json files.
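If you would rather change a setting from a script than edit the file by hand, something like the sketch below may work. It is an illustration under stated assumptions, not code from this repository: `update_config` is a hypothetical helper, it assumes each config.json is a single JSON object, and it only overrides top-level keys that the file already defines. Check the actual key names in your config.json before using it.

```python
import json
from pathlib import Path


def update_config(path, **overrides):
    """Load a JSON config, override existing top-level keys, and save it back."""
    config_path = Path(path)
    config = json.loads(config_path.read_text())

    # Reject keys the file does not already define, to catch typos early.
    unknown = sorted(set(overrides) - set(config))
    if unknown:
        raise KeyError(f"Unknown config keys: {unknown}")

    config.update(overrides)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, `update_config("config.json", lr=3e-4)` would succeed only if the file already has a top-level `lr` entry; that key name is illustrative and may not match the real configs.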
Some of the utility functions are adapted from Daniel Suo's JAX implementation of STU.
Special thanks to (in no particular order):
- Naman Agarwal, Elad Hazan, and the authors of the Spectral State Space Models paper
- Yagiz Devre, Evan Dogariu, Isabel Liu, Windsor Nguyen
We welcome contributors to:
- Submit pull requests
- Report issues
- Help improve the project overall
This free open-source software is MIT-licensed. See the LICENSE file for more details.
If you use this repository or find our work valuable, please consider citing it:
@misc{spectralssm,
title={Spectral State Space Models},
author={Naman Agarwal and Daniel Suo and Xinyi Chen and Elad Hazan},
year={2024},
eprint={2312.06837},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2312.06837},
}
@misc{flashstu,
title={Flash STU: Fast Spectral Transform Units},
author={Y. Isabel Liu and Windsor Nguyen and Yagiz Devre and Evan Dogariu and Anirudha Majumdar and Elad Hazan},
year={2024},
eprint={2409.10489},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2409.10489},
}