hazan-lab/stu

Spectral SSM


Base research repository for the Hazan Lab @ Princeton for experimenting with the Spectral State Space Model on linear dynamical systems.

This is a research repository, not a polished library. Expect to see magic numbers, hard-coded paths, etc.

Setup

Note: Please use uv. You'll have more energy. Your skin will be clearer. Your eyesight will improve.

1. Virtual environment (optional):

Create a virtual environment with one of the following options:

Conda:

conda create -n stu pytorch pytorch-cuda=12.4 uv -c pytorch -c nvidia -c conda-forge -y

uv:

uv venv --prompt stu .venv

Python/pip:

python3 -m venv --prompt stu .venv

Don't forget to activate your environment once you've created it.
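A quick way to confirm that activation worked is to ask the interpreter itself. The sketch below is not part of the repository; the function names are made up for illustration. It relies on two standard behaviors: inside a venv (as created by `uv venv` or `python3 -m venv`) `sys.prefix` differs from `sys.base_prefix`, and an activated Conda environment sets the `CONDA_DEFAULT_ENV` variable.

```python
import os
import sys
from typing import Optional


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv-style environment."""
    # In a venv, sys.prefix points at the environment, while sys.base_prefix
    # points at the interpreter the environment was created from.
    return sys.prefix != sys.base_prefix


def active_conda_env() -> Optional[str]:
    """Return the name of the activated Conda environment, or None if there is none."""
    return os.environ.get("CONDA_DEFAULT_ENV")


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a venv:", in_virtualenv())
    print("active conda env:", active_conda_env())
```

If both checks come back empty, the packages in the next step would most likely be installed into the system interpreter rather than the `stu` environment.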

2. Installing packages:

Note: Flash FFT (FlashFFTConv) and Flash Attention both require a CUDA-enabled device. Please see their repositories for further installation instructions.

In the root folder of the project, where pyproject.toml resides, you can install the required packages with:

uv:

uv pip install -e .

Python/pip:

pip install -e .

To install FlashFFTConv, you can run the following commands:

module load gcc-toolset/13  # This is Della-specific; make sure you have a valid C/C++ compiler
pip install git+https://github.com/HazyResearch/flash-fft-conv.git#subdirectory=csrc/flashfftconv
pip install git+https://github.com/HazyResearch/flash-fft-conv.git
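Since the Flash kernels need a CUDA device, it can save time to check the environment before or after running the commands above. The following is a minimal sketch, not a script shipped with this repository: the helper names are hypothetical, and `flashfftconv` is assumed to be the import name of the installed package.

```python
import importlib.util


def cuda_available() -> bool:
    """Return True if PyTorch is importable and reports a usable CUDA device."""
    try:
        import torch
    except ImportError:
        return False  # PyTorch itself is missing from this environment
    return torch.cuda.is_available()


def module_installed(name: str) -> bool:
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("CUDA available:", cuda_available())
    # "flashfftconv" is the assumed import name of the FlashFFTConv package.
    print("flashfftconv installed:", module_installed("flashfftconv"))
```

`find_spec` only locates the module without importing it, so the second check does not require a working GPU.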

Training

First, make sure you cd into the spectral_ssm folder.

To train the STU model, run:

python train_stu.py

To train the Transformer model, run:

python train_transformer.py

You can adjust the training configurations for the models in their respective config.json files.
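Editing the JSON by hand works fine; for scripted sweeps, a small helper along these lines may be convenient. This is a sketch under assumptions: `override_config` is a hypothetical helper that does not exist in the repository, and the keys in the demo (`lr`, `num_epochs`) are placeholders rather than the actual fields of the config files.

```python
import json
import tempfile
from pathlib import Path


def override_config(path, **overrides):
    """Load a JSON config, replace values of existing keys, and write it back."""
    config_path = Path(path)
    config = json.loads(config_path.read_text())

    # Reject keys the file does not already define, so a typo fails loudly
    # instead of silently adding an unused field.
    unknown = sorted(set(overrides) - set(config))
    if unknown:
        raise KeyError(f"keys not present in {config_path.name}: {unknown}")

    config.update(overrides)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


if __name__ == "__main__":
    # Demo on a throwaway file; the keys here are placeholders.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "config.json"
        demo.write_text(json.dumps({"lr": 0.001, "num_epochs": 1}))
        print(override_config(demo, num_epochs=3))
```

Restricting overrides to keys that already exist is a deliberate choice here: a misspelled key would otherwise be written to the file and ignored by the training script.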

Acknowledgments

Some of the utility functions are adapted from Daniel Suo's JAX implementation of STU.

Special thanks to (in no particular order):

  • Naman Agarwal, Elad Hazan, and the authors of the Spectral State Space Models paper
  • Yagiz Devre, Evan Dogariu, Isabel Liu, Windsor Nguyen

Contributions

We welcome contributors to:

  • Submit pull requests
  • Report issues
  • Help improve the project overall

License

This free open-source software is MIT-licensed. See the LICENSE file for more details.

Citation

If you use this repository or find our work valuable, please consider citing it:

@misc{spectralssm,
      title={Spectral State Space Models}, 
      author={Naman Agarwal and Daniel Suo and Xinyi Chen and Elad Hazan},
      year={2024},
      eprint={2312.06837},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2312.06837}, 
}
@misc{flashstu,
      title={Flash STU: Fast Spectral Transform Units}, 
      author={Y. Isabel Liu and Windsor Nguyen and Yagiz Devre and Evan Dogariu and Anirudha Majumdar and Elad Hazan},
      year={2024},
      eprint={2409.10489},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2409.10489}, 
}

About

The Spectral State Space Model in PyTorch.
