Commit

Add citing PyTorch Distributed Shampoo section in README.md
Summary: Add a section for how to cite PyTorch Distributed Shampoo.

Reviewed By: hjmshi

Differential Revision: D65094885

fbshipit-source-id: cf0dbb27b879e184751b9d1cf3a2cc56ca855bd8
tsunghsienlee authored and facebook-github-bot committed Nov 8, 2024
1 parent 15281a6 commit ddde667
Showing 1 changed file with 17 additions and 2 deletions.
19 changes: 17 additions & 2 deletions distributed_shampoo/README.md
@@ -10,7 +10,6 @@ Developers:
- Hao-Jun Michael Shi (Meta Platforms, Inc.)
- Tsung-Hsien Lee
- Anna Cai (Meta Platforms, Inc.)
- Runa Eschenhagen (University of Cambridge)
- Shintaro Iwasaki (Meta Platforms, Inc.)
- Ke Sang (Meta Platforms, Inc.)
- Wang Zhou (Meta Platforms, Inc.)
@@ -44,7 +43,7 @@ Key distinctives of this implementation include:

We have tested this implementation on the following versions of PyTorch:

- PyTorch >= 2.2;
- PyTorch >= 2.0;
- Python >= 3.10;
- CUDA 11.3-11.4; 12.2+;

@@ -476,6 +475,22 @@ When encountering those errors, following are things you could try:
3. Increase `start_preconditioning_step`.
4. Consider applying gradient clipping.
## Citing PyTorch Distributed Shampoo
If you use PyTorch Distributed Shampoo in your work, please use the following BibTeX entry.
```BibTeX
@misc{shi2023pytorchshampoo,
title={A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale},
author={Hao-Jun Michael Shi and Tsung-Hsien Lee and Shintaro Iwasaki and Jose Gallego-Posada and Zhijing Li and Kaushik Rangadurai and Dheevatsa Mudigere and Michael Rabbat},
howpublished={\url{https://github.com/facebookresearch/optimizers/tree/main/distributed_shampoo}},
year ={2023},
eprint={2309.06497},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```

## References

1. [Shampoo: Preconditioned Stochastic Tensor Optimization](https://proceedings.mlr.press/v80/gupta18a/gupta18a.pdf). Vineet Gupta, Tomer Koren, and Yoram Singer. International Conference on Machine Learning, 2018.