Commit
Add option to correct eigenvalues of Shampoo's preconditioner
Summary:
This update is based on #27, developed by Runa Eschenhagen (runame) and Tsung-Hsien Lee (tsunghsienlee). The research idea in this update originated from Runa Eschenhagen's internship at Fundamental AI Research (FAIR) at Meta during the summer of 2024. Concurrently, Runa Eschenhagen, Michael Shi (hjmshi), and Aaron Defazio (adefazio) worked on this method, which was also empirically evaluated on language models by Nikhil Vyas et al. [3], showing promising results.

This update enables approximately correcting the eigenvalues and running Adam in the eigenbasis of Shampoo's preconditioner. A variation of this method was first proposed for K-FAC by George et al. [1], and Anil et al. [2] noted its applicability to Shampoo in Appendix B, although they did not present empirical results or further discussion.

References:
1. [Fast Approximate Natural Gradient Descent in a Kronecker-factored Eigenbasis](https://arxiv.org/abs/1806.03884). Thomas George, César Laurent, Xavier Bouthillier, Nicolas Ballas, Pascal Vincent. NeurIPS, 2018.
2. [Scalable Second-Order Optimization for Deep Learning](https://arxiv.org/pdf/2002.09018.pdf). Rohan Anil, Vineet Gupta, Tomer Koren, Kevin Regan, and Yoram Singer. Tech Report, 2021.
3. [SOAP: Improving and Stabilizing Shampoo using Adam](https://arxiv.org/abs/2409.11321). Nikhil Vyas, Depen Morwani, Rosie Zhao, Itai Shapira, David Brandfonbrener, Lucas Janson, Sham Kakade. Tech Report, 2024.

Reviewed By: hjmshi

Differential Revision: D65402620

fbshipit-source-id: 8ea4f761cfae04c5622a968cb499654816e4aa3e
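The idea of running Adam in the eigenbasis of Shampoo's preconditioner can be sketched roughly as follows for a single 2-D parameter. This is a minimal NumPy illustration under stated assumptions, not the code added by this commit: the names `init_state` and `eigenvalue_corrected_step`, the use of exponential moving averages for the factor matrices, the hyperparameter values, and recomputing the eigenbasis on every step are all hypothetical choices made for brevity.

```python
import numpy as np


def init_state(m, n):
    """Hypothetical per-parameter state for an m x n weight matrix."""
    return {
        "L": np.zeros((m, m)),  # left Kronecker factor, accumulates G @ G.T
        "R": np.zeros((n, n)),  # right Kronecker factor, accumulates G.T @ G
        "v": np.zeros((m, n)),  # Adam-style second moment, kept in the eigenbasis
    }


def eigenvalue_corrected_step(grad, state, beta2=0.999, eps=1e-8):
    """Return a preconditioned update direction for one gradient (a sketch)."""
    # Update the Shampoo factor matrices (moving averages are an assumption).
    state["L"] = beta2 * state["L"] + (1.0 - beta2) * grad @ grad.T
    state["R"] = beta2 * state["R"] + (1.0 - beta2) * grad.T @ grad

    # Eigenvectors of the symmetric factors define the eigenbasis.
    # Recomputing them every step is for simplicity only.
    _, q_left = np.linalg.eigh(state["L"])
    _, q_right = np.linalg.eigh(state["R"])

    # Rotate the gradient into the eigenbasis.
    rotated = q_left.T @ grad @ q_right

    # Track the second moment of the rotated gradient, as Adam would.
    # These values stand in for the corrected eigenvalues.
    state["v"] = beta2 * state["v"] + (1.0 - beta2) * rotated**2

    # Scale elementwise in the eigenbasis, then rotate back.
    scaled = rotated / (np.sqrt(state["v"]) + eps)
    return q_left @ scaled @ q_right.T


rng = np.random.default_rng(0)
state = init_state(3, 4)
update = eigenvalue_corrected_step(rng.standard_normal((3, 4)), state)
```

In this sketch, the elementwise second moment `v` replaces the Kronecker product of the factor eigenvalues as the scaling applied in the eigenbasis, which is the sense in which the eigenvalues are "approximately corrected".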
1 parent cc0a1ee, commit f3451cd
Showing 7 changed files with 1,731 additions and 336 deletions.