Add control over tolerance for failed amortized computations #64

runame · 2024-12-19T11:42:55Z

No description provided.

distributed_shampoo/utils/shampoo_preconditioner_list.py

distributed_shampoo/utils/tests/shampoo_preconditioner_list_test.py

facebook-github-bot · 2024-12-20T18:43:43Z

@tsunghsienlee has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2024-12-22T00:28:28Z

@tsunghsienlee merged this pull request in 0921f45.

tsunghsienlee · 2024-12-24T00:25:47Z

distributed_shampoo/tests/shampoo_types_test.py

+class ShampooPreconditionerConfigTest(
+    AbstractPreconditionerConfigTest.PreconditionerConfigTest[
+        Type[ShampooPreconditionerConfig]
+    ]
+):
+    def _get_preconditioner_config_type(
+        self,
+    ) -> Type[ShampooPreconditionerConfig]:
+        return ShampooPreconditionerConfig
+
+
+class EigenvalueCorrectedShampooPreconditionerConfigTest(
+    AbstractPreconditionerConfigTest.PreconditionerConfigTest[
+        Type[EigenvalueCorrectedShampooPreconditionerConfig]
+    ]
+):
+    def _get_preconditioner_config_type(
+        self,
+    ) -> Type[EigenvalueCorrectedShampooPreconditionerConfig]:
+        return EigenvalueCorrectedShampooPreconditionerConfig


Hi @runame ,

It seems these two tests are not picked up by test discovery: if you compare the CI runs before and after this pull request, the total number of tests run does not change. I believe this is because the two classes here are still treated as abstract classes, so they are never instantiated.

I think the two tests are actually discovered and run by the CI. Note that the "before" CI does not actually reflect the state before the two tests are added, since it is merely the PR that was created before it (#63 before #64). See the commit history for the actual order in which the PRs were merged. Now we can compare the CI before and after the two tests were added and see that the number of tests increased from 23 to 25.

Finally, I also verified locally that commenting out the two tests results in two fewer tests being run.

I see. I checked the commit history, and you are right: the two tests are discovered here.

Now it seems the issue is that the Meta-internal test discovery cannot find those two tests, so I was wondering whether the current setup is the culprit. If I run shampoo_types_test.py internally at Meta, it discovers only 5 test cases, but there should be 7.

Unfortunately, we might need a different setup to accommodate this.

Oh interesting. Do you have any idea why the tests are not discovered internally?

Generic is the reason: if we don't use it, the tests are discovered. However, I don't know how or why; I will create a minimal example with Generic to verify this.

#72 is a better design that resolves this, but we might want to figure this out in the future for curiosity's sake.
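For reference, a minimal sketch of the pattern under discussion: an abstract, `Generic`-parameterized test base nested inside a wrapper class, with concrete subclasses that only supply the config type. All names here (`BaseConfig`, `ConfigA`, `ConfigB`, `AbstractConfigTest`, `run_discovered_tests`) are hypothetical stand-ins, not the classes in this repository. It only shows what the standard `unittest` loader does with such classes; it does not reproduce or explain the internal discovery behavior.

```python
import unittest
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Generic, Type, TypeVar


@dataclass
class BaseConfig:
    """Hypothetical stand-in for a preconditioner config."""

    epsilon: float = 1e-12


@dataclass
class ConfigA(BaseConfig):
    pass


@dataclass
class ConfigB(BaseConfig):
    pass


ConfigTypeT = TypeVar("ConfigTypeT", bound=Type[BaseConfig])


class AbstractConfigTest:
    # The wrapper class is not a TestCase, so module-level discovery skips it
    # and never tries to instantiate the abstract base nested inside it.
    class ConfigTest(ABC, unittest.TestCase, Generic[ConfigTypeT]):
        @abstractmethod
        def _get_config_type(self) -> ConfigTypeT: ...

        def test_default_epsilon_is_positive(self) -> None:
            config = self._get_config_type()()
            self.assertGreater(config.epsilon, 0.0)


class ConfigATest(AbstractConfigTest.ConfigTest[Type[ConfigA]]):
    def _get_config_type(self) -> Type[ConfigA]:
        return ConfigA


class ConfigBTest(AbstractConfigTest.ConfigTest[Type[ConfigB]]):
    def _get_config_type(self) -> Type[ConfigB]:
        return ConfigB


def run_discovered_tests(*test_classes: type) -> unittest.TestResult:
    """Load the given TestCase classes with the standard loader and run them."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite(
        loader.loadTestsFromTestCase(test_class) for test_class in test_classes
    )
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    result = run_discovered_tests(ConfigATest, ConfigBTest)
    print(f"tests run: {result.testsRun}, successful: {result.wasSuccessful()}")
```

With the standard loader, each concrete subclass contributes its inherited test method, which matches the count going up by two in the public CI. A runner that inspects classes differently could still behave differently.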

Summary: The current tests for configs with subclasses rely on explicitly instantiating each subclass. This approach has two limitations: 1. It is hard to catch newly added subclasses. 2. Due to some unknown interaction with `typing.Generic`, [the current `buck` test discovery mechanism is not able to discover the tests](facebookresearch#64 (comment)). To resolve this, this diff refactors those tests to use [`type.__subclasses__()`](https://docs.python.org/3/reference/datamodel.html#type.__subclasses__) to test the configs with subclasses. Differential Revision: D67652761

Summary: Pull Request resolved: #72 The current tests for configs with subclasses rely on explicitly instantiating each subclass. This approach has two limitations: 1. It is hard to catch newly added subclasses. 2. Due to some unknown interaction with `typing.Generic`, [the current `buck` test discovery mechanism is not able to discover the tests](#64 (comment)). To resolve this, this diff refactors those tests to use [`type.__subclasses__`](https://docs.python.org/3/reference/datamodel.html#type.__subclasses__) to test the configs with subclasses. Reviewed By: anana10c Differential Revision: D67652761 fbshipit-source-id: 58d3131052f0f0d01e37c49506761209f069e38a
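A rough sketch of what a `__subclasses__()`-driven test could look like, assuming hypothetical config classes (`BaseConfig`, `ConfigA`, `ConfigB`, `ConfigC`) and a hypothetical helper `all_subclasses`; this is not the code from #72. Because `__subclasses__()` returns only direct subclasses, the helper recurses so that indirect subclasses are covered as well.

```python
import unittest
from dataclasses import dataclass
from typing import Iterator


@dataclass
class BaseConfig:
    """Hypothetical base config; validation is inherited by every subclass."""

    epsilon: float = 1e-12

    def __post_init__(self) -> None:
        if self.epsilon <= 0.0:
            raise ValueError(f"Invalid epsilon value: {self.epsilon}")


@dataclass
class ConfigA(BaseConfig):
    pass


@dataclass
class ConfigB(BaseConfig):
    pass


@dataclass
class ConfigC(ConfigA):  # indirect subclass of BaseConfig
    pass


def all_subclasses(cls: type) -> Iterator[type]:
    """Yield every direct and indirect subclass of cls."""
    for subclass in cls.__subclasses__():
        yield subclass
        yield from all_subclasses(subclass)


class ConfigSubclassesTest(unittest.TestCase):
    def test_non_positive_epsilon_is_rejected(self) -> None:
        # One plain TestCase covers all subclasses, including ones added
        # later, and each subclass is reported separately via subTest.
        for config_cls in all_subclasses(BaseConfig):
            with self.subTest(config_cls=config_cls.__name__):
                self.assertRaises(ValueError, config_cls, epsilon=0.0)
```

The test class is an ordinary `unittest.TestCase` with no `Generic` base, so it sidesteps the discovery question. The trade-off is that the number of reported test cases no longer grows with the number of config subclasses.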

runame added 22 commits December 6, 2024 15:27

Refactor matrix functions types

55f1413

Refactor Shampoo types

9291d15

Adjust UI and docs

dac3ac1

Replace preconditioner_computation_config with preconditioner_config

7e20cf3

Fix docstring

1bbaa2f

Add tolerance for amortized computation failures

32c5df5

Add test for amortized computation failure tolerance

65f4f27

Adjust abstractmethod test

9a137b5

Add check that tolerance value non-negative

03245f5

Make failure tracking coarser

527c35e

Reduce code duplication

01dde5f

Merge branch 'main' into configs-refactor

6d9810b

Set default values

d3a10af

Merge branch 'configs-refactor' into fail-counter

709ab1c

Merge branch 'fail-counter' into fail-counter-v2

196a781

Fix defaults with default_factory

273e0a1

Merge branch 'configs-refactor' into fail-counter

d53ffc6

Merge branch 'fail-counter' into fail-counter-v2

8f19d2f

Improve naming

a92348d

Simplify test

736c76d

Merge branch 'main' into fail-counter-v2

54b8879

Fix test

cf08da0

runame added the enhancement label Dec 19, 2024

runame requested a review from tsunghsienlee December 19, 2024 11:42

runame self-assigned this Dec 19, 2024

facebook-github-bot added the CLA Signed label Dec 19, 2024

tsunghsienlee reviewed Dec 19, 2024

runame added 3 commits December 19, 2024 19:49

Use keywords explicitly

5b37d84

Merge branch 'main' into fail-counter-v2

9e0e46e

Revert outdated change

7793429

runame added 2 commits December 19, 2024 20:37

Simplify no warnings assertion

98051d2

Remove leftover variable

8ea571e

runame requested a review from tsunghsienlee December 19, 2024 20:54

runame added 3 commits December 20, 2024 16:56

Improve readability of call count check

c098c6a

Merge branch 'main' into fail-counter-v2

701e5e9

Further improve readability of test

16853ea

tsunghsienlee approved these changes Dec 20, 2024

facebook-github-bot closed this in 0921f45 Dec 22, 2024

facebook-github-bot added the Merged label Dec 22, 2024

runame deleted the fail-counter-v2 branch December 22, 2024 00:44

tsunghsienlee reviewed Dec 24, 2024

tsunghsienlee mentioned this pull request Dec 26, 2024

Leverage __subclasses__() to improve configs test #72

Closed

runame commented Dec 19, 2024

facebook-github-bot commented Dec 20, 2024

facebook-github-bot commented Dec 22, 2024

tsunghsienlee Dec 24, 2024

runame Dec 24, 2024

tsunghsienlee Dec 24, 2024

runame Dec 25, 2024

tsunghsienlee Dec 26, 2024

tsunghsienlee Dec 26, 2024
