Activation Hessian computation runtime optimization #1092

ofirgo · 2024-06-02T14:28:54Z

Pull Request Description:

Improve Activation Hessian computation runtime for GPTQ and Mixed precision with the following optimizations:

Enable batch computation.
Enable computation on a set of nodes (instead of a single node only).
Other minor loop and implementation modifications.

Major design remarks:

TraceHessianRequest receives a list of target nodes (target_nodes) instead of a single BaseNode.
HessianInfoService produces a batch of samples for each computation iteration. We also fixed a bug where the HessianInfoService representative dataset generator was re-initialized on every iteration instead of being consumed from end to end.
- The output of the "fetch" call of HessianInfoService has the structure: List (per target node) of List (per image) of Hessian approximations (tensors).
- The service's cache mechanism still saves results per single-node request, so a multi-node request is split into per-node requests in order to save and retrieve the results of each node.
- Batch computation requires keeping track of the remaining samples from a given representative dataset batch in case the requested Hessian computation batch is smaller, so that no samples are thrown away.
- In addition, we assume that the Hessian computation batch size is less than or equal to the representative dataset batch size.
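The sample bookkeeping described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `BatchedSampleFeeder`, `next_batch`, and the toy `representative_dataset` are hypothetical names, and integers stand in for image tensors. The generator is created once and consumed end to end, and samples left over from a dataset batch are kept for the next Hessian batch.

```python
from typing import Callable, Iterator, List


class BatchedSampleFeeder:
    """Hypothetical helper (not MCT API): serves fixed-size batches from a
    representative dataset generator and keeps unused samples for later."""

    def __init__(self, representative_dataset: Callable[[], Iterator[List[int]]]):
        # Create the generator once and consume it end to end. Re-creating it
        # on every iteration would keep returning the first dataset batch.
        self._gen = representative_dataset()
        self._leftover: List[int] = []

    def next_batch(self, hessian_batch_size: int) -> List[int]:
        samples = self._leftover
        # Pull dataset batches until there are enough samples or data runs out.
        while len(samples) < hessian_batch_size:
            try:
                samples = samples + next(self._gen)
            except StopIteration:
                break
        # Return the requested amount and keep the remainder for the next call.
        self._leftover = samples[hessian_batch_size:]
        return samples[:hessian_batch_size]


def representative_dataset() -> Iterator[List[int]]:
    # Toy dataset: two batches of four "images" each.
    yield [0, 1, 2, 3]
    yield [4, 5, 6, 7]


feeder = BatchedSampleFeeder(representative_dataset)
batches = [feeder.next_batch(3) for _ in range(3)]
```

With a Hessian batch size of 3 and dataset batches of 4, the fourth sample of the first dataset batch is carried over into the second Hessian batch rather than dropped.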
Weights Hessian computation is still limited to a single image and per-node computation.
Hessian results tensor includes a batch dimension.
Default values for the number of samples and number of iterations for Hessians computation for GPTQ and Mixed precision have been modified.
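To illustrate the "fetch" output structure and the per-node cache split mentioned in the design remarks, here is a small sketch under stated assumptions. `Request`, `PerNodeCache`, and `compute_for_nodes` are hypothetical stand-ins rather than the actual TraceHessianRequest or HessianInfoService classes, floats stand in for Hessian approximation tensors, and the sketch assumes every fetch asks for the same number of images.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Request:
    """Hypothetical stand-in for a Hessian request over several target nodes."""
    mode: str
    target_nodes: Tuple[str, ...]


def compute_for_nodes(nodes: Tuple[str, ...], n_images: int) -> List[List[float]]:
    # Placeholder computation: one value per (node, image) pair.
    return [[float(len(node) + i) for i in range(n_images)] for node in nodes]


class PerNodeCache:
    def __init__(self) -> None:
        # Results are stored per single-node request.
        self._cache: Dict[Request, List[float]] = {}
        self.computed: List[str] = []  # records which nodes were computed

    def fetch(self, request: Request, n_images: int) -> List[List[float]]:
        # Split the multi-node request into single-node requests.
        single = [Request(request.mode, (n,)) for n in request.target_nodes]
        missing = tuple(r.target_nodes[0] for r in single if r not in self._cache)
        if missing:
            # Compute all missing nodes together, then store each result per node.
            results = compute_for_nodes(missing, n_images)
            for node, res in zip(missing, results):
                self._cache[Request(request.mode, (node,))] = res
                self.computed.append(node)
        # Output: list (per target node) of list (per image) of approximations.
        return [self._cache[r] for r in single]


cache = PerNodeCache()
first = cache.fetch(Request("activation", ("conv1", "fc")), n_images=2)
second = cache.fetch(Request("activation", ("fc", "conv2")), n_images=2)
```

In the second call only `conv2` is computed, since the result for `fc` is retrieved from the per-node cache.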

Checklist before requesting a review:

I set the appropriate labels on the pull request.
I have added/updated the release note draft (if necessary).
I have updated the documentation to reflect my changes (if necessary).
All functions and files are well documented.
All functions and classes have type hints.
There is a license in all files.
The function and variable names are informative.
I have checked for code duplications.
I have added new unit tests (if necessary).

model_compression_toolkit/core/common/hessian/hessian_info_service.py

model_compression_toolkit/core/keras/hessian/activation_trace_hessian_calculator_keras.py

reuvenperetz · 2024-06-05T10:00:17Z

model_compression_toolkit/core/common/mixed_precision/mixed_precision_quantization_config.py

@@ -44,6 +45,7 @@ def __init__(self,
            norm_scores (bool): Whether to normalize the returned scores for the weighted distance metric (to get values between 0 and 1).
            refine_mp_solution (bool): Whether to try to improve the final mixed-precision configuration using a greedy algorithm that searches layers to increase their bit-width, or not.
            metric_normalization_threshold (float): A threshold for checking the mixed precision distance metric values, In case of values larger than this threshold, the metric will be scaled to prevent numerical issues.
+            hessian_batch_size (int): The Hessian computation batch size. used only if using mixed precision with Hessian-based objective.


Used, but not crucial.

Ofir Gordon added 4 commits May 29, 2024 16:38

Implement activation Hessian computation runtime optimizations in pyt…

7f17b06

…orch (WIP)

Merge branch 'main' into act-hessian-runtime-torch

5ff5bbd

Modification for keras hessians

9bd074d

Organize code and documentation

e359768

github-actions bot added auto:core auto:gptq auto:tests labels Jun 2, 2024

Ofir Gordon added 7 commits June 2, 2024 17:30

fix

f545f5d

fix pruning mean over node score results

222e44e

fix pytorch pruning tests

ef156de

fix typehint

f272e22

Merge branch 'main' into act-hessian-runtime-torch

caf821c

Add pragma to some error messages

12980d3

Fix random fails in mixed precision test.

ff21edf

ofirgo requested a review from reuvenperetz June 4, 2024 05:41

reuvenperetz reviewed Jun 4, 2024

ofirgo added the pr: refactoring/code cleanup label Jun 5, 2024

PR fixes

a9c83e4

reuvenperetz approved these changes Jun 5, 2024

Ofir Gordon added 3 commits June 5, 2024 10:58

add missing hessian service tests

666a8f1

get hessian batch size from config in mp and gptq

f622d57

adding an option to create gptq config with given hessian batch size

88ee7d4

reuvenperetz reviewed Jun 5, 2024

reuvenperetz approved these changes Jun 5, 2024

ofirgo merged commit d13319f into sony:main Jun 5, 2024
27 checks passed
