Add: targets and ignore inference for sparse compression #191
Conversation
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.
Looks good! I'd prefer to call "layers" modules, since "layers" is already used to describe transformer blocks, but I think readers will still understand. Let's be consistent and use
Force-pushed from 586d8dd to 5e00111
Nice, LGTM!
@rahul-tuli The Sparsification tests are failing with a different error from the shared memory issue we've been seeing. Can you look into it?
edd1e43
Fixed!
This PR adds support for automatically inferring targets and ignore lists for sparse compression.
The generated lists are used to ignore modules such as layernorms and embeddings.
The logic for adding to targets: a module is added to the targets list if it was sparsified; otherwise it's added to the ignore list.
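The inference step described above could be sketched roughly like this. This is a hypothetical, torch-free sketch: the function name `infer_targets_and_ignore`, the module names/types, and the 50% sparsity threshold are all illustrative assumptions, not llm-compressor's actual API.

```python
# Hypothetical sketch of targets/ignore inference; the real code walks a
# torch model's modules. Names and the threshold below are assumptions.

SPARSITY_THRESHOLD = 0.5  # assumed cutoff for "this module was sparsified"

def infer_targets_and_ignore(modules):
    """modules: iterable of (name, type_name, weight_sparsity) tuples."""
    targets, ignore = [], []
    for name, type_name, sparsity in modules:
        if sparsity >= SPARSITY_THRESHOLD:
            targets.append(name)  # sparsified -> compress it
        else:
            ignore.append(name)   # e.g. layernorms, embeddings
    return targets, ignore

modules = [
    ("model.layers.0.mlp.down_proj", "Linear", 0.60),
    ("model.layers.0.input_layernorm", "LayerNorm", 0.0),
    ("model.embed_tokens", "Embedding", 0.0),
]
targets, ignore = infer_targets_and_ignore(modules)
```

With the sample modules above, only the sparsified linear layer lands in targets, while the layernorm and embedding fall into ignore.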
Note: We also perform a reduction of the targets and ignore lists to make them cleaner and easily readable inside config.json.

After review comments we were able to simplify the compression_config inside config.json even further: we now only include modules in the ignore list if they are targeted.
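The reduction described above could be sketched as follows. This is a hypothetical sketch, not the PR's actual implementation: the idea is to collapse per-module target names into module-type names, and to keep an entry in the ignore list only when its type is targeted. The function name `reduce_lists` and all module names are illustrative.

```python
# Hypothetical sketch of the targets/ignore reduction. All names below
# are illustrative assumptions, not llm-compressor's actual API.

def reduce_lists(targets, ignore, module_types):
    """module_types maps module name -> type name (e.g. "Linear")."""
    targeted_types = {module_types[name] for name in targets}
    reduced_targets = sorted(targeted_types)
    # An ignored module is only worth listing if its type is targeted;
    # e.g. a LayerNorm need not appear once "Linear" is the only target.
    reduced_ignore = sorted(
        name for name in ignore if module_types[name] in targeted_types
    )
    return reduced_targets, reduced_ignore

module_types = {
    "model.layers.0.mlp.down_proj": "Linear",
    "model.layers.0.mlp.up_proj": "Linear",
    "lm_head": "Linear",
    "model.layers.0.input_layernorm": "LayerNorm",
}
targets = ["model.layers.0.mlp.down_proj", "model.layers.0.mlp.up_proj"]
ignore = ["lm_head", "model.layers.0.input_layernorm"]
reduced_targets, reduced_ignore = reduce_lists(targets, ignore, module_types)
```

Under these assumptions the compression_config would only need to record something like `{"targets": ["Linear"], "ignore": ["lm_head"]}` instead of enumerating every module name.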
Note: relies on neuralmagic/compressed-tensors#159