
Add: targets and ignore inference for sparse compression #191

Merged: 8 commits into main, Sep 24, 2024

Conversation

@rahul-tuli (Collaborator) commented Sep 20, 2024

This PR adds support for automatically inferring the targets and ignore lists for sparse compression.

The generated lists are used to ignore modules such as layernorms and embeddings.

The logic for adding a module to targets:

  • the module must have sparsity > threshold (set to 20% for now)
  • the module must follow the defined sparsity structure

Otherwise it is added to the ignore list.

Note: We also reduce the targets and ignore lists to make them cleaner and more readable inside config.json.
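A minimal sketch of the inference rules above, using plain numpy arrays in place of torch modules; the names `infer_targets_and_ignore` and `SPARSITY_THRESHOLD` are illustrative, not the actual llm-compressor API:

```python
import numpy as np

SPARSITY_THRESHOLD = 0.20  # "sparsity > threshold (set to 20% for now)"

def sparsity(weight: np.ndarray) -> float:
    """Fraction of exactly-zero elements in a weight tensor."""
    return float((weight == 0).mean())

def infer_targets_and_ignore(weights: dict) -> tuple:
    """Split module names into compression targets and an ignore list.

    A module becomes a target when its sparsity exceeds the threshold
    (the real logic additionally checks that the module matches the
    configured sparsity structure, e.g. "2:4"); everything else goes
    into the ignore list.
    """
    targets, ignore = [], []
    for name, weight in weights.items():
        if sparsity(weight) > SPARSITY_THRESHOLD:
            targets.append(name)
        else:
            ignore.append(name)
    return targets, ignore
```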

The compression_config inside config.json will now look like:

"compression_config": {
    "sparsity_config": {
      "format": "sparse-bitmask",
      "global_sparsity": 0.12233772669884009,
      "ignore": [
        "Embedding",
        "lm_head",
        "LlamaRotaryEmbedding",
        "SiLU",
        "LlamaRMSNorm"
      ],
      "registry_requires_subclass": false,
      "sparsity_structure": "2:4",
      "targets": [
        "Linear"
      ]
    }
  },
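The `"2:4"` sparsity_structure above means every contiguous group of four weights contains at least two zeros. A structure check along those lines might look like this (hypothetical helper, not the library's implementation; assumes the weight's size is divisible by 4):

```python
import numpy as np

def matches_2_4(weight: np.ndarray) -> bool:
    """Check the N:M = 2:4 pattern: every group of 4 consecutive
    weights contains at least 2 zeros."""
    groups = weight.reshape(-1, 4)             # assumes size % 4 == 0
    zeros_per_group = (groups == 0).sum(axis=1)
    return bool((zeros_per_group >= 2).all())
```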

After review comments we were able to further simplify compression_config: modules are now included in the ignore list only if they would otherwise be targeted. The compression config now looks like:

  "compression_config": {
    "sparsity_config": {
      "format": "dense",
      "global_sparsity": 0.12233772669884009,
      "ignore": [
        "lm_head"
      ],
      "registry_requires_subclass": false,
      "sparsity_structure": "2:4",
      "targets": [
        "Linear"
      ]
    }
  }
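The reduction described above, collapsing long per-module name lists into short class-name entries like "Linear" while keeping explicit names (e.g. "lm_head") only for targeted-class modules that must still be ignored, can be sketched as follows (hypothetical helper; a name-to-class mapping stands in for walking the real model):

```python
def reduce_to_class_names(module_classes: dict, targets: list, ignore: list) -> tuple:
    """Collapse targets to class names and keep an ignored module only
    when its class is otherwise targeted, mirroring the simplified
    compression_config shown above."""
    target_classes = {module_classes[name] for name in targets}
    reduced_ignore = [
        name for name in ignore if module_classes[name] in target_classes
    ]
    return sorted(target_classes), reduced_ignore
```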

Note: relies on neuralmagic/compressed-tensors#159


👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

kylesayrs
kylesayrs previously approved these changes Sep 20, 2024
Collaborator

@kylesayrs kylesayrs left a comment


Looks good! I'd prefer to call "layers" modules, since layers is already used to describe transformer blocks, but I think readers still understand.

@rahul-tuli
Collaborator Author

> Looks good! I'd prefer to call "layers" modules, since layers is already used to describe transformer blocks, but I think readers still understand.

Let's be consistent and use module/submodule over layer wherever possible

kylesayrs
kylesayrs previously approved these changes Sep 20, 2024
@rahul-tuli force-pushed the add-ignore-and-targets-for-compression branch from 586d8dd to 5e00111 on September 23, 2024 14:01
horheynm
horheynm previously approved these changes Sep 23, 2024
mgoin
mgoin previously approved these changes Sep 23, 2024
Member

@mgoin mgoin left a comment


Nice LGTM

@rahul-tuli rahul-tuli dismissed stale reviews from horheynm and mgoin via fc8cf83 September 23, 2024 23:09
kylesayrs
kylesayrs previously approved these changes Sep 24, 2024
mgoin
mgoin previously approved these changes Sep 24, 2024
horheynm
horheynm previously approved these changes Sep 24, 2024
@dsikka
Collaborator

dsikka commented Sep 24, 2024

@rahul-tuli The Sparsification tests are failing with a different error than the shared memory issue we've been seeing. Can you look into it?

@rahul-tuli rahul-tuli dismissed stale reviews from horheynm, mgoin, and kylesayrs via edd1e43 September 24, 2024 14:23
@rahul-tuli
Collaborator Author

> @rahul-tuli The Sparsification tests are failing with a different error than the shared memory issue we've been seeing. Can you look into it?

Fixed!

@dsikka dsikka merged commit 7d5a0b6 into main Sep 24, 2024
6 of 7 checks passed
@dsikka dsikka deleted the add-ignore-and-targets-for-compression branch September 24, 2024 15:31
6 participants