Add: targets and ignore inference for sparse compression #191
Conversation
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.
Looks good! I'd prefer to call "layers" modules, since "layers" is already used to describe transformer blocks, but I think readers will still understand. Let's be consistent and use
Force-pushed from 586d8dd to 5e00111
Nice, LGTM!
@rahul-tuli The Sparsification tests are failing with a different error from the shared memory issue we've been seeing. Can you look into it?
edd1e43
Fixed!
This PR adds support for automatically inferring targets and ignore lists for sparse compression.
The generated lists are used to ignore modules such as layernorms and embeddings.
The logic for adding to targets: a module is added to the targets list if it was sparsified; otherwise it's added to the ignore list.
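The inference step described above could be sketched roughly like this. This is a hypothetical, torch-free sketch: the function name `infer_targets_and_ignore`, the module names/types, and the 50% sparsity threshold are all illustrative assumptions, not llm-compressor's actual API.

```python
# Hypothetical sketch of targets/ignore inference; the real code walks a
# torch model's modules. Names and the threshold below are assumptions.

SPARSITY_THRESHOLD = 0.5  # assumed cutoff for "this module was sparsified"

def infer_targets_and_ignore(modules):
    """modules: iterable of (name, type_name, weight_sparsity) tuples."""
    targets, ignore = [], []
    for name, type_name, sparsity in modules:
        if sparsity >= SPARSITY_THRESHOLD:
            targets.append(name)  # sparsified -> compress it
        else:
            ignore.append(name)   # e.g. layernorms, embeddings
    return targets, ignore

modules = [
    ("model.layers.0.mlp.down_proj", "Linear", 0.60),
    ("model.layers.0.input_layernorm", "LayerNorm", 0.0),
    ("model.embed_tokens", "Embedding", 0.0),
]
targets, ignore = infer_targets_and_ignore(modules)
```

With the sample modules above, only the sparsified linear layer lands in targets, while the layernorm and embedding fall into ignore.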
Note: We also perform a reduction of the targets and ignore lists to make them cleaner and easily readable inside config.json.

After review comments we were able to simplify the compression_config inside config.json even further: we now only include modules in the ignore list if they are targeted.
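The reduction described above could be sketched as follows. This is a hypothetical sketch, not the PR's actual implementation: the idea is to collapse per-module target names into module-type names, and to keep an entry in the ignore list only when its type is targeted. The function name `reduce_lists` and all module names are illustrative.

```python
# Hypothetical sketch of the targets/ignore reduction. All names below
# are illustrative assumptions, not llm-compressor's actual API.

def reduce_lists(targets, ignore, module_types):
    """module_types maps module name -> type name (e.g. "Linear")."""
    targeted_types = {module_types[name] for name in targets}
    reduced_targets = sorted(targeted_types)
    # An ignored module is only worth listing if its type is targeted;
    # e.g. a LayerNorm need not appear once "Linear" is the only target.
    reduced_ignore = sorted(
        name for name in ignore if module_types[name] in targeted_types
    )
    return reduced_targets, reduced_ignore

module_types = {
    "model.layers.0.mlp.down_proj": "Linear",
    "model.layers.0.mlp.up_proj": "Linear",
    "lm_head": "Linear",
    "model.layers.0.input_layernorm": "LayerNorm",
}
targets = ["model.layers.0.mlp.down_proj", "model.layers.0.mlp.up_proj"]
ignore = ["lm_head", "model.layers.0.input_layernorm"]
reduced_targets, reduced_ignore = reduce_lists(targets, ignore, module_types)
```

Under these assumptions the compression_config would only need to record something like `{"targets": ["Linear"], "ignore": ["lm_head"]}` instead of enumerating every module name.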
Note: relies on neuralmagic/compressed-tensors#159