
[Bugfix] SparseGPT, Pipelines #1130

Merged 15 commits from kylesayrs/fix-sgpt-targets into main on Feb 11, 2025

Conversation

@kylesayrs (Collaborator) commented Feb 7, 2025

Purpose

Changes

  • SparseGPT
    • Fully separate targets and sequential_targets (see the first sketch after this list)
      • Modify hook-adding logic to reflect this change
    • Fix behavior of _infer_owl_layer_sparsity and add test
    • Code clarity
      • Add additional type hints
      • Designate calibrate_module as an abstract method on the sgpt mixin
  • Pipelines
    • Sequential pipeline: unwrap the model's forward function to avoid issues with PyTorch function patching
    • Layer sequential pipeline: add maybe_inject_pos_embeddings to hackily support models with position_embeddings (see the second sketch after this list)
    • Basic pipeline: fix on_sequential_batch_end to be called at the end of the epoch rather than after every batch (see the third sketch after this list)
      • Calling it on every batch was likely causing slowdowns
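A hedged sketch of the targets/sequential_targets separation described above, using a toy model in place of a real transformer; everything here besides the two option names is illustrative, not the actual source:

```python
import torch
from torch import nn

# Toy model standing in for a transformer: two "decoder layers",
# each containing Linear submodules.
model = nn.Sequential(
    nn.Sequential(nn.Linear(8, 8), nn.ReLU()),
    nn.Sequential(nn.Linear(8, 8), nn.ReLU()),
)

targets = (nn.Linear,)                 # modules that receive calibration hooks
sequential_targets = (nn.Sequential,)  # blocks that partition the model

def calibrate_module(module, args, output):
    # Placeholder for SparseGPT's per-module calibration work
    pass

# Hooks are attached to `targets` only...
handles = [
    m.register_forward_hook(calibrate_module)
    for m in model.modules()
    if isinstance(m, targets)
]

# ...while `sequential_targets` only defines the sequential partitions.
partitions = [m for m in model.children() if isinstance(m, sequential_targets)]
```

A minimal sketch of what maybe_inject_pos_embeddings could look like, assuming it inspects a layer's forward signature and passes position_embeddings through only when the layer declares that argument; this follows the PR description, not the actual implementation:

```python
import inspect
from typing import Any, Dict

import torch

def maybe_inject_pos_embeddings(
    kwargs: Dict[str, Any],
    layer: torch.nn.Module,
    position_embeddings: Any,
) -> Dict[str, Any]:
    # Newer decoder layers may take `position_embeddings` as an explicit
    # forward argument; inject it only when the signature declares it.
    params = inspect.signature(layer.forward).parameters
    if "position_embeddings" in params and position_embeddings is not None:
        kwargs["position_embeddings"] = position_embeddings
    return kwargs
```

And a before/after sketch of the basic pipeline fix, with hypothetical `batches` and `modifier` objects:

```python
# Before (hypothetical): the callback fired after every calibration batch,
# likely re-running compression far more often than needed.
for batch in batches:
    model(**batch)
    modifier.on_sequential_batch_end()

# After (hypothetical): calibrate on all batches, then fire the callback once.
for batch in batches:
    model(**batch)
modifier.on_sequential_batch_end()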

Followups

  • Remove deprecated sequential_update option from examples and tests
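For illustration, a hedged sketch of what an example looks like once the deprecated option is dropped; the parameter values are representative of the 2:4 examples, not copied from a specific file:

```python
from llmcompressor.modifiers.obcq import SparseGPTModifier

# Sketch: the modifier is constructed without the deprecated
# `sequential_update` argument; sequential behavior is now governed by
# the pipeline and `sequential_targets` instead. Values are illustrative.
modifier = SparseGPTModifier(
    sparsity=0.5,
    mask_structure="2:4",
    targets=["Linear"],
    ignore=["lm_head"],
    # sequential_update=True,  # deprecated: remove from examples and tests
)
```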

Testing

  • Added tests/llmcompressor/transformers/obcq/test_obcq_owl.py
  • Tested OBCQ+llama with sequential, layer sequential, and basic pipelines independently
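The OWL test is new in this PR. As a hedged illustration of the kind of property such a test can check (this approximates an OWL-style allocation and is not the actual contents of test_obcq_owl.py), layers with more outliers should be kept denser while the mean sparsity matches the global target:

```python
import numpy as np

def infer_owl_layer_sparsity(outlier_ratios, target_sparsity, lam=0.08):
    """Hypothetical OWL-style allocation: per-layer deviations from the
    target are bounded by +/- lam and the mean sparsity matches the target;
    layers with higher outlier ratios receive lower sparsity."""
    ratios = np.asarray(outlier_ratios, dtype=float)
    # Scale outlier ratios into [0, 2*lam], then center so offsets sum to zero
    scaled = (ratios - ratios.min()) / (ratios.max() - ratios.min()) * (2 * lam)
    offsets = scaled - scaled.mean()
    # Higher outlier ratio -> lower sparsity (keep more weights)
    return target_sparsity - offsets

def test_owl_sparsities_average_to_target():
    sparsities = infer_owl_layer_sparsity([0.1, 0.4, 0.9, 0.2], 0.5)
    assert abs(sparsities.mean() - 0.5) < 1e-6
    # The layer with the most outliers should be the densest
    assert sparsities.argmin() == 2
```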

Regression Evaluations

Models were compressed using examples/sparse_2of4_quantization_fp8/llama3_8b_2of4.py without the fp8 option.
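The result blocks below are lm-evaluation-harness output. A hedged reconstruction of how they could be reproduced through the harness's Python API (the exact invocation is not shown in the PR; the model path is a placeholder):

```python
import lm_eval

results = lm_eval.simple_evaluate(
    model="vllm",
    # Placeholder path; substitute the compressed model directory
    model_args="pretrained=<compressed-model-path>,dtype=bfloat16,add_bos_token=True",
    tasks=["winogrande"],
    num_fewshot=5,
    batch_size=1,
)
print(results["results"]["winogrande"])
```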

sparsegpt

Main

vllm (pretrained=/home/kyle/llm-compressor/Meta-Llama-3-8B-InstructSparseGPTModifierMAIN,dtype=bfloat16,add_bos_token=True), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: 1
|  Tasks   |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|----------|------:|------|-----:|------|---|-----:|---|-----:|
|winogrande|      1|none  |     5|acc   |↑  |0.6243|±  |0.0136|

This branch

vllm (pretrained=/home/kyle/llm-compressor/Meta-Llama-3-8B-InstructSparseGPTModifierFEATURE,dtype=bfloat16,add_bos_token=True), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: 1
|  Tasks   |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|----------|------:|------|-----:|------|---|-----:|---|-----:|
|winogrande|      1|none  |     5|acc   |↑  |0.6306|±  |0.0136|

To test Wanda, the SparseGPTModifier was replaced with the WandaPruningModifier.

wanda

Main

vllm (pretrained=/home/kyle/llm-compressor/Meta-Llama-3-8B-InstructWandaPruningModifierMAIN,dtype=bfloat16,add_bos_token=True), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: 1
|  Tasks   |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|----------|------:|------|-----:|------|---|-----:|---|-----:|
|winogrande|      1|none  |     5|acc   |↑  |0.5912|±  |0.0138|

This branch

vllm (pretrained=/home/kyle/llm-compressor/Meta-Llama-3-8B-InstructWandaPruningModifierFEATURE,dtype=bfloat16,add_bos_token=True), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: 1
|  Tasks   |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|----------|------:|------|-----:|------|---|-----:|---|-----:|
|winogrande|      1|none  |     5|acc   |↑  |0.5817|±  |0.0139|

@kylesayrs kylesayrs marked this pull request as draft February 7, 2025 20:23
github-actions bot commented Feb 7, 2025

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite; please only add the label once the PR is code complete and local testing has been performed.

@kylesayrs kylesayrs marked this pull request as ready for review February 7, 2025 23:48
@dsikka (Collaborator) left a comment

I don't agree with the notion that adding the same variables through inheritance helps with code clarity or user readability.

Updating the docstring in the inherited classes would be sufficient rather than relying on redundant code.

Review threads (all resolved): src/llmcompressor/pipelines/basic/pipeline.py, src/llmcompressor/modifiers/obcq/base.py (outdated), src/llmcompressor/modifiers/pruning/wanda/base.py (outdated), src/llmcompressor/utils/helpers.py
@dsikka dsikka requested a review from rahul-tuli February 9, 2025 20:06
@kylesayrs kylesayrs added the ready When a PR is ready for review label Feb 9, 2025
@kylesayrs kylesayrs requested a review from dsikka February 9, 2025 21:31
@rahul-tuli (Collaborator) left a comment

LGTM! (pending tests)

@dsikka dsikka enabled auto-merge (squash) February 11, 2025 01:21
@dsikka dsikka merged commit b55ec42 into main Feb 11, 2025
7 checks passed
@dsikka dsikka deleted the kylesayrs/fix-sgpt-targets branch February 11, 2025 02:31