Remove SparseAutoModelForCausalLM #832
Conversation
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.
Can you provide examples of the different loading scenarios (previously quantized models / legacy compression_config models) so we can understand how the interface should work? A sketch of both scenarios follows the list below.
In general I think we still have to make updates in the following areas:
- save_pretrained should only be responsible for saving things to disk; the ModelCompressor pathway should live outside of it
- It seems like we're still using the SparseAutoModel pathway in a couple of places?
- Is there any more clean-up we can do in loading/saving?
- Confirm that the loading we're doing works for the modifier support @kylesayrs is adding for multi-modal cases
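A minimal sketch of the two loading scenarios asked about above; the model paths are hypothetical, and the translation of the legacy config key is an assumption rather than confirmed behavior:

```python
from transformers import AutoModelForCausalLM

# Scenario 1: a model quantized with a current release. The
# quantization_config embedded in config.json should be picked up on load,
# with no SparseAutoModelForCausalLM involved.
model = AutoModelForCausalLM.from_pretrained(
    "path/to/previously-quantized-model",  # hypothetical path
    torch_dtype="auto",
)

# Scenario 2: a legacy checkpoint whose config still carries the old
# compression_config key. Ideally the same call works, with the loader
# translating the legacy key into the current quantization_config format.
legacy_model = AutoModelForCausalLM.from_pretrained(
    "path/to/legacy-compression-config-model",  # hypothetical path
)
```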
src/llmcompressor/transformers/sparsification/compressed_tensors_utils.py
We should update the e2e/example tests and other CI tests as well.
With respect to the functionality, the PR is ready. Remaining to-dos:
Strategy: after oneshot is called, call save_pretrained(SAVE_DIR), then extract the state dict from the model for testing.
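A sketch of that test strategy, assuming the usual oneshot entrypoint; the model name, recipe, dataset, and save directory are placeholder fixtures:

```python
from transformers import AutoModelForCausalLM
from llmcompressor.transformers import oneshot

SAVE_DIR = "./oneshot-output"  # hypothetical output directory

model = AutoModelForCausalLM.from_pretrained("stub/tiny-test-model")
oneshot(model=model, recipe="test_recipe.yaml", dataset="open_platypus")

# After oneshot, save to disk and extract the state dict for comparison
model.save_pretrained(SAVE_DIR)
state_dict = {k: v.cpu() for k, v in model.state_dict().items()}
```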
src/llmcompressor/transformers/sparsification/compressed_tensors_utils.py
@horheynm There are new, consistent test failures across the different test cases.
tests/llmcompressor/transformers/finetune/test_oneshot_and_finetune.py
- deprecate SparseAutoModel
- add back saving to the oneshot pathway, or at least remove the argument and remove it from the examples
- remove the breakpoint
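For the second bullet, a rough sketch of what adding saving back to the oneshot pathway could look like; the signature and the _apply_recipe helper are illustrative, not the actual API:

```python
def _apply_recipe(model, recipe, **kwargs):
    """Hypothetical stand-in for the actual compression logic."""
    ...

def oneshot(model, recipe, output_dir=None, **kwargs):
    """Run one-shot compression, optionally saving the result afterwards."""
    _apply_recipe(model, recipe, **kwargs)

    # Preserve the old behavior: save automatically when an output
    # directory is provided, instead of silently dropping it.
    if output_dir is not None:
        model.save_pretrained(output_dir)
    return model
```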
Can you please update the PR description?
Please add a complete summary of the functionality that was changed, why it was changed, and the testing that was done to verify behaviour after the changes were applied. The current description is unclear and makes the code hard to review.
It also seems like the test cases are still failing, now with a new failure.
@horheynm I have a request: could we split this into two PRs? It will make this PR much easier to review.
The code looks good! Did we run the examples again after the latest changes?
- subclass AutoModel in SparseAutoModel or raise a deprecation error
- add back saving to the oneshot pathway, or at least remove the argument from model_args and remove it from the examples
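A minimal sketch of the first option, assuming the shim subclasses the transformers auto class so legacy scripts keep working while warning; the exact message is illustrative:

```python
import warnings

from transformers import AutoModelForCausalLM

class SparseAutoModelForCausalLM(AutoModelForCausalLM):
    """Deprecated alias kept so existing user scripts do not break."""

    @classmethod
    def from_pretrained(cls, *args, **kwargs):
        warnings.warn(
            "SparseAutoModelForCausalLM is deprecated; "
            "use AutoModelForCausalLM instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return AutoModelForCausalLM.from_pretrained(*args, **kwargs)
```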
@horheynm On the first point, either choice is better than raising a warning and then having the user encounter unexpected behavior or a nonsensical error. Either choice is a one-line change. On the second point, if we are really choosing to deprecate saving in oneshot, can you add a ValueError which points to the proper way to save? As it stands, many users have their own scripts they use to compress, some of which do not save the model manually. With the new update, the compression will occur but oneshot will silently fail to save, leaving users wondering where the saved model went or why it was never saved.
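A sketch of the requested guard, assuming oneshot still accepts the old output_dir argument long enough to reject it loudly; the function name is hypothetical:

```python
def _reject_removed_save_arg(output_dir):
    # Fail loudly instead of compressing and then silently not saving
    if output_dir is not None:
        raise ValueError(
            "oneshot no longer saves the model automatically; call "
            "model.save_pretrained(output_dir) after compression instead."
        )
```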
Should we not also deprecate …?
I have broken this down into example changes and component changes; I will close this PR.
SUMMARY:
The goal is to get rid of SparseAutoModelForCausalLM. Users can still use it, but a deprecation warning pops up. The class removal causes a logic change in saving: the two main saving pathways, FSDP and save_pretrained, are both wrapped in save_pretrained.
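A hedged sketch of how wrapping both pathways through save_pretrained might look; the real hook lives in compressed_tensors_utils.py, and the compression step shown here is a placeholder:

```python
from functools import wraps

def modify_save_pretrained(model):
    """Wrap model.save_pretrained so both the FSDP and the plain pathway
    go through a single hook that writes compression metadata to disk."""
    original = model.save_pretrained

    @wraps(original)
    def save_pretrained_wrapper(save_directory, **kwargs):
        # ... compress the state dict / attach the quantization config ...
        original(save_directory, **kwargs)

    model.save_pretrained = save_pretrained_wrapper
```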
Other changes:
TEST PLAN:
baseline case:
vision:
bert: