[Callbacks] Consolidate Saving Methods #1168
Conversation
Signed-off-by: Kyle Sayers <[email protected]>
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review. Note: this is required to complete the testing suite; please only add the label once the PR is code complete and local testing has been performed.
Having one central location to carry out saving logic sounds great!
Could you map out which pathway currently carries out each piece of saving logic, and how the new changes take over that logic? For example, where does each of the different saving steps now get carried out?
For FSDP, we do currently support it. Once the stage runner is removed, the assumption that the oneshot pathway will not have FSDP support will become valid.
@horheynm All the answers to your questions are in the PR description. W.r.t. FSDP, it is not supported now but will be at a later date (soon).
@dsikka This is ready for merge.
f2411ed
Can you fix conflicts?
Nice!
LGTM.
One clarification question.
## Purpose ##
* Remove pre_initialize_structure to simplify the codebase
* Fix recipe appending when appending a recipe to a model which already has a recipe
* Remove misleading logging messages
```
2025-02-17T17:48:38.477750-0500 | _check_create_state | INFO - State created for compression lifecycle
2025-02-17T17:48:38.478670-0500 | pre_initialize_structure | INFO - Compression lifecycle structure pre-initialized for 0 modifiers
2025-02-17T17:48:38.478836-0500 | pre_initialize_structure | INFO - Compression lifecycle structure pre-initialized for 0 modifiers
```

## Prerequisites ##
* #1168

## Follow-ups ##
* Remove double initialization

## Changes ##
The preinitialization step used to fulfill a few purposes:
* Construct the lifecycle state
  * This is now done by the dataclass directly
    ```python3
    - state: Optional[State] = None
    + state: Optional[State] = field(default_factory=State)
    ```
* Populate state with model and recipe
  * This is now done (and has always been done) by `initialize`
  * Some functions such as `Trainer.init_model` attempt to access the model through the session before `initialize` is called. In these cases, we can pass the model directly
    ```python3
    trainer = Trainer(
    -    model_init=get_session_model,
    +    model_init=lambda: model,
    ```
* Prepend recipes to the recipe.yaml if the model has already been compressed once
  * Move this logic from preinitialization to the `save_pretrained` function
  * Consolidate all save pathways to use the same wrapped method
    ```python3
    def save_pretrained_wrapper(...):
        update_and_save_recipe(model.name_or_path, save_directory)
    ```
* Provide a way for modifiers to influence the model after they have already been applied
  * This can still be enacted via recipe validation, but likely no longer has a use case and shouldn't be done automatically; at most, LLM Compressor should warn if the recipe configuration is invalid / requires modification
* Create quantization modifier on GPTQ
  * This is now done within the `on_initialize` function
  * In the future, this should be done by a high-level recipe validation step
    ```python3
    def on_initialize(...)
    -    self.on_initialize_structure(state, **kwargs)
    +    self._maybe_build_quant_modifier(state.model)
    ```
* Remove the `EventType.order()` method, which is unused
* Extend the `Recipe.simplify_recipe` class method to support strings

## Lifecycle ##
1. `create_session()` (doesn't do much and can be hidden behind `initialize`)
2. `initialize(model=..., recipe=...)`
   1. Maybe `start` modifiers
3. `LifecycleCallback.event(...)`
   1. Maybe `start`/`end` modifiers
4. `finalize()`

## Regression Evaluation ##
Main
```
vllm (pretrained=/home/kyle/llm-compressor/Meta-Llama-3-8B-Instruct-W4A16-G128,dtype=bfloat16,add_bos_token=True), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: 1
|  Tasks   |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|----------|------:|------|-----:|------|---|-----:|---|-----:|
|winogrande|      1|none  |     5|acc   |↑  |0.7482|±  |0.0122|
```
This branch
```
vllm (pretrained=/home/kyle/llm-compressor/Meta-Llama-3-8B-Instruct-W4A16-G128,dtype=bfloat16,add_bos_token=True), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: 1
|  Tasks   |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|----------|------:|------|-----:|------|---|-----:|---|-----:|
|winogrande|      1|none  |     5|acc   |↑  |0.7482|±  |0.0122|
```
---------
Signed-off-by: Kyle Sayers <[email protected]>
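For illustration, a minimal sketch of how the lifecycle listed above might be driven from caller code. The step names (`initialize`, `LifecycleCallback.event(...)`, `finalize`) come from the lifecycle list; the import path and the specific callback helpers are assumptions and may not match the actual llm-compressor API.

```python3
# Sketch only: drives the compression lifecycle described above.
# Import path and callback helper names are assumed, not confirmed by this PR.
from llmcompressor.core import callbacks, finalize, initialize


def compress(model, recipe, calibration_loader):
    # 1. create_session() is omitted; per the list above it can be hidden behind initialize()

    # 2. initialize(model=..., recipe=...) -- may `start` modifiers
    initialize(model=model, recipe=recipe)

    # 3. LifecycleCallback.event(...) -- fired while data flows through the model,
    #    may `start`/`end` modifiers according to the recipe schedule
    for batch in calibration_loader:
        callbacks.batch_start(batch_data=batch)  # assumed helper name
        model(**batch)
        callbacks.batch_end()                    # assumed helper name

    # 4. finalize() -- ends any remaining modifiers and cleans up
    finalize()
```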
## Purpose ##
* Consolidate all saving pathways into the wrapped `save_pretrained` function

## Background ##
All the things needed to be done during saving:

After these changes, (1, 2, 3, 4) will be done within the `save_pretrained` function, and (5) will be the responsibility of the caller. (3) will be implemented by #1160 so as not to conflict with existing logic in pre_init.

All the places where a model is saved are:

After these changes, all of these will be replaced by a single `save_checkpoint` function which calls `save_pretrained` to do all the necessary things.

## Changes ##
* Remove `save_model_and_recipe` in favor of the wrapped `save_pretrained` function
* Replace the existing save pathways with a single `save_checkpoint` function
* Remove `modify_fsdp_model_save_pretrained` and `unwrap_and_export_model`, to be added back in a future release
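To make the consolidation concrete, below is a rough sketch of what a single `save_checkpoint` helper that defers to the wrapped `save_pretrained` could look like. This is illustrative only; the argument names (`processor`, `save_compressed`) and the exact division of responsibilities are assumptions, not the precise signature introduced by this PR.

```python3
# Illustrative sketch only -- not the exact implementation added by this PR.
def save_checkpoint(model, save_directory: str, processor=None, save_compressed: bool = True):
    """Save a (possibly compressed) checkpoint through the wrapped save_pretrained.

    `processor` and `save_compressed` are assumed argument names for illustration.
    """
    # The wrapped save_pretrained is the single place that handles weight
    # compression and the recipe checkpoint, so one call covers the
    # consolidated saving logic.
    model.save_pretrained(save_directory, save_compressed=save_compressed)

    # Anything outside the model itself (e.g. the tokenizer/processor) is
    # saved separately.
    if processor is not None:
        processor.save_pretrained(save_directory)
```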