[Callbacks] Consolidate Saving Methods #1168
Conversation
Signed-off-by: Kyle Sayers <[email protected]>
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review. Note: this is required to complete the testing suite; please only add the label once the PR is code complete and local testing has been performed.
Having one central location to carry out saving logic sounds great!
Could you map out which pathway currently carries out each piece of saving logic, and how the new changes take over that logic? For example, where does each of the different saving steps now get carried out?
For FSDP, we do currently support it. Once the stage runner is removed, the assumption that the oneshot pathway will not have FSDP support will become valid.
@horheynm All the answers to your questions are in the PR description. W.r.t. FSDP, it is not supported now but will be at a later date (soon).
@dsikka This is ready for merge.
f2411ed
Can you fix conflicts?
Nice!
LGTM.
One clarification question.
## Purpose ##
* Remove pre_initialize_structure to simplify the codebase
* Fix recipe appending when appending a recipe to a model which already has a recipe
* Remove misleading logging messages
```
2025-02-17T17:48:38.477750-0500 | _check_create_state | INFO - State created for compression lifecycle
2025-02-17T17:48:38.478670-0500 | pre_initialize_structure | INFO - Compression lifecycle structure pre-initialized for 0 modifiers
2025-02-17T17:48:38.478836-0500 | pre_initialize_structure | INFO - Compression lifecycle structure pre-initialized for 0 modifiers
```

## Prerequisites ##
* #1168

## Follow-ups ##
* Remove double initialization

## Changes ##
The preinitialization step used to fulfill a few purposes:
* Construct the lifecycle state
  * This is now done by the dataclass directly
    ```python3
    - state: Optional[State] = None
    + state: Optional[State] = field(default_factory=State)
    ```
* Populate state with model and recipe
  * This is now done (and has always been done) by `initialize`
  * Some functions such as `Trainer.init_model` attempt to access the model through the session before `initialize` is called. In these cases, we can pass the model directly
    ```python3
    trainer = Trainer(
    -    model_init=get_session_model,
    +    model_init=lambda: model,
    ```
* Prepend recipes to the recipe.yaml if the model has already been compressed once
  * Move this logic from preinitialization to the `save_pretrained` function
  * Consolidate all save pathways to use the same wrapped method
    ```python3
    def save_pretrained_wrapper(...):
        update_and_save_recipe(model.name_or_path, save_directory)
    ```
* Provide a way for modifiers to influence the model after they have already been applied
  * This can still be enacted via recipe validation, but likely no longer has a use case and shouldn't be done automatically; at most, LLM Compressor should warn if the recipe configuration is invalid / requires modification
* Create quantization modifier on GPTQ
  * This is now done within the `on_initialize` function
  * In the future, this should be done by a high-level recipe validation step
    ```python3
    def on_initialize(...)
    -    self.on_initialize_structure(state, **kwargs)
    +    self._maybe_build_quant_modifier(state.model)
    ```
* Remove the `EventType.order()` method, which is unused
* Extend the `Recipe.simplify_recipe` class method to support strings

## Lifecycle ##
1. `create_session()` (doesn't do much and can be hidden behind `initialize`)
2. `initialize(model=..., recipe=...)`
   1. Maybe `start` modifiers
3. `LifecycleCallback.event(...)`
   1. Maybe `start`/`end` modifiers
4. `finalize()`

## Regression Evaluation ##
Main
```
vllm (pretrained=/home/kyle/llm-compressor/Meta-Llama-3-8B-Instruct-W4A16-G128,dtype=bfloat16,add_bos_token=True), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: 1
|  Tasks   |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|----------|------:|------|-----:|------|---|-----:|---|-----:|
|winogrande|      1|none  |     5|acc   |↑  |0.7482|±  |0.0122|
```
This branch
```
vllm (pretrained=/home/kyle/llm-compressor/Meta-Llama-3-8B-Instruct-W4A16-G128,dtype=bfloat16,add_bos_token=True), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: 1
|  Tasks   |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|----------|------:|------|-----:|------|---|-----:|---|-----:|
|winogrande|      1|none  |     5|acc   |↑  |0.7482|±  |0.0122|
```
---------
Signed-off-by: Kyle Sayers <[email protected]>
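For illustration, a minimal sketch of how the lifecycle listed above might be driven from caller code. The step names (`initialize`, `LifecycleCallback.event(...)`, `finalize`) come from the lifecycle list; the import path and the specific callback helpers are assumptions and may not match the actual llm-compressor API.

```python3
# Sketch only: drives the compression lifecycle described above.
# Import path and callback helper names are assumed, not confirmed by this PR.
from llmcompressor.core import callbacks, finalize, initialize


def compress(model, recipe, calibration_loader):
    # 1. create_session() is omitted; per the list above it can be hidden behind initialize()

    # 2. initialize(model=..., recipe=...) -- may `start` modifiers
    initialize(model=model, recipe=recipe)

    # 3. LifecycleCallback.event(...) -- fired while data flows through the model,
    #    may `start`/`end` modifiers according to the recipe schedule
    for batch in calibration_loader:
        callbacks.batch_start(batch_data=batch)  # assumed helper name
        model(**batch)
        callbacks.batch_end()                    # assumed helper name

    # 4. finalize() -- ends any remaining modifiers and cleans up
    finalize()
```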
## Purpose ##
* Consolidate all saving pathways into the wrapped `save_pretrained` function

## Background ##
All the things needed to be done during saving:

After these changes, (1, 2, 3, 4) will be done within the `save_pretrained` function, and (5) will be the responsibility of the caller. (3) will be implemented by #1160 so as not to conflict with existing logic in pre_init.

All the places where a model is saved are:

After these changes, all of these will be replaced by a single `save_checkpoint` function which calls `save_pretrained` to do all the necessary things.

## Changes ##
* Remove `save_model_and_recipe` in favor of the wrapped `save_pretrained` function
* Replace the existing save pathways with a single `save_checkpoint` function
* Remove `modify_fsdp_model_save_pretrained` and `unwrap_and_export_model`, to be added back in a future release
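To make the consolidation concrete, below is a rough sketch of what a single `save_checkpoint` helper that defers to the wrapped `save_pretrained` could look like. This is illustrative only; the argument names (`processor`, `save_compressed`) and the exact division of responsibilities are assumptions, not the precise signature introduced by this PR.

```python3
# Illustrative sketch only -- not the exact implementation added by this PR.
def save_checkpoint(model, save_directory: str, processor=None, save_compressed: bool = True):
    """Save a (possibly compressed) checkpoint through the wrapped save_pretrained.

    `processor` and `save_compressed` are assumed argument names for illustration.
    """
    # The wrapped save_pretrained is the single place that handles weight
    # compression and the recipe checkpoint, so one call covers the
    # consolidated saving logic.
    model.save_pretrained(save_directory, save_compressed=save_compressed)

    # Anything outside the model itself (e.g. the tokenizer/processor) is
    # saved separately.
    if processor is not None:
        processor.save_pretrained(save_directory)
```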