
[Callbacks] Remove pre_initialize_structure #1160

Merged
merged 29 commits into main on Feb 26, 2025

Conversation

kylesayrs
Collaborator

@kylesayrs kylesayrs commented Feb 17, 2025

Purpose

  • Remove pre_initialize_structure to simplify codebase
  • Fix recipe appending when appending a recipe to a model that already has a recipe
  • Remove misleading logging messages
2025-02-17T17:48:38.477750-0500 | _check_create_state | INFO - State created for compression lifecycle
2025-02-17T17:48:38.478670-0500 | pre_initialize_structure | INFO - Compression lifecycle structure pre-initialized for 0 modifiers
2025-02-17T17:48:38.478836-0500 | pre_initialize_structure | INFO - Compression lifecycle structure pre-initialized for 0 modifiers

Prerequisites

Follow-ups

  • Remove double initialization

Changes

The preinitialization step used to fulfill a few purposes:

  • Construct the lifecycle state
    • This is now done by the dataclass directly (see the sketch after this list)
- state: Optional[State] = None
+ state: Optional[State] = field(default_factory=State)
  • Populate state with model and recipe
    • This is now done (and has always been done) by initialize
    • Some functions such as Trainer.init_model attempt to access the model through the session before initialize is called. In these cases, we can pass the model directly
trainer = Trainer(
-     model_init=get_session_model,
+     model_init=lambda: model,
  • Prepend recipes to the recipe.yaml if the model has already been compressed once
    • Move this logic from preinitialization to the save_pretrained function
    • Consolidate all save pathways to use the same wrapped method
def save_pretrained_wrapper(...):
    update_and_save_recipe(model.name_or_path, save_directory)
  • Provide a way for modifiers to influence the model after they have already been applied

    • This can still be enacted via recipe validation, but it likely no longer has a use case and shouldn't be done automatically; at most, LLM Compressor should warn if the recipe configuration is invalid or requires modification
  • Create quantization modifier on GPTQ

    • This is now done within the on_initialize function
    • In the future, this should be done by a high-level recipe validation step
def on_initialize(...):
-     self.on_initialize_structure(state, **kwargs)
+     self._maybe_build_quant_modifier(state.model)
  • Remove EventType.order() method which is unused

  • Extend the Recipe.simplify_recipe class method to support strings
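
To make the first change above concrete, here is a minimal sketch of the dataclass-default pattern; the class and field names below are simplified stand-ins for the real State and lifecycle classes:

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class State:
    # Holds the model, recipe, and other data shared across the lifecycle
    model: Optional[object] = None
    recipe: Optional[object] = None

@dataclass
class CompressionLifecycle:
    # Previously `state: Optional[State] = None`, which required a separate
    # pre-initialization step to construct it. With a default factory, every
    # lifecycle starts with a usable (empty) State, and initialize() only
    # has to populate it with the model and recipe.
    state: State = field(default_factory=State)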

Lifecycle

  1. create_session() (doesn't do much and can be hidden behind initialize)
  2. initialize(model=..., recipe=...)
    1. Maybe start modifiers
  3. LifecycleCallback.event(...)
    1. Maybe start/end modifiers
  4. finalize()
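
For reference, a hedged sketch of what this lifecycle looks like to a caller. The function names come from the list above, but the import path and the callback helpers are assumptions rather than confirmed llm-compressor API:

# Sketch only: the import path and callback helper names are assumptions
from llmcompressor.core import callbacks, create_session, finalize, initialize

def run_compression(model, recipe, dataloader):
    create_session()                        # 1. thin setup; could be hidden behind initialize
    initialize(model=model, recipe=recipe)  # 2. populates State, maybe starts modifiers

    for batch in dataloader:                # 3. events may start/end modifiers
        callbacks.batch_start(batch_data=batch)
        # ... forward / backward / optimizer step ...
        callbacks.batch_end()

    finalize()                              # 4. ends any remaining modifiers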

Regression Evaluation

Main

vllm (pretrained=/home/kyle/llm-compressor/Meta-Llama-3-8B-Instruct-W4A16-G128,dtype=bfloat16,add_bos_token=True), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: 1
|  Tasks   |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|----------|------:|------|-----:|------|---|-----:|---|-----:|
|winogrande|      1|none  |     5|acc   |↑  |0.7482|±  |0.0122|

This branch

vllm (pretrained=/home/kyle/llm-compressor/Meta-Llama-3-8B-Instruct-W4A16-G128,dtype=bfloat16,add_bos_token=True), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: 1
|  Tasks   |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|----------|------:|------|-----:|------|---|-----:|---|-----:|
|winogrande|      1|none  |     5|acc   |↑  |0.7482|±  |0.0122|


👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.

@kylesayrs kylesayrs marked this pull request as draft February 17, 2025 19:29
@kylesayrs kylesayrs added the ready label Feb 17, 2025
@kylesayrs kylesayrs marked this pull request as ready for review February 17, 2025 20:47
@kylesayrs kylesayrs self-assigned this Feb 17, 2025
@kylesayrs kylesayrs marked this pull request as draft February 17, 2025 23:28
@kylesayrs kylesayrs removed the ready label Feb 18, 2025
@kylesayrs kylesayrs added the ready label Feb 18, 2025
@kylesayrs kylesayrs marked this pull request as ready for review February 18, 2025 05:09
@kylesayrs kylesayrs changed the base branch from main to kylesayrs/consolidate-saving February 18, 2025 21:18
horheynm previously approved these changes Feb 20, 2025
Collaborator

@horheynm horheynm left a comment


Good job XD

dsikka added a commit that referenced this pull request Feb 25, 2025
## Purpose ##
* Simplify all methods of saving into one point, namely the wrapped
`save_pretrained` function
* Precursor to #1160
* Needed for having a single point for saving on top of existing recipes

## Background ## 
All the things that need to be done during saving:
1. Save the model weights, potentially compressed
2. Save the processor
3. Update the recipe checkpoint
4. Copy any necessary python files from the model cache
5. Only save on the main process

After these changes, (1, 2, 3, 4) will be done within the
`save_pretrained` function, and (5) will be the responsibility of the
caller. (3) will be implemented by #1160 so as not to conflict with
existing logic in pre_init

All of the places where a model is saved are:
* If an output dir is specified, at the end of the main function
* Between stages of the stage runner
* Between epochs of the HF Trainer
* By the user after oneshot/training completes

After these changes, all of these will be replaced by a single
`save_checkpoint` function which calls `save_pretrained` to do all the
necessary things.

## Changes ##
* Remove `save_model_and_recipe`
  * Saving recipes is now done by `save_pretrained` function
* Implement `save_checkpoint`
  * Single entrypoint for saving a model and its processor
  * Performs actions (1, 2, 4)
* Replace all locations where a model is saved with `save_checkpoint`
  * All applicable callers now handle (5) by only saving on the main process
* Remove support for `modify_fsdp_model_save_pretrained` and
`unwrap_and_export_model`, to be added back in a future release

---------

Signed-off-by: Kyle Sayers <[email protected]>
Co-authored-by: Dipika Sikka <[email protected]>
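
The `save_checkpoint` entrypoint described in the commit above might look roughly like the following sketch (illustrative only, not the actual implementation):

def save_checkpoint(save_path, model, processor):
    # (1, 2) save the weights (potentially compressed) and the processor through
    # the wrapped save_pretrained, which also (3) updates the recipe checkpoint
    # and (4) copies any necessary python files from the model cache
    model.save_pretrained(save_path)
    processor.save_pretrained(save_path)
    # (5) saving only on the main process remains the caller's responsibility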
Base automatically changed from kylesayrs/consolidate-saving to main February 25, 2025 15:46
@dsikka dsikka dismissed horheynm’s stale review February 25, 2025 15:46

The base branch was changed.

Collaborator

@brian-dellabetta brian-dellabetta left a comment


looking good, but i'm going off the assumption that we don't actually need any of this deleted code

@kylesayrs
Collaborator Author

kylesayrs commented Feb 25, 2025

@brian-dellabetta The PR description lists all of the functionality that was needed by preinitialize and where that functionality now lives

Collaborator

@dsikka dsikka left a comment


great work!

@dsikka dsikka merged commit a88b72b into main Feb 26, 2025
7 checks passed
@dsikka dsikka deleted the kylesayrs/remove-preinitialize-structure branch February 26, 2025 20:37
kylesayrs added a commit that referenced this pull request Mar 11, 2025
## Purpose ##
* Fix staged 2of4 example

## Background ##
* When #1160 landed, it introduced a bug in the recipe container which
meant that the recipe was not recompiled after `append`ing. This caused
sgpt to initialize twice and gptq to never initialize, leading to a
sparsity-only quantization config
* At some point, a change was introduced which causes previous stages
to be reconstructed after recipe recompilation. This means that
without resetting the session in between stages, previous stages will
initialize twice.
* In order to avoid this issue, this PR introduces `session.reset()` in
between stages
* This change has the consequence of creating `recipe.yaml` files which
do not have the full recipe history. However, I believe this is
acceptable for the time being, as the stage runner and this workflow
will be removed in the next release.

---------

Signed-off-by: Kyle Sayers <[email protected]>
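
The fix described in the commit above boils down to calling `session.reset()` between stages. A minimal sketch under stated assumptions; `get_session()`, `recipe.stages`, and the import path are illustrative placeholders:

from llmcompressor.core import finalize, initialize

def run_stages(model, recipe, get_session):
    for stage in recipe.stages:
        initialize(model=model, recipe=stage)
        # ... run oneshot / training for this stage ...
        finalize()
        # reset between stages so that earlier stages are not re-initialized
        # when the recipe is recompiled by a later stage
        get_session().reset()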
Labels
ready When a PR is ready for review
4 participants