
fixing reproducibility of lmeval tests #1220

Merged: 13 commits into main from bdellabe/lmeval-test-bugfixes on Mar 11, 2025
Conversation

brian-dellabetta
Collaborator

@brian-dellabetta brian-dellabetta commented Mar 4, 2025

SUMMARY:
The LM Eval weekly tests are failing; this resolves two issues:

  1. Installs pillow, which I had locally through vllm but which is not installed as part of llm-compressor.
  2. Adds a random seed to the lmeval tests, which, after a good amount of testing, seems to resolve the issue. The nondeterminism occurs entirely during calibration/quantization; lm-eval behavior is deterministic because it always sets a seed. It is a bit surprising that the seed can have such a drastic effect, but these are 2B vision-language models on a difficult multiple-choice dataset, scoring not too far above random guessing.

TEST PLAN:
no new src code
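The seeding fix can be sketched roughly as below. The function name and the exact set of RNGs are illustrative, not the actual test code; the real tests would also seed numpy (`np.random.seed`) and torch (`torch.manual_seed`), which drive sampling during calibration/quantization.

```python
import random

def set_reproducibility_seed(seed: int = 42) -> None:
    # Illustrative sketch: seed Python's RNG so repeated runs draw the
    # same calibration samples. A real test would also seed numpy/torch.
    random.seed(seed)

# Re-seeding yields identical draws, making a flaky run repeatable.
set_reproducibility_seed(1234)
first = [random.random() for _ in range(3)]
set_reproducibility_seed(1234)
second = [random.random() for _ in range(3)]
assert first == second
```

Pinning the seed removes calibration-order variance, so any remaining accuracy drift in the weekly tests points at a real regression rather than sampling noise.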


github-actions bot commented Mar 4, 2025

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.

dsikka
dsikka previously approved these changes Mar 4, 2025
@dsikka dsikka marked this pull request as ready for review March 4, 2025 01:25
@dsikka dsikka added the ready When a PR is ready for review label Mar 4, 2025
@brian-dellabetta brian-dellabetta force-pushed the bdellabe/lmeval-test-bugfixes branch from 87dcc4c to 1f2ce00 Compare March 5, 2025 22:40
@brian-dellabetta brian-dellabetta requested a review from dsikka March 5, 2025 22:40
@brian-dellabetta brian-dellabetta changed the title attempts to fix lmeval failing tests fixing reprdocubility of lmeval tests Mar 5, 2025
@brian-dellabetta brian-dellabetta force-pushed the bdellabe/lmeval-test-bugfixes branch from 3429cc7 to b91cd7b Compare March 7, 2025 20:00
horheynm
horheynm previously approved these changes Mar 7, 2025
rahul-tuli
rahul-tuli previously approved these changes Mar 7, 2025
Collaborator

@rahul-tuli rahul-tuli left a comment

GG!

@dsikka dsikka enabled auto-merge (squash) March 7, 2025 20:10
Collaborator

@dsikka dsikka left a comment

rebase?

@brian-dellabetta brian-dellabetta changed the title fixing reprdocubility of lmeval tests fixing reproducibility of lmeval tests Mar 7, 2025
@brian-dellabetta brian-dellabetta requested a review from dsikka March 7, 2025 20:25
@brian-dellabetta brian-dellabetta dismissed stale reviews from rahul-tuli and horheynm via 2d2e220 March 7, 2025 20:27
@brian-dellabetta brian-dellabetta force-pushed the bdellabe/lmeval-test-bugfixes branch from b91cd7b to 2d2e220 Compare March 7, 2025 20:27
kylesayrs
kylesayrs previously approved these changes Mar 7, 2025
aman2304 and others added 6 commits March 10, 2025 22:30
SUMMARY:
Fixed a bug in enabling/disabling logging and clear-loggers behavior. Previously, any value on the relevant environment variables would disable logging. Now, we explicitly check for `true`.

TEST PLAN:
Added unit tests for enabling logging.
`make test` passes
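The described fix can be sketched as a strict truthiness check. The variable and function names here are hypothetical, not the actual llm-compressor API:

```python
import os

def env_flag_is_true(name: str) -> bool:
    # Only the literal string "true" (case-insensitive) enables the flag.
    # The previous bug: any set value, even "false" or "0", counted as set.
    return os.environ.get(name, "").strip().lower() == "true"

os.environ["EXAMPLE_DISABLE_LOGGING"] = "false"
assert env_flag_is_true("EXAMPLE_DISABLE_LOGGING") is False

os.environ["EXAMPLE_DISABLE_LOGGING"] = "TRUE"
assert env_flag_is_true("EXAMPLE_DISABLE_LOGGING") is True
```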

---------

Signed-off-by: Aman Gupta <[email protected]>
Co-authored-by: Dipika Sikka <[email protected]>
Signed-off-by: Brian Dellabetta <[email protected]>
…ot (#1212)

Order of reviews:
#1206
#1207
#1209
#1212  <-- Here
#1214

SUMMARY:
* Move the preprocessing and postprocessing logic out of
`src/llmcompressor/transformers/finetune/text_generation.py` and into
`src/llmcompressor/entrypoints/utils.py`

TEST PLAN:
Pass tests

Signed-off-by: Brian Dellabetta <[email protected]>
SUMMARY:
The current README shows which algorithms we support and how to run them. However, it is still hard for a user to understand when to use which. Add more info on which optimization to apply based on the user's use case and hardware.

TEST PLAN:
N/A

Signed-off-by: Brian Dellabetta <[email protected]>
## Purpose ##
* Simplify the modifier lifecycle by removing the ability for modifiers
to affect the model after the modifier's `end` event
* This allows the `on_event` method to be removed in a future change

## Background ##
* The `leave_enabled` option was originally intended as a shortcut to
simplify recipes which used magnitude pruning during the iterative
pruning, then needed the masks to stay enabled during stabilization SFT
* This change proposes making the recipe clearer by requiring the
ConstantPruningModifier after the MagnitudePruningModifier becomes
inactive

## Changes ##
* Remove `MagnitudePruningModifier.leave_enabled` with a deprecation
warning

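The recipe change can be sketched as follows. Stage and field names are illustrative of the recipe format, not copied from a real recipe:

```yaml
# Before: `leave_enabled: true` kept magnitude-pruning masks active past `end`.
# After: the frozen masks are stated explicitly with a ConstantPruningModifier.
pruning_stage:
  pruning_modifiers:
    MagnitudePruningModifier:
      start: 0
      end: 2
      init_sparsity: 0.0
      final_sparsity: 0.5
finetuning_stage:
  pruning_modifiers:
    ConstantPruningModifier:
      start: 2   # holds the masks produced above constant during stabilization SFT
```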
Signed-off-by: Kyle Sayers <[email protected]>
Signed-off-by: Brian Dellabetta <[email protected]>
brian-dellabetta and others added 4 commits March 10, 2025 22:30
Signed-off-by: Brian Dellabetta <[email protected]>
Co-authored-by: Kyle Sayers <[email protected]>
Signed-off-by: Brian Dellabetta <[email protected]>
## Purpose ##
* Fix tests which break as a result of
https://github.com/vllm-project/llm-compressor/pull/new/kylesayrs/replace-self-hosted

## Changes ##
* Use self-hosted model stub

Signed-off-by: Kyle Sayers <[email protected]>
Signed-off-by: Brian Dellabetta <[email protected]>
@brian-dellabetta brian-dellabetta force-pushed the bdellabe/lmeval-test-bugfixes branch from fffabf0 to 341299f Compare March 10, 2025 22:30
@dsikka dsikka merged commit c1fe865 into main Mar 11, 2025
8 checks passed
@dsikka dsikka deleted the bdellabe/lmeval-test-bugfixes branch March 11, 2025 14:44
Labels: ready (When a PR is ready for review)
6 participants