
fixing reproducibility of lmeval tests #1220

Merged: 13 commits into main from bdellabe/lmeval-test-bugfixes on Mar 11, 2025
Conversation

brian-dellabetta
Collaborator

@brian-dellabetta brian-dellabetta commented Mar 4, 2025

SUMMARY:
The LM Eval weekly tests are failing; this resolves two issues:

  1. Installs pillow, which I had locally through vllm but which is not installed as part of llm-compressor.
  2. Adds a random seed to the lmeval tests, which, after a good amount of testing, seems to resolve the issue. The nondeterminism occurs entirely during calibration/quantization; lm-eval behavior is deterministic because it always sets a seed. It is a bit surprising that the seed can have such a drastic effect, but these are 2B vision-language models on a difficult multiple-choice dataset, scoring not too far above random guessing.

TEST PLAN:
no new src code
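The seeding fix can be sketched roughly as below. The function name and the exact set of RNGs are illustrative, not the actual test code; the real tests would also seed numpy (`np.random.seed`) and torch (`torch.manual_seed`), which drive sampling during calibration/quantization.

```python
import random

def set_reproducibility_seed(seed: int = 42) -> None:
    # Illustrative sketch: seed Python's RNG so repeated runs draw the
    # same calibration samples. A real test would also seed numpy/torch.
    random.seed(seed)

# Re-seeding yields identical draws, making a flaky run repeatable.
set_reproducibility_seed(1234)
first = [random.random() for _ in range(3)]
set_reproducibility_seed(1234)
second = [random.random() for _ in range(3)]
assert first == second
```

Pinning the seed removes calibration-order variance, so any remaining accuracy drift in the weekly tests points at a real regression rather than sampling noise.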


github-actions bot commented Mar 4, 2025

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.

dsikka
dsikka previously approved these changes Mar 4, 2025
@dsikka dsikka marked this pull request as ready for review March 4, 2025 01:25
@dsikka dsikka added the ready When a PR is ready for review label Mar 4, 2025
@brian-dellabetta brian-dellabetta force-pushed the bdellabe/lmeval-test-bugfixes branch from 87dcc4c to 1f2ce00 Compare March 5, 2025 22:40
@brian-dellabetta brian-dellabetta requested a review from dsikka March 5, 2025 22:40
@brian-dellabetta brian-dellabetta changed the title attempts to fix lmeval failing tests fixing reprdocubility of lmeval tests Mar 5, 2025
@brian-dellabetta brian-dellabetta force-pushed the bdellabe/lmeval-test-bugfixes branch from 3429cc7 to b91cd7b Compare March 7, 2025 20:00
horheynm
horheynm previously approved these changes Mar 7, 2025
rahul-tuli
rahul-tuli previously approved these changes Mar 7, 2025
Collaborator

@rahul-tuli rahul-tuli left a comment

GG!

@dsikka dsikka enabled auto-merge (squash) March 7, 2025 20:10
Collaborator

@dsikka dsikka left a comment

rebase?

@brian-dellabetta brian-dellabetta changed the title fixing reprdocubility of lmeval tests fixing reproducibility of lmeval tests Mar 7, 2025
@brian-dellabetta brian-dellabetta requested a review from dsikka March 7, 2025 20:25
@brian-dellabetta brian-dellabetta dismissed stale reviews from rahul-tuli and horheynm via 2d2e220 March 7, 2025 20:27
@brian-dellabetta brian-dellabetta force-pushed the bdellabe/lmeval-test-bugfixes branch from b91cd7b to 2d2e220 Compare March 7, 2025 20:27
kylesayrs
kylesayrs previously approved these changes Mar 7, 2025
aman2304 and others added 6 commits March 10, 2025 22:30
SUMMARY:
Fixed a bug in enabling/disabling logging and clear-loggers behavior. Previously, any value on the relevant environment variables would disable logging. Now, we explicitly check for `true`.

TEST PLAN:
Added unit tests for enabling logging.
`make test` passes
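The described fix can be sketched as a strict truthiness check. The variable and function names here are hypothetical, not the actual llm-compressor API:

```python
import os

def env_flag_is_true(name: str) -> bool:
    # Only the literal string "true" (case-insensitive) enables the flag.
    # The previous bug: any set value, even "false" or "0", counted as set.
    return os.environ.get(name, "").strip().lower() == "true"

os.environ["EXAMPLE_DISABLE_LOGGING"] = "false"
assert env_flag_is_true("EXAMPLE_DISABLE_LOGGING") is False

os.environ["EXAMPLE_DISABLE_LOGGING"] = "TRUE"
assert env_flag_is_true("EXAMPLE_DISABLE_LOGGING") is True
```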

---------

Signed-off-by: Aman Gupta <[email protected]>
Co-authored-by: Dipika Sikka <[email protected]>
Signed-off-by: Brian Dellabetta <[email protected]>
…ot (#1212)

Order of reviews:
#1206
#1207
#1209
#1212  <-- Here
#1214

SUMMARY:
* Move the preprocessing and postprocessing logic out of
`src/llmcompressor/transformers/finetune/text_generation.py` and into
`src/llmcompressor/entrypoints/utils.py`

TEST PLAN:
Pass tests

Signed-off-by: Brian Dellabetta <[email protected]>
SUMMARY:
The current README shows which algorithms we support and how to run them. However, it is still hard for a user to understand when to use which. Add more info on which optimization to apply based on the user's use case and hardware.

TEST PLAN:
N/A

Signed-off-by: Brian Dellabetta <[email protected]>
## Purpose ##
* Simplify the modifier lifecycle by removing the ability for modifiers
to affect the model after the modifier's `end` event
* This allows the `on_event` method to be removed in a future change

## Background ##
* The `leave_enabled` option was originally intended as a shortcut to
simplify recipes which used magnitude pruning during the iterative
pruning, then needed the masks to stay enabled during stabilization SFT
* This change proposes making the recipe clearer by requiring the
ConstantPruningModifier after the MagnitudePruningModifier becomes
inactive

## Changes ##
* Remove `MagnitudePruningModifier.leave_enabled` with a deprecation
warning

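The recipe change can be sketched as follows. Stage and field names are illustrative of the recipe format, not copied from a real recipe:

```yaml
# Before: `leave_enabled: true` kept magnitude-pruning masks active past `end`.
# After: the frozen masks are stated explicitly with a ConstantPruningModifier.
pruning_stage:
  pruning_modifiers:
    MagnitudePruningModifier:
      start: 0
      end: 2
      init_sparsity: 0.0
      final_sparsity: 0.5
finetuning_stage:
  pruning_modifiers:
    ConstantPruningModifier:
      start: 2   # holds the masks produced above constant during stabilization SFT
```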
Signed-off-by: Kyle Sayers <[email protected]>
Signed-off-by: Brian Dellabetta <[email protected]>
brian-dellabetta and others added 4 commits March 10, 2025 22:30
Signed-off-by: Brian Dellabetta <[email protected]>
Co-authored-by: Kyle Sayers <[email protected]>
Signed-off-by: Brian Dellabetta <[email protected]>
## Purpose ##
* Fix tests which break as a result of
https://github.com/vllm-project/llm-compressor/pull/new/kylesayrs/replace-self-hosted

## Changes ##
* Use self-hosted model stub

Signed-off-by: Kyle Sayers <[email protected]>
Signed-off-by: Brian Dellabetta <[email protected]>
@brian-dellabetta brian-dellabetta force-pushed the bdellabe/lmeval-test-bugfixes branch from fffabf0 to 341299f Compare March 10, 2025 22:30
@dsikka dsikka merged commit c1fe865 into main Mar 11, 2025
8 checks passed
@dsikka dsikka deleted the bdellabe/lmeval-test-bugfixes branch March 11, 2025 14:44
Labels: ready (When a PR is ready for review)
6 participants