
[Bugfix] Fix saving offloaded state dict #172

Merged: 19 commits from kylesayrs/fix-offloaded-saving into main on Oct 4, 2024

Conversation

@kylesayrs (Collaborator) commented Sep 12, 2024

Purpose

  • Fix a bug with saving the entire state dict of an offloaded model, even when the model does not have a compressor
  • Properly infer sparsity and quantization of offloaded models

Changes

  • Load the entire offloaded state dict before checking for a compressor and before inferring global sparsity (a minimal sketch of this ordering follows this list)
  • Remove the explicit override of the save_safetensors kwarg, since the original save_pretrained function already defaults it to True
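
For illustration, here is a minimal sketch of the ordering described above, assuming a hypothetical gather_offloaded_state_dict helper and a generic compressor object; it is not the actual llm-compressor implementation.

```python
# Hedged sketch of the save ordering described in the Changes list.
# gather_offloaded_state_dict and the compressor interface are hypothetical
# stand-ins, not the actual llm-compressor API.
from typing import Optional


def gather_offloaded_state_dict(model):
    # Placeholder for whatever mechanism rehydrates offloaded weights
    # (e.g. accelerate offload hooks). For demonstration it simply copies
    # every tensor to CPU, which only works when nothing is on "meta".
    return {name: param.to("cpu") for name, param in model.state_dict().items()}


def save_offloaded_model(model, save_directory: str, compressor: Optional[object] = None):
    # 1. Materialize the full state dict first, so no parameter is left as a
    #    storage-less meta tensor when serialization begins.
    state_dict = gather_offloaded_state_dict(model)

    # 2. Only then inspect the weights: infer global sparsity/quantization and
    #    apply a compressor if one is configured.
    if compressor is not None:
        state_dict = compressor.compress(model, state_dict)

    # 3. Hand the materialized state dict to save_pretrained; it already
    #    defaults to safetensors, so no explicit save_safetensors override is needed.
    model.save_pretrained(save_directory, state_dict=state_dict)
```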

Testing

  • Added a test that saves an offloaded model without compression, a case that previously raised an error. The test fails on main but passes on this branch (a minimal reproduction of the underlying meta-tensor error follows the test output below)

Main

FAILED tests/llmcompressor/transformers/sparsification/test_compress_tensor_utils.py::test_model_reload[True-torch_dtype3-False-cpu] - NotImplementedError: Cannot copy out of meta tensor; no data!
FAILED tests/llmcompressor/transformers/sparsification/test_compress_tensor_utils.py::test_model_reload[True-torch_dtype5-False-cuda:0] - NotImplementedError: Cannot copy out of meta tensor; no data!
FAILED tests/llmcompressor/transformers/sparsification/test_compress_tensor_utils.py::test_model_reload[True-torch_dtype6-True-cuda:0] - NotImplementedError: Cannot copy out of meta tensor; no data!
FAILED tests/llmcompressor/transformers/sparsification/test_compress_tensor_utils.py::test_model_reload[True-torch_dtype7-True-cuda:0] - NotImplementedError: Cannot copy out of meta tensor; no data!
======================================= 4 failed, 4 passed, 12 deselected in 9.03s =======================================

This branch

============================================ 8 passed, 12 deselected in 8.89s ============================================
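
For reference, the NotImplementedError above is plain PyTorch behavior rather than anything specific to this repository: a tensor on the meta device records only shape and dtype, so any attempt to copy its (nonexistent) data fails. A minimal reproduction:

```python
# Minimal reproduction of the meta-tensor error seen in the failing tests above.
import torch

meta_weight = torch.empty(4, 4, device="meta")  # shape/dtype only, no storage

try:
    meta_weight.to("cpu")  # the same copy a naive save path performs
except NotImplementedError as err:
    print(err)  # e.g. "Cannot copy out of meta tensor; no data!"
```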


👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

rahul-tuli previously approved these changes Sep 12, 2024
@kylesayrs kylesayrs marked this pull request as draft September 13, 2024 21:42
@kylesayrs (Collaborator, Author) commented:

I'm pretty sure this is ready to merge, but I want to fully test checkpoint loading first

@kylesayrs kylesayrs changed the title Fix saving offloaded state dict [bugfix] Fix saving offloaded state dict Sep 24, 2024
@kylesayrs kylesayrs changed the title [bugfix] Fix saving offloaded state dict [Bugfix] Fix saving offloaded state dict Sep 24, 2024
@kylesayrs kylesayrs self-assigned this Sep 24, 2024
@kylesayrs kylesayrs marked this pull request as ready for review September 25, 2024 18:54
horheynm previously approved these changes Sep 26, 2024
dsikka previously approved these changes Sep 27, 2024
@dsikka (Collaborator) left a comment:
LGTM. Just one confirmation question

@kylesayrs kylesayrs dismissed stale reviews from dsikka and horheynm via 00791b6 September 27, 2024 21:13
@dsikka (Collaborator) commented Sep 30, 2024

@kylesayrs can you resolve the conflict?

@dsikka (Collaborator) commented Oct 1, 2024

@kylesayrs Seems like some of the errors for the sparsification tests are new?

@kylesayrs (Collaborator, Author) commented Oct 1, 2024

I addressed the CUDA errors by separating out the GPU-dependent tests.
There are other new failures, likely caused by upstream CT (compressed-tensors) changes, but they are unrelated and outside the scope of this PR, which only aims to fix the newly added test_model_reload tests and to lay the groundwork for other test fixes

@rahul-tuli (Collaborator) left a review comment.
@mgoin merged commit 91eed2f into main Oct 4, 2024
6 of 7 checks passed
@mgoin deleted the kylesayrs/fix-offloaded-saving branch October 4, 2024 18:14