[Bug]: int8 w8a8 quantization failed with oneshot #1880

@Inoryu0624

โš™๏ธ Your current environment

The output of python collect_env.py
Your output of `python collect_env.py` here

๐Ÿ› Describe the bug

When using llm-compressor for int8 w8a8 quantization, an error occurred. I used pdb to trace it, but pdb cannot show the actual code that raises the error below:

Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/llmcompressor/pipelines/sequential/helpers.py", line 73, in forward
outputs = forward_fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "", line 13, in forward
File "Shennong3Model_8770701147069_autowrapped", line 61, in wrapped_1
ValueError: too many values to unpack (expected 2)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/data/ayakawang/ocr-010/quant/quant_nv.py", line 77, in
oneshot(
File "/usr/local/lib/python3.11/site-packages/compressed_tensors/utils/helpers.py", line 194, in wrapped
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/llmcompressor/transformers/finetune/text_generation.py", line 33, in oneshot
oneshot(**kwargs)
File "/usr/local/lib/python3.11/site-packages/llmcompressor/entrypoints/oneshot.py", line 319, in oneshot
one_shot()
File "/usr/local/lib/python3.11/site-packages/llmcompressor/entrypoints/oneshot.py", line 149, in call
self.apply_recipe_modifiers(
File "/usr/local/lib/python3.11/site-packages/llmcompressor/entrypoints/oneshot.py", line 192, in apply_recipe_modifiers
pipeline(
File "/usr/local/lib/python3.11/site-packages/llmcompressor/pipelines/independent/pipeline.py", line 45, in call
pipeline(model, dataloader, dataset_args)
File "/usr/local/lib/python3.11/site-packages/llmcompressor/pipelines/sequential/pipeline.py", line 104, in call
subgraph.forward(model, **inputs)
File "/usr/local/lib/python3.11/site-packages/llmcompressor/pipelines/sequential/helpers.py", line 75, in forward
raise RuntimeError(
RuntimeError: Raised an exception during execution of the following code:
1
2 torch.fx._symbolic_trace.wrap("v3_modeling_mllm_shennong3_wrapped_1")
3 torch.fx._symbolic_trace.wrap("v3_modeling_mllm_shennong3_wrapped_5")
4 torch.fx._symbolic_trace.wrap("v3_modeling_mllm_shennong3_wrapped_0")
5 torch.fx._symbolic_trace.wrap("v3_modeling_mllm_shennong3_wrapped_6")
6 torch.fx._symbolic_trace.wrap("v3_modeling_mllm_shennong3_wrapped_3")
7 torch.fx._symbolic_trace.wrap("v3_modeling_mllm_shennong3_wrapped_8")
8 torch.fx._symbolic_trace.wrap("v3_modeling_mllm_shennong3_wrapped_4")
9 torch.fx._symbolic_trace.wrap("v3_modeling_mllm_shennong3_wrapped_7")
10 torch.fx._symbolic_trace.wrap("v3_modeling_mllm_shennong3_wrapped_9")
11
12 def forward(self, input_ids : torch.Tensor, input_images : torch.Tensor):
13 wrapped_1 = v3_modeling_mllm_shennong3_wrapped_1(input_ids, None)
14 wrapped_5 = v3_modeling_mllm_shennong3_wrapped_5(input_ids, None)
15 wrapped_0 = v3_modeling_mllm_shennong3_wrapped_0(input_ids, input_images, None, None); input_images = None
16 getitem_5 = wrapped_1[0]; getitem_5 = None
17 getitem_6 = wrapped_1[1]
18 getitem_7 = wrapped_1[2]; wrapped_1 = None
19 getitem_13 = wrapped_5[0]; wrapped_5 = None
20 getitem = wrapped_0[0]; getitem = None
21 getitem_1 = wrapped_0[1]; getitem_1 = None
22 getitem_2 = wrapped_0[2]
23 getitem_3 = wrapped_0[3]
24 getitem_4 = wrapped_0[4]; wrapped_0 = getitem_4 = None
25 wrapped_6 = v3_modeling_mllm_shennong3_wrapped_6(None, getitem_6, False)
26 wrapped_3 = v3_modeling_mllm_shennong3_wrapped_3(None, 0, getitem_7, False)
27 wrapped_8 = v3_modeling_mllm_shennong3_wrapped_8(getitem_2, getitem_13, getitem_3, input_ids); getitem_2 = getitem_3 = None
28 getitem_14 = wrapped_6[0]; wrapped_6 = getitem_14 = None
29 getitem_8 = wrapped_3[0]
30 getitem_9 = wrapped_3[1]
31 getitem_10 = wrapped_3[2]; wrapped_3 = None
32 getitem_16 = wrapped_8[0]; getitem_16 = None
33 getitem_17 = wrapped_8[1]; getitem_17 = None
34 getitem_18 = wrapped_8[2]; getitem_18 = None
35 getitem_19 = wrapped_8[3]
36 getitem_20 = wrapped_8[4]; getitem_20 = None
37 getitem_21 = wrapped_8[5]; getitem_21 = None
38 getitem_22 = wrapped_8[6]; wrapped_8 = getitem_22 = None
39 wrapped_4 = v3_modeling_mllm_shennong3_wrapped_4(input_ids, None, getitem_9, None, getitem_7); input_ids = None
40 wrapped_7 = v3_modeling_mllm_shennong3_wrapped_7(None, getitem_6, getitem_13, False, getitem_9, getitem_7); getitem_6 = getitem_13 = getitem_9 = getitem_7 = None
41 wrapped_9 = v3_modeling_mllm_shennong3_wrapped_9(None, getitem_19, False)
42 getitem_11 = wrapped_4[0]; getitem_11 = None
43 getitem_12 = wrapped_4[1]; wrapped_4 = None
44 getitem_15 = wrapped_7[0]; wrapped_7 = None
45 getitem_23 = wrapped_9[0]; wrapped_9 = None
46 model_layers_0 = getattr(self.model.layers, "0")(getitem_19, attention_mask = getitem_15, position_ids = getitem_12, past_key_value = getitem_8, output_attentions = False, use_cache = False); getitem_19 = None
47 return {'getitem_8': getitem_8, 'getitem_10': getitem_10, 'getitem_12': getitem_12, 'getitem_15': getitem_15, 'getitem_23': getitem_23, 'model_layers_0': model_layers_0}
48
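
For reference, the inner `ValueError` is the message Python raises when an iterable with more than two items is unpacked into exactly two names. The sketch below reproduces that message in isolation. It assumes, purely for illustration, that the failing line inside `wrapped_1` unpacks something like a tensor shape; the autowrapped source is not shown in the traceback, so `unpack_two` and the example shapes are hypothetical.

```python
def unpack_two(shape):
    # Hypothetical stand-in for the failing line inside wrapped_1:
    # exactly two names on the left-hand side.
    batch_size, seq_len = shape
    return batch_size, seq_len


# A 2-element shape unpacks cleanly.
print(unpack_two((1, 128)))

# A 3-element shape (for example, an extra leading dimension) does not.
try:
    unpack_two((1, 1, 128))
except ValueError as exc:
    message = str(exc)
    print(message)
```

If the real cause is of this kind, comparing the shapes of the calibration inputs with what the model's own forward pass expects would be a reasonable first check.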


I inspected the args and kwargs passed to forward_fn: args holds the model (its architecture), and kwargs holds the model's two inputs. When I step into forward_fn with pdb, I cannot see the actual code that raises the error. How can I solve this problem?
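
One general reason pdb and tracebacks cannot display a line is that the code was compiled from a string under a filename that does not exist on disk, so there is nothing to read the source from. The sketch below shows a plain-Python way around that: register the generated source text in `linecache` under the same filename before compiling it. This is not an llm-compressor or torch.fx API; `compile_with_source`, the filename `hypothetical_autowrapped_source`, and the generated function are all made up for illustration.

```python
import linecache
import traceback


def compile_with_source(source, filename):
    """Compile `source` and register its text so tracebacks and pdb can show it."""
    lines = source.splitlines(keepends=True)
    # An mtime of None marks the entry as not backed by a real file,
    # so linecache.checkcache() leaves it in place.
    linecache.cache[filename] = (len(source), None, lines, filename)
    return compile(source, filename, "exec")


# Hypothetical generated code standing in for an autowrapped function.
generated = (
    "def wrapped(shape):\n"
    "    batch, seq_len = shape\n"
    "    return batch, seq_len\n"
)

namespace = {}
exec(compile_with_source(generated, "hypothetical_autowrapped_source"), namespace)

try:
    namespace["wrapped"]((1, 1, 128))
except ValueError:
    # The traceback now includes the failing source line.
    report = traceback.format_exc()
    print(report)
```

With the source registered this way, the printed traceback contains the text of the failing line, and pdb's `list` command can display it as well. Whether the same hook can be applied to the autowrapped code here depends on how the library compiles it, which the traceback does not show.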

### ๐Ÿ› ๏ธ Steps to reproduce

_No response_
