Commit f525499

Update NVFP4 default observer (#493)
* Update NVFP4 default observer
  - Use the static minmax observer instead of the running minmax observer.
  - Seems to work better for Qwen3 VL MoE.
  - Still needs to be rerun on Llama3 8b.
* Update quant_scheme.py
1 parent df6fd15 commit f525499

1 file changed: +2 -0 lines changed

src/compressed_tensors/quantization/quant_scheme.py

Lines changed: 2 additions & 0 deletions
@@ -172,6 +172,7 @@ def is_preset_scheme(name: str) -> bool:
         symmetric=True,
         dynamic=False,
         group_size=16,
+        observer="static_minmax",
     ),
     input_activations=QuantizationArgs(
         num_bits=4,
@@ -180,6 +181,7 @@ def is_preset_scheme(name: str) -> bool:
         symmetric=True,
         dynamic=DynamicType.LOCAL,
         group_size=16,
+        observer="static_minmax",
     ),
 )
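
The commit message contrasts a "running" minmax with a "static" minmax, but the observer implementations are not part of this diff. The sketch below illustrates one plausible reading, under the assumption that "running" means an exponential moving average of per-batch extrema and "static" means the extrema over all observed batches. The function names `running_minmax` and `static_minmax` and the `averaging_constant` parameter are hypothetical and are not taken from the library.

```python
from typing import Iterable, Optional, Sequence, Tuple


def running_minmax(
    batches: Iterable[Sequence[float]], averaging_constant: float = 0.01
) -> Tuple[float, float]:
    """Exponential moving average of per-batch min/max (assumed 'running' behavior)."""
    lo: Optional[float] = None
    hi: Optional[float] = None
    for batch in batches:
        b_lo, b_hi = min(batch), max(batch)
        if lo is None or hi is None:
            # The first batch initializes the running statistics.
            lo, hi = b_lo, b_hi
        else:
            # Move each statistic a fraction of the way toward the new batch value.
            lo += averaging_constant * (b_lo - lo)
            hi += averaging_constant * (b_hi - hi)
    if lo is None or hi is None:
        raise ValueError("no batches observed")
    return lo, hi


def static_minmax(batches: Iterable[Sequence[float]]) -> Tuple[float, float]:
    """Global min/max over every observed batch (assumed 'static' behavior)."""
    lo: Optional[float] = None
    hi: Optional[float] = None
    for batch in batches:
        b_lo, b_hi = min(batch), max(batch)
        lo = b_lo if lo is None else min(lo, b_lo)
        hi = b_hi if hi is None else max(hi, b_hi)
    if lo is None or hi is None:
        raise ValueError("no batches observed")
    return lo, hi


# Toy calibration data: the second batch contains an outlier (6.0).
batches = [[-1.0, 0.5, 1.0], [-0.5, 0.2, 6.0], [-1.2, 0.1, 0.9]]
run_lo, run_hi = running_minmax(batches, averaging_constant=0.5)
sta_lo, sta_hi = static_minmax(batches)
print("running:", (run_lo, run_hi))
print("static: ", (sta_lo, sta_hi))
```

Under these assumptions the static variant always covers the full observed range, while the running variant smooths the outlier away. Which one gives better quantization scales depends on the model, which is consistent with the commit's note that the change still needs to be rerun on Llama3 8b.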
