kv-quant-scheme TurboQuant is not working with MoE #904
Open
Description
When I run
mlx_vlm.generate \
  --model "mlx-community/gemma-4-26B-A4B-it" \
  --prompt "Your prompt here" \
  --kv-bits 3.5 \
  --kv-quant-scheme turboquant
I get this error:
pyenv-3/lib/python3.11/site-packages/mlx_vlm/turboquant.py", line 4318, in decode_attention
else keys_state.norms.shape[1]
^^^^^^^^^^^^^^^^
AttributeError: 'array' object has no attribute 'norms'
It seems that --kv-quant-scheme turboquant is unusable now.
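For triage, here is a minimal sketch of what the traceback suggests: decode_attention reads keys_state.norms, but the object it receives is a plain array with no such attribute. This is only an illustration under assumptions, not the actual mlx_vlm code. PlainArray, QuantizedState, seq_len_unguarded, and seq_len_guarded are hypothetical names, the choice of axis 1 in the fallback is a guess, and whether falling back to the array's own shape is the right fix for MoE layers is not confirmed.

```python
from dataclasses import dataclass


@dataclass
class PlainArray:
    """Hypothetical stand-in for an unquantized array: has a shape, no .norms."""
    shape: tuple


@dataclass
class QuantizedState:
    """Hypothetical stand-in for a quantized KV state that carries norms."""
    norms: PlainArray


def seq_len_unguarded(keys_state):
    # Mirrors the failing expression from the traceback:
    # it assumes keys_state is always a quantized state.
    return keys_state.norms.shape[1]


def seq_len_guarded(keys_state):
    # Use the norms when the state is quantized; otherwise fall back to the
    # array's own shape (assumption: axis 1 means the same thing in both).
    if hasattr(keys_state, "norms"):
        return keys_state.norms.shape[1]
    return keys_state.shape[1]


quantized = QuantizedState(norms=PlainArray(shape=(8, 128)))
plain = PlainArray(shape=(8, 128, 64))

print(seq_len_guarded(quantized))
print(seq_len_guarded(plain))

try:
    seq_len_unguarded(plain)
except AttributeError as exc:
    # Same class of error as in the report: the attribute is missing.
    print(exc)
```

If this matches what happens, the question for the maintainers is why some layers of this model hand an unquantized array to decode_attention in the first place.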