
Enable dequant+matmul 8bit path for Intel CPU and XPU #1484

Merged

Conversation

jiqing-feng (Contributor) commented Jan 23, 2025

Hi @Titus-von-Koeller @matthewdouglas. This feature enables dequantizing the 8-bit weight and using float matmul. It speeds up LoRA fine-tuning by 3x on XPU and 2x on CPU, measured with the LoRA fine-tuning script on Llama-3-8B via the command `python olora_finetuning.py --base_model alokabhishek/Meta-Llama-3-8B-Instruct-bnb-8bit --init_lora_weights gaussian --seed 42 --torch_dtype bfloat16 --device_map cpu`.

All tests in transformers have passed; please review this PR. Thanks!
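The dequant+matmul path described above can be sketched in plain PyTorch: dequantize the row-wise int8 weight back to the activation dtype, then run an ordinary float GEMM instead of an int8 matmul kernel. This is a minimal illustration under assumed row-wise absmax quantization, not the PR's actual CPU/XPU implementation; the function names here are hypothetical.

```python
import torch

def quantize_rowwise(w: torch.Tensor):
    # Per-output-row absmax scaling to int8 (assumed quantization scheme).
    absmax = w.abs().amax(dim=1, keepdim=True).clamp(min=1e-8)
    q = torch.round(w / absmax * 127).to(torch.int8)
    return q, absmax

def dequant_matmul(x: torch.Tensor, q: torch.Tensor, absmax: torch.Tensor):
    # Dequantize the int8 weight to the activation dtype, then use a
    # plain float GEMM, which is fast on CPU/XPU.
    w = q.to(x.dtype) * (absmax.to(x.dtype) / 127)
    return x @ w.t()

torch.manual_seed(0)
w = torch.randn(64, 32)      # (out_features, in_features)
x = torch.randn(4, 32)       # small activation batch
q, scales = quantize_rowwise(w)
out = dequant_matmul(x, q, scales)
ref = x @ w.t()
```

The float GEMM here stands in for whatever fused kernel the backend dispatches to; the output matches the full-precision matmul up to int8 quantization error.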

Signed-off-by: jiqing-feng <[email protected]>

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

matthewdouglas (Member) commented

On the mainline branch, with the int8 refactoring done for v0.45.0, Linear8bitLt was simplified a bit. One of the optimizations made there was to do only the row-wise quantization for inference, while still doing the "double quant" with row/col stats for training.

That said, the decomposition into separate int8 and fp16 matmuls has some overhead, and I assume this is where the unsafe operations come from, particularly when threshold != 0.
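The threshold decomposition mentioned above can be sketched as follows: input-feature columns whose absmax exceeds the threshold are treated as outliers and multiplied in floating point, while the remaining columns go through the int8 path. This is a pure-PyTorch illustration of the LLM.int8()-style split, not the actual bitsandbytes kernels; the function name and quantization details are assumptions.

```python
import torch

def mixed_int8_matmul(x: torch.Tensor, w: torch.Tensor, threshold: float = 6.0):
    # Find outlier input features: columns of x with large magnitude.
    col_absmax = x.abs().amax(dim=0)
    outliers = col_absmax > threshold

    # fp path: a separate float matmul over the (few) outlier columns.
    out = x[:, outliers] @ w[:, outliers].t()

    # int8 path: row-wise absmax quantization of the remaining columns.
    xi, wi = x[:, ~outliers], w[:, ~outliers]
    sx = xi.abs().amax(dim=1, keepdim=True).clamp(min=1e-8)
    sw = wi.abs().amax(dim=1, keepdim=True).clamp(min=1e-8)
    qx = torch.round(xi / sx * 127).to(torch.int8)
    qw = torch.round(wi / sw * 127).to(torch.int8)

    # Integer matmul (int32 accumulation in a real kernel; int64 here for
    # a portable CPU sketch), then rescale back to the activation dtype.
    acc = (qx.to(torch.int64) @ qw.to(torch.int64).t()).to(x.dtype)
    return out + acc * (sx / 127) * (sw.t() / 127)

torch.manual_seed(0)
w = torch.randn(64, 32)
x = torch.randn(4, 32)
x[:, 5] *= 20                  # plant one outlier feature column
out = mixed_int8_matmul(x, w)
ref = x @ w.t()
```

Running two matmuls plus the masking and rescaling is exactly the overhead being discussed: when threshold != 0, every forward pass pays for the split even if few columns are outliers.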

Happy to merge; we can revisit doing int8 computation in the future if needed.

@matthewdouglas matthewdouglas self-requested a review January 28, 2025 16:30
@matthewdouglas matthewdouglas merged commit 307fbd5 into bitsandbytes-foundation:multi-backend-refactor Jan 28, 2025
2 checks passed