
Enable double quant on Intel CPU and XPU #1472

Conversation

jiqing-feng
Contributor

Enable double quant in the 4-bit implementation for Intel CPU and XPU.
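
For context, a minimal user-facing sketch (not part of the PR itself) of how double quant is requested through the standard bitsandbytes flag via transformers' BitsAndBytesConfig; the model id below is just a placeholder:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,  # the feature this PR enables on CPU/XPU
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Placeholder model id; any causal LM supported by bitsandbytes works.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",
    quantization_config=bnb_config,
)
```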

@jiqing-feng
Contributor Author

jiqing-feng commented Jan 9, 2025

Hi @Titus-von-Koeller. I enabled double quant in the 4-bit implementation for Intel CPU/XPU and checked the results and performance. Where should I add a test for it? Thanks!
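
For reference, a round-trip test of the double-quant path might look roughly like the sketch below. This is a hedged illustration, not the PR's actual test, and the tolerance is a guess; `quantize_4bit`/`dequantize_4bit` are the existing bitsandbytes functional API, with `compress_statistics=True` selecting double quant.

```python
import torch
import bitsandbytes.functional as F

def test_nf4_double_quant_roundtrip(device: str = "cpu"):
    A = torch.randn(1024, 1024, device=device)
    # compress_statistics=True enables double quant (quantized absmax stats).
    qA, state = F.quantize_4bit(A, quant_type="nf4", compress_statistics=True)
    out = F.dequantize_4bit(qA, state)
    # Tolerance is illustrative; blockwise NF4 error on N(0, 1) data is small.
    assert (A - out).abs().mean().item() < 0.1
```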

@jiqing-feng jiqing-feng marked this pull request as draft January 9, 2025 11:49
Signed-off-by: jiqing-feng <[email protected]>
@jiqing-feng jiqing-feng marked this pull request as ready for review January 10, 2025 02:32
@Titus-von-Koeller
Collaborator

Thanks @jiqing-feng!

We'll get back to you about this soon.

@jiqing-feng
Contributor Author

jiqing-feng commented Jan 20, 2025

Hi @Titus-von-Koeller. I made some new changes to this PR: they fix the 4-bit data format and align it with CUDA. For more details:

In CUDA, 4-bit values are packed into a uint8 tensor with the first value in the high (left) nibble: [1, 2] packs to 18, because 1 = 0b0001, 2 = 0b0010, and 0b00010010 is 18.
On CPU and XPU, the same pair used to pack to 0b00100001 = 33, because we put the first value into the low (right) nibble to be compatible with our IPEX API.

In this PR, we keep the 4-bit format on CPU/XPU the same as on CUDA, and convert it to the IPEX-compatible format only when initializing the IPEX linear layer, as sketched below.
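
To illustrate the two layouts and the conversion (a minimal sketch, not the PR's actual code; `pack_cuda` and `cuda_to_ipex` are hypothetical helper names):

```python
import torch

def pack_cuda(vals: torch.Tensor) -> torch.Tensor:
    """Pack pairs of 4-bit values, first value in the high (left) nibble."""
    return ((vals[0::2] << 4) | vals[1::2]).to(torch.uint8)

def cuda_to_ipex(packed: torch.Tensor) -> torch.Tensor:
    """Swap nibbles so the first value sits in the low (right) nibble."""
    return (((packed & 0x0F) << 4) | (packed >> 4)).to(torch.uint8)

vals = torch.tensor([1, 2], dtype=torch.uint8)
packed = pack_cuda(vals)            # 0b00010010 (CUDA layout)
print(packed.item())                # 18
print(cuda_to_ipex(packed).item())  # 33, i.e. 0b00100001 (IPEX layout)
```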

With this change, we can run a pre-quantized model like hugging-quants/Meta-Llama-3.1-8B-Instruct-BNB-NF4.
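
For reference, a hedged loading sketch with transformers (standard API; the prompt and generation arguments are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hugging-quants/Meta-Llama-3.1-8B-Instruct-BNB-NF4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# The quantization config is read from the checkpoint; with this PR the
# NF4 weights can be loaded and run on CPU (or XPU) as well.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cpu")

inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```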

Please let me know if I didn't make it clear. Also, the PR passes all tests in transformers. Thanks!

@jiqing-feng jiqing-feng marked this pull request as draft January 20, 2025 07:50
@jiqing-feng
Contributor Author

jiqing-feng commented Jan 20, 2025

XPU has a performance issue; I will figure it out.

Signed-off-by: jiqing-feng <[email protected]>
@jiqing-feng jiqing-feng marked this pull request as ready for review January 21, 2025 01:45
@jiqing-feng jiqing-feng marked this pull request as draft January 21, 2025 01:48
@jiqing-feng jiqing-feng marked this pull request as ready for review January 21, 2025 01:58
@jiqing-feng
Contributor Author

jiqing-feng commented Jan 21, 2025

> XPU has a performance issue; I will figure it out.

The XPU issue has been fixed.
Hi @Titus-von-Koeller, the PR is ready for review. It passes all transformers tests, and I have also verified it on some generation and LoRA fine-tuning tasks.

@matthewdouglas matthewdouglas self-requested a review January 21, 2025 02:11
@matthewdouglas
Member

Thanks @jiqing-feng! Really appreciate the effort to keep the serialized format compatible :)

I'll take care of reviewing more closely through this week but it looks good at first pass!

Signed-off-by: jiqing-feng <[email protected]>
@matthewdouglas matthewdouglas merged commit f6025bc into bitsandbytes-foundation:multi-backend-refactor Jan 22, 2025
2 checks passed