Description
Dear authors,
Thanks for the great help. I am currently trying to measure the performance of the SDXL base model.
First, I ran the following conversions (win-vulkan):
sd-cli.exe -M convert -m sd_xl_diffusion_base_x1.safetensors -o sd_xl_diffusion_base_x1_q8_0.gguf -v --type q8_0
sd-cli.exe -M convert -m sd_xl_diffusion_base_x1.safetensors -o sd_xl_diffusion_base_x1_q4_0.gguf -v --type q4_0
These commands successfully convert SDXL into q8_0 and q4_0 format.
I then compared the run times of the fp16, q8_0, and q4_0 models with:
sd-cli.exe -m sd_xl_diffusion_base_x1.safetensors -W 1024 -H 1024 -p "a lovely cat" --vae-tiling --vae-conv-direct --steps 7 -v
sd-cli.exe -m sd_xl_diffusion_base_x1_q8_0.gguf -W 1024 -H 1024 -p "a lovely cat" --vae-tiling --vae-conv-direct --type q8_0 -v --steps 7
sd-cli.exe -m sd_xl_diffusion_base_x1_q4_0.gguf -W 1024 -H 1024 -p "a lovely cat" --vae-tiling --vae-conv-direct --type q4_0 -v --steps 7
However, the run times of the three models are almost the same:
FP16: 64 sec, q8_0: 68 sec, q4_0: 63 sec
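To put these numbers in perspective, here is a minimal Python sketch that expresses each timing as a percentage difference from the FP16 run. The helper name `relative_to_baseline` is only illustrative, and the values are the single-run timings reported above, not averaged measurements.

```python
# Single-run wall-clock timings reported above, in seconds.
timings = {"fp16": 64.0, "q8_0": 68.0, "q4_0": 63.0}


def relative_to_baseline(timings, baseline="fp16"):
    """Return each timing as a percentage difference from the baseline.

    Positive values mean slower than the baseline, negative values faster.
    """
    base = timings[baseline]
    return {name: (t - base) / base * 100.0 for name, t in timings.items()}


for name, pct in relative_to_baseline(timings).items():
    print(f"{name}: {pct:+.2f}% vs fp16")
```

By this measure q8_0 is roughly 6% slower and q4_0 roughly 2% faster than FP16, so all three runs fall within a few percent of each other.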
Is this expected?
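Since each figure above comes from a single run, one way to check whether a 63 to 68 sec gap is larger than run-to-run noise is to repeat each command a few times. The following is a minimal sketch, not part of the project: `time_command` and `summarize` are hypothetical helper names, and the command in the example is a harmless placeholder to be replaced with one of the `sd-cli.exe` invocations above, split into a list of arguments. Note that it measures the wall-clock time of the whole process, which includes model loading and everything else the program does, not only the sampling steps.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run `cmd` (a list of arguments) `runs` times; return durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command exits with a non-zero status.
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (mean, sample standard deviation) of the durations."""
    mean = statistics.mean(durations)
    spread = statistics.stdev(durations) if len(durations) > 1 else 0.0
    return mean, spread


if __name__ == "__main__":
    # Placeholder command (an assumption for illustration); substitute the
    # real sd-cli.exe argument list to benchmark a model.
    cmd = [sys.executable, "-c", "pass"]
    mean, spread = summarize(time_command(cmd, runs=3))
    print(f"mean {mean:.2f} s, stdev {spread:.2f} s over 3 runs")
```

If the standard deviation across runs is comparable to the differences between the models, the three timings cannot be meaningfully ranked.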
My platform info is as follows:
Lunar Lake
CPU: Intel Ultra 268V
GPU: Intel(R) AI Boost
RAM: 32 GB
Thanks