Closed
Labels
enhancement, question, svdquant
Description
During the PTQ process, only one GPU is used, even though I have set 4 GPUs in the config file (I have 4 GPUs installed).
It takes almost 52 hours to quantize my 8B FLUX model (Step 3 in your README) on an L40S GPU. Is this normal?
Can the PTQ process be accelerated by quantizing blocks in parallel on different GPUs?
Thanks.