Fix type error in extract_lora.py: SVD only supports float32 #510
base: main
Conversation
extract_lora.py: SVD only supports float32
Thank you - this worked perfectly.

Quick update: this works great, but on CPU I get a Blue Screen of Death after running it (overheating); the compute load is too high. With llama.cpp + quantizing I can set the maximum number of cores to use, which fixes the issue. It would be great if Mergekit had this option - if it already does, please advise. I find slight differences between CPU and GPU math, with CPU preferred even if it takes longer. This would be especially useful for mergekit-yaml, where I need to use "--cuda" - otherwise too many cores activate on the CPU and cause the Blue Screen of Death. (The other option is to pause the build, let it cool, continue, cool, continue.)

Example: when making MoEs I usually do these in float32, because I have found the MoEs operate better when the source and "master" GGUF are both in float32, regardless of the MoE source models' precision.

Also, is there any documentation about LoRA extract's new options? I found "save" and a few others, but from the technical papers it is really hard to gauge whether I should use or set "--distribute-scale" and "--sv-epsilon FLOAT".

Thanks - Mergekit is fantastic. 1000+ models built and counting with it.
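As a general PyTorch-level workaround for the CPU load described above (not a mergekit flag - the thread counts here are purely illustrative), the number of CPU threads can be capped before the extraction runs:

```python
import os

# OpenMP thread limits must be set before torch is imported.
os.environ.setdefault("OMP_NUM_THREADS", "4")

import torch

torch.set_num_threads(4)           # cap intra-op parallelism
torch.set_num_interop_threads(4)   # cap inter-op parallelism
```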
Running extract_lora.py on any model that does not use float32 as its default tensor type currently results in a RuntimeError. This is a significant limitation, as many recent models default to bfloat16 or float16 (half precision). The error arises from the use of the torch.linalg.svd API, which only supports torch.float32. This pull request (PR) addresses the issue by converting tensors to float32 (full precision) before executing torch.linalg.svd, and then converting them back to their original data type afterwards.
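For illustration, a minimal sketch of the dtype round-trip described above (this is not the actual code in extract_lora.py; the function and variable names are hypothetical):

```python
import torch

def svd_lora_factors(weight: torch.Tensor, rank: int):
    """Decompose a weight matrix into low-rank LoRA factors.

    torch.linalg.svd does not accept half-precision inputs, so tensors
    stored as bfloat16/float16 are upcast to float32 for the decomposition
    and the resulting factors are cast back to the original dtype.
    """
    original_dtype = weight.dtype

    # Upcast to full precision before the decomposition.
    u, s, vh = torch.linalg.svd(weight.to(torch.float32), full_matrices=False)

    # Truncate to the requested LoRA rank.
    u, s, vh = u[:, :rank], s[:rank], vh[:rank, :]

    # Fold the singular values into one factor and restore the original dtype.
    lora_up = (u * s).to(original_dtype)    # shape: (out_features, rank)
    lora_down = vh.to(original_dtype)       # shape: (rank, in_features)
    return lora_up, lora_down
```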