Remove separate NO_CUBLASLT build. #1103
Conversation
// TODO: Check overhead. Maybe not worth it; just check in Python lib once,
// and avoid calling lib functions w/o support for them.
// TODO: Address GTX 1660, any other 7.5 devices maybe not supported.
inline bool igemmlt_supported() {
    int device;
    int ccMajor;

    CUDA_CHECK_RETURN(cudaGetDevice(&device));
    CUDA_CHECK_RETURN(cudaDeviceGetAttribute(&ccMajor, cudaDevAttrComputeCapabilityMajor, device));

    if (ccMajor >= 8)
        return true;

    if (ccMajor < 7)
        return false;

    int ccMinor;
    CUDA_CHECK_RETURN(cudaDeviceGetAttribute(&ccMinor, cudaDevAttrComputeCapabilityMinor, device));

    return ccMinor >= 5;
}
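The first TODO hints at caching the result so the attribute queries run only once. As a rough sketch (not part of this PR, and assuming the igemmlt_supported() shown above), the check could be memoized with a function-local static; note this assumes the process sticks to one CUDA device, so it would need rethinking for multi-GPU use:

// Sketch only (not from this PR): memoize the capability check so
// cudaGetDevice/cudaDeviceGetAttribute run once per process instead of on
// every call. Assumes the current CUDA device does not change afterwards.
inline bool igemmlt_supported_cached() {
    static const bool supported = igemmlt_supported();
    return supported;
}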
Using this as more of a sanity check right now. I'd expect we wouldn't be calling transform, spmm_coo, or igemmlt with devices that don't support them, but I haven't verified this. In particular, the spmm_coo function is the one I am not so sure about.

If you want, I have a GTX 1070 (under WSL2, works surprisingly well) I can test on.
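For illustration, a caller-side guard in the spirit of that comment could look like the sketch below; the names here are hypothetical and not the actual bitsandbytes API:

// Hypothetical sketch (illustrative names, not the real API): callers pick
// the cublasLt int8 tensor-core path only when igemmlt_supported() reports
// a capable device, otherwise they fall back to the non-tensor-core path.
enum class Int8Backend { CublasLt, Fallback };

inline Int8Backend pick_int8_backend() {
    return igemmlt_supported() ? Int8Backend::CublasLt : Int8Backend::Fallback;
}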
Hey @matthewdouglas @akx, Tim mentioned that it's probably not safe to remove this. What's your opinion on this? How can we be certain what's what? Currently, I'm not sure how best to proceed.
Did Tim say why it's "probably not safe"? Do we know of an actual situation where cublasLt isn't available? Is such a situation something we want to support?
I'm curious too, but I think there might also just be a naming issue here, since cublasLt has shipped with the CUDA Toolkit since v10.1. It could have been placed in some unusual spots, but by the time toolkit 11.0 comes around it's not an issue and we should always be able to link to it. PyTorch binaries ship with it, and if I'm not mistaken, libcublas.so itself depends on libcublasLt.so these days.

The main differentiator here is support for int8 tensor cores (e.g. the check for compute capability >= 7.5), so we would have to make sure not to call those paths on devices that don't support them. Some places where cublasLt is used:

Separately, I believe I remember reading somewhere that there is intent to eventually deprecate the int8 matmul path that does not use tensor cores as well (F.igemm, MatMul8bit, and also F.vectorwise_quant, F.vectorwise_mm_dequant).
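A quick way to sanity-check the "always able to link to it" point above is a tiny standalone program against cublasLt. This is my own sketch, not something from the PR; if it builds with -lcublasLt and runs, the library is present:

// Standalone sanity check (not part of the PR): builds and links only if
// cublasLt is available, e.g. nvcc check_cublaslt.cu -lcublasLt
#include <cublasLt.h>
#include <cstdio>

int main() {
    std::printf("cublasLt version: %zu\n", cublasLtGetVersion());

    cublasLtHandle_t handle;
    if (cublasLtCreate(&handle) == CUBLAS_STATUS_SUCCESS) {
        std::printf("cublasLt handle created\n");
        cublasLtDestroy(handle);
    }
    return 0;
}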
I'll try to get in touch with Tim to get more info from him and relay the new info you provided. Unfortunately, he didn't give any reasoning at the time. He's quite unavailable atm, so it might take a few days. Thanks @matthewdouglas for this thorough and knowledgeable analysis; this was once again very helpful!
Force-pushed from 9b72679 to 7800734.
Superseded by #1401.
This PR removes the build option NO_CUBLASLT. It additionally removes the runtime check to load the separate nocublaslt variants of the library.

Reasoning:
So far I've only tested this on RTX 3060. I do have access to a machine with a GTX 1660, so I'll try to test on that too.
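For context, the kind of compile-time switch a NO_CUBLASLT option usually implies looks roughly like the guard below; this is an assumption for illustration, not code copied from the repository. Dropping the option means the cublasLt-backed code is always compiled in and gated at runtime by igemmlt_supported() instead of by a second nocublaslt binary.

// Assumed-for-illustration guard, not copied from the repo: with a separate
// NO_CUBLASLT build, cublasLt-dependent code is compiled out entirely.
// Removing the option means this is always built and the decision moves to
// the runtime igemmlt_supported() check.
#ifndef NO_CUBLASLT
#include <cublasLt.h>
#define HAS_CUBLASLT 1
#else
#define HAS_CUBLASLT 0
#endif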