bitsandbytes searching *cuda*so only in /usr/local and does not support other paths #1140

Open
Parvez-Khan-1 opened this issue Mar 21, 2024 · 4 comments


@Parvez-Khan-1

Parvez-Khan-1 commented Mar 21, 2024

System Info

Operating System: Oracle Linux 7.9
Python Version: 3.10.10
GPU: NVIDIA A100
CUDA: 12.3
bitsandbytes: 0.42.0

Reproduction

To reproduce this issue:

  • Have CUDA installed somewhere other than /usr/local/
  • Export the environment variables below, then run the diagnostic command:
export PATH=/some_dir/cuda-12.3/bin:$PATH
export LD_LIBRARY_PATH=/some_dir/cuda-12.3/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/some_dir/cuda-12.3/    # or: export CUDA_PATH=/some_dir/cuda-12.3/
python -m bitsandbytes

Output:

python -m bitsandbytes
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++ BUG REPORT INFORMATION ++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Traceback (most recent call last):
  File "/sys_apps_01/python310/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/sys_apps_01/python310/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/dir1/venv/lib/python3.10/site-packages/bitsandbytes/__main__.py", line 96, in <module>
    generate_bug_report_information()
  File "/dir1/venv/lib/python3.10/site-packages/bitsandbytes/__main__.py", line 54, in generate_bug_report_information
    paths = find_file_recursive('/usr/local', '*cuda*so')
  File "/dir1/venv/lib/python3.10/site-packages/bitsandbytes/__main__.py", line 37, in find_file_recursive
    raise RuntimeError('Something when wrong when trying to find file. Maybe you do not have a linux system?')
RuntimeError: Something when wrong when trying to find file. Maybe you do not have a linux system?

Expected behavior

Ideally, the diagnostic should not search only /usr/local. If CUDA_HOME or CUDA_PATH is set, it should also search that directory recursively for *cuda*so files.
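
The expected behavior could be sketched roughly as follows. This is a minimal illustration and not the bitsandbytes implementation: the function names `candidate_cuda_roots` and `find_cuda_libraries` are hypothetical, and the only details taken from the report are the `*cuda*so` pattern, the `/usr/local` default, and the `CUDA_HOME` / `CUDA_PATH` variables.

```python
import os
from pathlib import Path


def candidate_cuda_roots(environ=None):
    """Return search roots: CUDA_HOME and CUDA_PATH (if set), then /usr/local.

    Hypothetical helper for illustration; not part of bitsandbytes.
    """
    environ = os.environ if environ is None else environ
    roots = []
    for var in ("CUDA_HOME", "CUDA_PATH"):
        value = environ.get(var)
        if value:
            roots.append(Path(value))
    roots.append(Path("/usr/local"))

    # Drop duplicates while preserving order.
    unique = []
    for root in roots:
        if root not in unique:
            unique.append(root)
    return unique


def find_cuda_libraries(roots, pattern="*cuda*so"):
    """Recursively collect files matching `pattern` under each existing root."""
    found = []
    for root in roots:
        if not root.is_dir():
            continue  # skip a missing root instead of raising
        found.extend(sorted(p for p in root.rglob(pattern) if p.is_file()))
    return found


if __name__ == "__main__":
    for path in find_cuda_libraries(candidate_cuda_roots()):
        print(path)
```

Skipping a missing root rather than raising means a machine without /usr/local CUDA libraries would still get a report from whichever of the other roots exist.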

@Titus-von-Koeller
Collaborator

Titus-von-Koeller commented Mar 27, 2024

cc @akx @matthewdouglas

We should take this into account for one of our cuda_setup refactors.

Linking to #918

Thanks a lot for raising this, @Parvez-Khan-1! We'll look into it and report back once we get around to it.

@matthewdouglas
Member

I do agree - we would ideally want it to find the CUDA libraries even if they're in a non-standard path. Note that this command is just for diagnostics though, to aid in helping a user determine how to set LD_LIBRARY_PATH. That could use some reworking as most of the time the libraries are going to be shipped with the PyTorch binaries.

That said, this issue might not reproduce the same way in 0.43.0 (example here is 0.42.0).

Relates to #1126
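
To illustrate the diagnostic purpose mentioned above (helping a user work out how to set LD_LIBRARY_PATH), here is a small sketch that turns a list of discovered library files into a suggested export line. The helper name `suggest_ld_library_path` is hypothetical and this is not how bitsandbytes produces its report; it assumes a Linux-style `:` separator.

```python
import os


def suggest_ld_library_path(library_files, current=""):
    """Suggest an export line adding the libraries' directories to LD_LIBRARY_PATH.

    Hypothetical helper for illustration. Returns None when every directory
    is already listed in `current`.
    """
    existing = {os.path.normpath(entry) for entry in current.split(":") if entry}
    new_dirs = []
    for lib in library_files:
        directory = os.path.dirname(os.path.abspath(lib))
        if directory not in existing and directory not in new_dirs:
            new_dirs.append(directory)
    if not new_dirs:
        return None
    return "export LD_LIBRARY_PATH=" + ":".join(new_dirs) + ":$LD_LIBRARY_PATH"


if __name__ == "__main__":
    line = suggest_ld_library_path(
        ["/some_dir/cuda-12.3/lib64/libcudart.so"],
        os.environ.get("LD_LIBRARY_PATH", ""),
    )
    print(line or "LD_LIBRARY_PATH already covers the discovered libraries")
```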

@Parvez-Khan-1
Author

@matthewdouglas It seems there is support for a CUDA_PATHS variable in 0.43.0.

@fantasy-fish

I am using the latest version, but I still have the same issue.
