
importing bitsandbytes results in "No module named 'triton.ops'" (gfx1101 + Fedora 40) #202

Closed
gui710 opened this issue Jan 24, 2025 · 8 comments


@gui710

gui710 commented Jan 24, 2025

Hey, I "just" finished compiling everything a couple of days ago, and from what I've tried, most of it seems to be working.
I tried manually importing bitsandbytes with import bitsandbytes, but it results in an error.
Sorry if I'm missing something obvious, and thank you in advance.

Python 3.11.9 (tags/v3.11.9-dirty:de54cf5be37, Jan 21 2025, 21:03:50) [GCC 14.2.1 20240912 (Red Hat 14.2.1-3)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import bitsandbytes
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/rocm_sdk_612/lib/python3.11/site-packages/bitsandbytes/__init__.py", line 15, in <module>
    from .nn import modules
  File "/opt/rocm_sdk_612/lib/python3.11/site-packages/bitsandbytes/nn/__init__.py", line 17, in <module>
    from .triton_based_modules import (
  File "/opt/rocm_sdk_612/lib/python3.11/site-packages/bitsandbytes/nn/triton_based_modules.py", line 7, in <module>
    from bitsandbytes.triton.int8_matmul_mixed_dequantize import (
  File "/opt/rocm_sdk_612/lib/python3.11/site-packages/bitsandbytes/triton/int8_matmul_mixed_dequantize.py", line 12, in <module>
    from triton.ops.matmul_perf_model import early_config_prune, estimate_matmul_time
ModuleNotFoundError: No module named 'triton.ops'
>>> 
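As an aside, code that depends on this module can guard the import instead of crashing. This is a minimal sketch, not bitsandbytes' actual code; the `HAS_MATMUL_PERF_MODEL` flag name is hypothetical:

```python
# Guarded import: triton.ops was removed in newer Triton releases,
# so degrade gracefully instead of failing at import time.
try:
    from triton.ops.matmul_perf_model import early_config_prune, estimate_matmul_time
    HAS_MATMUL_PERF_MODEL = True
except ImportError:  # covers both a missing triton and a missing triton.ops
    early_config_prune = estimate_matmul_time = None
    HAS_MATMUL_PERF_MODEL = False
```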
@lamikr
Owner

lamikr commented Jan 24, 2025

Thanks for the feedback! I will also try that a little later when I have more time. I am not sure whether I have actually tried to use bitsandbytes myself, so it's possible that something does not get installed.

@gui710
Author

gui710 commented Jan 24, 2025

I know I've compiled a working version of BNB for this GPU on a previous Fedora 39 installation.
I'll see if I still have anything regarding BNB lying around, hopefully a previously compiled .whl or something.

@lamikr
Owner

lamikr commented Jan 25, 2025

I have tried the following test app:

from transformers import LlamaForCausalLM
from transformers import BitsAndBytesConfig

model = 'facebook/opt-350m'
model = LlamaForCausalLM.from_pretrained(model, quantization_config=BitsAndBytesConfig(load_in_8bit=True))

When I run it, it can import the BitsAndBytesConfig but then it will throw an error when executing the last line:

ModuleNotFoundError: No module named 'triton.ops'

Do you get the same error? Or could you give me some other example to test? I remember seeing this "triton.ops" error sometime earlier as well, but I thought I had resolved it. It could be some kind of dependency error (wrong version) between Python libraries.

@lamikr
Owner

lamikr commented Jan 25, 2025

I think I managed to hack it to work, at least with my test code above. The problem seems to be that PyTorch 2.4 and bitsandbytes are not fully compatible with the new Triton versions. I noticed two problems:

  1. hints.py from PyTorch received TypeError('must be called with a dataclass type or instance') but failed to catch and handle it properly. I can fix that easily by adding a proper except clause to the try/except block that already exists there.
  File "/opt/rocm_sdk_612/lib/python3.11/site-packages/torch/_inductor/runtime/hints.py", line 36, in <module>
    attr_desc_fields = {f.name for f in fields(AttrsDescriptor)}
                                        ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/rocm_sdk_612/lib/python3.11/dataclasses.py", line 1246, in fields
    raise TypeError('must be called with a dataclass type or instance') from None
TypeError: must be called with a dataclass type or instance
  2. bitsandbytes tries to use the triton.ops package, which is no longer available. The model from there seems to have been moved to the triton.kernels project. I was able to hack this to work temporarily by changing 2 bitsandbytes files:

/opt/rocm_sdk_612/lib/python3.11/site-packages/bitsandbytes/triton/int8_matmul_rowwise_dequantize.py

import torch

def int8_matmul_rowwise_dequantize(a, b, state_x, state_w, bias):
    return None

and

/opt/rocm_sdk_612/lib/python3.11/site-packages/bitsandbytes/triton/int8_matmul_mixed_dequantize.py

import torch

def int8_matmul_mixed_dequantize(a, b, state_x, state_w, bias):
    return None

I also tested with the latest upstream bitsandbytes version by building it from the multi-backend-refactor branch, which has AMD support, and verified that it has the same problem.

I will now try to do a proper/better fix for these problems in rocm_sdk_builder by adding patches later today.

@lamikr
Owner

lamikr commented Jan 25, 2025

I also filed a bug upstream with bitsandbytes about this error:

bitsandbytes-foundation/bitsandbytes#1492

lamikr added a commit that referenced this issue Jan 26, 2025
 File "/opt/rocm_sdk_612/lib/python3.11/dataclasses.py", line 1246, in fields
    raise TypeError('must be called with a dataclass type or instance') from None
TypeError: must be called with a dataclass type or instance

fixes: #202

Signed-off-by: Mika Laitio <[email protected]>
lamikr added a commit that referenced this issue Jan 26, 2025
- newest Triton versions no longer have
  triton.ops.matmul_perf_model, so this checks its existence
  in addition to checking for Triton itself

fixes: #202

Signed-off-by: Mika Laitio <[email protected]>
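The existence check described in the commit message could look roughly like this. A sketch only, not the actual patch; the function name is hypothetical:

```python
import importlib

def matmul_perf_model_available() -> bool:
    """True only when Triton is installed AND still ships triton.ops.matmul_perf_model."""
    try:
        importlib.import_module("triton.ops.matmul_perf_model")
        return True
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return False
```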
@lamikr lamikr closed this as completed in 967bdcb Jan 26, 2025
@lamikr
Owner

lamikr commented Jan 26, 2025

I pushed the fix if you want to test. You should get it by running

./babs.sh -up
./babs.sh -b

@lamikr lamikr reopened this Jan 26, 2025
@gui710
Author

gui710 commented Jan 27, 2025

Hey, sorry for the late reply.
Upon checking, it seems your fix solved the error.
I haven't been able to give it a proper try yet; maybe later tonight.
Thank you very much for looking into it so quickly!

Best Regards

@lamikr
Owner

lamikr commented Jan 27, 2025

No problem, I will close this now. Thanks for pointing this out; if you notice any other issues, let me know.

@lamikr lamikr closed this as completed Jan 27, 2025