ROCM support #3279
|
windows support may also be possible but i would need some help testing this as i do not have a windows machine |
|
docs changes:
diff --git a/get-started/installing-+-updating/pip-install.md b/get-started/installing-+-updating/pip-install.md
index c1f0975..5f66dbf 100644
--- a/get-started/installing-+-updating/pip-install.md
+++ b/get-started/installing-+-updating/pip-install.md
@@ -24,6 +24,16 @@ pip uninstall unsloth unsloth_zoo -y && pip install --no-deps git+https://github
If you're installing Unsloth in Jupyter, Colab, or other notebooks, be sure to prefix the command with `!`. This isn't necessary when using a terminal
+**To install Unsloth on AMD GPUs:**
+
+{% hint style="info" %}
+You can safely ignore errors about CUDA not being linked properly if you are installing Unsloth on AMD GPUs.
+{% endhint %}
+
+```bash
+pip install "unsloth[rocm64-torch280]"
+```
+
## Uninstall + Reinstall
If you're still encountering dependency issues with Unsloth, many users have resolved them by forcing uninstalling and reinstalling Unsloth:
diff --git a/get-started/beginner-start-here/unsloth-requirements.md b/get-started/beginner-start-here/unsloth-requirements.md
index 793bd63..b5f5429 100644
--- a/get-started/beginner-start-here/unsloth-requirements.md
+++ b/get-started/beginner-start-here/unsloth-requirements.md
@@ -8,7 +8,7 @@ description: Here are Unsloth's requirements including system and GPU VRAM requi
* **Operating System**: Works on Linux and Windows.
* Supports NVIDIA GPUs since 2018+ including [Blackwell RTX 50](../../basics/training-llms-with-blackwell-rtx-50-series-and-unsloth) series. Minimum CUDA Capability 7.0 (V100, T4, Titan V, RTX 20, 30, 40, A100, H100, L40 etc) [Check your GPU!](https://developer.nvidia.com/cuda-gpus) GTX 1070, 1080 works, but is slow.
-* Unsloth should work on [AMD](https://github.com/unslothai/unsloth/pull/2520) and [Intel](https://github.com/unslothai/unsloth/pull/2621) GPUs! Apple/Silicon/MLX is in the works.
+* Unsloth should work on [AMD](../installing-+-updating/pip-install#amd-installation) and [Intel](https://github.com/unslothai/unsloth/pull/2621) GPUs! Apple/Silicon/MLX is in the works.
* If you have different versions of torch, transformers etc., `pip install unsloth` will automatically install all the latest versions of those libraries so you don't need to worry about version compatibility.
* Your device must have `xformers`, `torch`, `BitsandBytes` and `triton` support.
|
|
seems like 4bit exporting has some issues as 64 blocksize is not supported with rocm (ROCm/bitsandbytes#10), it is possible to have 64 blocksize though depending on warp size so i will look into submitting a pr to bitsandbytes |
|
i have found a likely solution, if it works maybe i can switch over the builds to my fork until its merged in so 4bit works |
|
marking as draft until i get this issue fixed as it is fairly major |
|
pr created: bitsandbytes-foundation/bitsandbytes#1748 |
|
should work now, testing changes |
|
works |
|
Works great on AMD MI100. I added this to my vllm Dockerfile and it just worked.
RUN git clone --recurse https://github.com/ROCm/bitsandbytes && cd bitsandbytes && git checkout rocm_enabled_multi_backend && pip install -r requirements-dev.txt && cmake -DCOMPUTE_BACKEND=hip -S . && make -j && pip install .
RUN git clone https://github.com/electron271/unsloth-rocm.git && cd unsloth-rocm && pip install .
RUN pip install unsloth_zoo
Thanks |
great to hear! you also shouldn't need to use the rocm fork of bitsandbytes (afaik), this branch will install rocm supported bitsandbytes as a dependency and if you want to manually install it was merged into main so you can use main bitsandbytes |
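To confirm which backend an environment actually ended up with after an install like the one above, a quick check along these lines may help. It is a minimal sketch: `describe_gpu_backend` is a hypothetical helper name, not part of Unsloth or bitsandbytes. It relies only on `torch.version.hip` and `torch.version.cuda`, which PyTorch sets to a version string on ROCm and CUDA builds respectively and to `None` otherwise.

```python
import importlib.util


def describe_gpu_backend() -> dict:
    """Report whether torch/bitsandbytes are importable and which backend torch was built for.

    Hypothetical helper for illustration; not an Unsloth or bitsandbytes API.
    """
    info = {
        "torch_installed": False,
        "backend": None,
        # find_spec only checks that the package can be located; it does not import it.
        "bitsandbytes_installed": importlib.util.find_spec("bitsandbytes") is not None,
    }
    if importlib.util.find_spec("torch") is None:
        return info

    import torch

    info["torch_installed"] = True
    hip_version = getattr(torch.version, "hip", None)
    cuda_version = getattr(torch.version, "cuda", None)
    if hip_version:
        info["backend"] = f"rocm {hip_version}"
    elif cuda_version:
        info["backend"] = f"cuda {cuda_version}"
    else:
        info["backend"] = "cpu"
    return info


if __name__ == "__main__":
    print(describe_gpu_backend())
```

On a ROCm build of PyTorch the `backend` entry should start with `rocm`; if it starts with `cuda` or is `cpu`, the installed torch wheel is not the ROCm one.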
|
I ran |
4bit is broken on CDNA gpus as they do not support 64 block size, i am unaware if there is a solution or not |
|
Hi @electron271, glad to see this fabulous contribution for AMD GPUs. Let me help verify it on more kinds of devices; I hope we can collaborate on this. |
|
I like how the patch gives end users a fresh prebuilt bnb binary directly. Somehow this does not work in some environments
Co-authored-by: billishyahao <[email protected]>
i think a dockerfile would be beneficial for systems that don't support this. this error is caused by having an out-of-date system; the minimum usable version of gcc is GCC 13.2, released July 27, 2023. i will note that i had a lot of issues with dockerized rocm when i was initially trying to get unsloth working on rocm, so i'm not sure i will be able to help with it. |
|
the upstream bitsandbytes pr should hopefully be able to be merged soon |
|
Hi @electron271 With that said, the official bitsandbytes wheels we build and will eventually publish are compatible with Ubuntu 22.04 (and other supported systems with glibc>=2.24). I am going to go ahead and merge that PR on bitsandbytes soon; we'll drop the ROCm 6.1 build and keep 6.2/6.3/6.4/7.0. We still need to add the RDNA4/CDNA4 build targets (RX 9070/9060, MI350X/MI355X), and need to keep in mind that while this can enable blocksize 64 on RDNA (consumer) it won't for CDNA (datacenter). |
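The warp-size dependence described in the comment above can be sketched as a small lookup. This is only an illustration: `supported_blocksizes` and `pick_blocksize` are hypothetical helpers, not bitsandbytes or Unsloth APIs. The mapping used here (warp size 32 on RDNA, 64 on CDNA, and a minimum blocksize of 128 when the warp size is 64) and the list of candidate blocksizes are assumptions based on this thread.

```python
from typing import Tuple

# Assumed candidate blocksizes for blockwise 4-bit quantization (illustrative only).
CANDIDATE_BLOCKSIZES: Tuple[int, ...] = (4096, 2048, 1024, 512, 256, 128, 64)


def supported_blocksizes(warp_size: int) -> Tuple[int, ...]:
    """Return the blocksizes assumed usable for a given GPU warp (wavefront) size.

    Assumption from the thread: blocksize 64 is only usable when the warp size is 32
    (RDNA, consumer); with warp size 64 (CDNA, datacenter) the minimum is 128.
    """
    if warp_size not in (32, 64):
        raise ValueError(f"unexpected warp size: {warp_size}")
    if warp_size == 64:
        return tuple(b for b in CANDIDATE_BLOCKSIZES if b >= 128)
    return CANDIDATE_BLOCKSIZES


def pick_blocksize(requested: int, warp_size: int) -> int:
    """Return `requested` if supported, else the smallest supported blocksize above it."""
    supported = supported_blocksizes(warp_size)
    if requested in supported:
        return requested
    larger = [b for b in supported if b > requested]
    if not larger:
        raise ValueError(f"no supported blocksize >= {requested} for warp size {warp_size}")
    return min(larger)


if __name__ == "__main__":
    print(pick_blocksize(64, warp_size=32))  # RDNA-style device keeps the requested size
    print(pick_blocksize(64, warp_size=64))  # CDNA-style device falls back to a larger size
```

Falling back to a larger blocksize changes the quantization layout, so a checkpoint exported with blocksize 64 would not be interchangeable with one produced under the fallback; the sketch only shows the selection logic.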
done, my bitsandbytes builds are temporarily broken though as i reached maximum git lfs bandwidth and the limit resets in ~30 days. will think of a potential solution |


currently im using my own github actions builds of bitsandbytes as the main bitsandbytes builds have multiple issues with rocm (not supporting all architectures, and the ones i mentioned in the repo https://github.com/electron271/bitsandbytes-index)
once bitsandbytes-foundation/bitsandbytes#1519 is fixed this can be changed
closes #37