[BUG]: cublasmp dependencies are not reflected in supported_nvidia_libs.py #1116

@rwgk

Description

Discovered by chance while testing on a workstation that did not have the CUDA driver installed:

  • libcublasmp.so.0 is the only supported lib that requires libcuda.so.1, so test_load_nvidia_dynamic_lib.py failed when the driver was not installed.
  • To double-check, I ran ldd on it (output below), which made it obvious that supported_nvidia_libs.py is missing all of its dependencies: cublas, cublasLt, nvshmem_host, nccl
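One way to tell the "driver not installed" case apart from a genuine load failure is to probe for libcuda.so.1 directly. This is only a minimal sketch: `can_dlopen` is a hypothetical helper name, not something that exists in cuda.pathfinder, and whether the test should skip or fail in that situation is a separate decision.

```python
import ctypes


def can_dlopen(soname: str) -> bool:
    """Return True if the dynamic loader can open `soname`, else False.

    Hypothetical helper for illustration; not part of cuda.pathfinder.
    """
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the library (or one of its own dependencies)
        # cannot be found or loaded.
        return False
    return True


if __name__ == "__main__":
    # libcuda.so.1 is installed with the NVIDIA driver, not with the wheels.
    print("libcuda.so.1 loadable:", can_dlopen("libcuda.so.1"))
```

A test could call such a probe first and report a missing driver separately from a missing wheel dependency.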

The dependency on libcuda.so.1 is unusual; I'm not sure whether we want to do anything about it. The other dependencies, however, should be added to supported_nvidia_libs.py.
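To make the request concrete, here is a rough sketch of what recording these dependencies could look like and how a loader might use them. The table name `EXAMPLE_DEPENDENCIES` and the helper `load_order` are assumptions for illustration; the actual layout of supported_nvidia_libs.py may differ. Also note that ldd reports transitive dependencies too, so which of the four are direct dependencies of cublasmp is not established by this issue.

```python
# Hypothetical table for illustration only. The four names are the ones
# reported by ldd in this issue; ldd does not distinguish direct from
# transitive dependencies.
EXAMPLE_DEPENDENCIES = {
    "cublasmp": ("cublas", "cublasLt", "nvshmem_host", "nccl"),
}


def load_order(libname, deps):
    """Return libname and its dependencies, dependencies first.

    Depth-first post-order walk, so each library appears after
    everything it depends on.
    """
    order = []
    seen = set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in deps.get(name, ()):
            visit(dep)
        order.append(name)

    visit(libname)
    return order


if __name__ == "__main__":
    print(load_order("cublasmp", EXAMPLE_DEPENDENCIES))
```

With a table like this, loading cublasmp would first load the four listed libraries in order, then cublasmp itself.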

mgx-c2g2-pvt-66.cl1u1.colossus.nvidia.com:/wrk/forked/cuda-python/cuda_pathfinder $ ldd /wrk/forked/cuda-python/cuda_pathfinder/.pixi/envs/cu12-linux-aarch64/lib/python3.14/site-packages/nvidia/cublasmp/cu12/lib/libcublasmp.so.0
        linux-vdso.so.1 (0x0000fb3f18b6a000)
        libcuda.so.1 => /lib/aarch64-linux-gnu/libcuda.so.1 (0x0000fb3f11a00000)
        libcublas.so.12 => /wrk/forked/cuda-python/cuda_pathfinder/.pixi/envs/cu12-linux-aarch64/lib/python3.14/site-packages/nvidia/cublasmp/cu12/lib/../../../cublas/lib/libcublas.so.12 (0x0000fb3f0b600000)
        libcublasLt.so.12 => /wrk/forked/cuda-python/cuda_pathfinder/.pixi/envs/cu12-linux-aarch64/lib/python3.14/site-packages/nvidia/cublasmp/cu12/lib/../../../cublas/lib/libcublasLt.so.12 (0x0000fb3edb800000)
        libnvshmem_host.so.3 => /wrk/forked/cuda-python/cuda_pathfinder/.pixi/envs/cu12-linux-aarch64/lib/python3.14/site-packages/nvidia/cublasmp/cu12/lib/../../../nvshmem/lib/libnvshmem_host.so.3 (0x0000fb3ed2000000)
        libnccl.so.2 => /wrk/forked/cuda-python/cuda_pathfinder/.pixi/envs/cu12-linux-aarch64/lib/python3.14/site-packages/nvidia/cublasmp/cu12/lib/../../../nccl/lib/libnccl.so.2 (0x0000fb3ebb800000)
        librt.so.1 => /lib/aarch64-linux-gnu/librt.so.1 (0x0000fb3f18af0000)
        libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000fb3f18ac0000)
        libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x0000fb3f18a90000)
        libstdc++.so.6 => /lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000fb3ebb400000)
        libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000fb3f17750000)
        libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000fb3f18a50000)
        libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000fb3ebb240000)
        /lib/ld-linux-aarch64.so.1 (0x0000fb3f18b2d000)
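The ldd output above can be reduced to the relevant entries mechanically. The sketch below keeps only libraries that resolve into a `site-packages/nvidia/` directory; the function name `site_packages_nvidia_deps` and the shortened `/env/...` paths in the sample are made up for illustration. libcuda.so.1 is filtered out because it resolves to a system path.

```python
import re

# Matches lines of the form: "libfoo.so.1 => /path/to/libfoo.so.1 (0x...)"
LDD_LINE = re.compile(
    r"^\s*(?P<soname>\S+)\s+=>\s+(?P<path>\S+)\s+\(0x[0-9a-f]+\)\s*$"
)


def site_packages_nvidia_deps(ldd_output: str) -> list[str]:
    """Return sonames that ldd resolved inside site-packages/nvidia/."""
    deps = []
    for line in ldd_output.splitlines():
        match = LDD_LINE.match(line)
        if match and "/site-packages/nvidia/" in match.group("path"):
            deps.append(match.group("soname"))
    return deps


# Abbreviated sample with hypothetical, shortened paths.
SAMPLE = """\
        linux-vdso.so.1 (0x0000fb3f18b6a000)
        libcuda.so.1 => /lib/aarch64-linux-gnu/libcuda.so.1 (0x0000fb3f11a00000)
        libcublas.so.12 => /env/site-packages/nvidia/cublas/lib/libcublas.so.12 (0x0000fb3f0b600000)
        libnccl.so.2 => /env/site-packages/nvidia/nccl/lib/libnccl.so.2 (0x0000fb3ebb800000)
        libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000fb3ebb240000)
"""

if __name__ == "__main__":
    print(site_packages_nvidia_deps(SAMPLE))
```

Running the same filter over each supported library would be one way to catch other missing entries.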

Labels

bug (Something isn't working), cuda.pathfinder (Everything related to the cuda.pathfinder module)
