Unable to run CUDA on HPC cluster #3004

@utkarsh530

Describe the bug

I am trying to use CUDA.jl on MIT Supercloud, but I cannot get it to work with the local CUDA toolkit, even after following the steps at https://cuda.juliagpu.org/stable/installation/overview/#Precompiling-CUDA.jl-without-CUDA (setting the runtime version to 12.6 and selecting the local installation).
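
For reference, a minimal sketch of how the runtime 12.6 / local installation setup is usually selected. This is an assumption based on the linked installation docs and the preferences listed under "Version info" below; the report does not show the exact call that was used.

```julia
using CUDA

# Assumption: reconstructs the configuration described in this report
# (CUDA_Runtime_jll.version = 12.6, CUDA_Runtime_jll.local = true).
# Pins the runtime version and tells CUDA.jl to use the system toolkit
# instead of downloading artifacts. The choice is stored as preferences
# in LocalPreferences.toml of the active environment.
CUDA.set_runtime_version!(v"12.6"; local_toolkit=true)

# Preferences are read at load time, so restart Julia afterwards and
# check the result with:
#   CUDA.versioninfo()
```

After a restart, the "Preferences" section of CUDA.versioninfo() should list the same two entries shown later in this report.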

To reproduce

The Minimal Working Example (MWE) for this bug:

using CUDA
cu(rand(100))
julia> cu(zeros(100))
ERROR: CUDA error: unknown error (code 999, ERROR_UNKNOWN)
Stacktrace:
  [1] throw_api_error(res::CUDA.cudaError_enum)
    @ CUDA ~/.julia/packages/CUDA/x8d2s/lib/cudadrv/libcuda.jl:30
  [2] check
    @ ~/.julia/packages/CUDA/x8d2s/lib/cudadrv/libcuda.jl:37 [inlined]
  [3] cuDevicePrimaryCtxRetain
    @ ~/.julia/packages/GPUToolbox/JLBB1/src/ccalls.jl:33 [inlined]
  [4] CuContext(pctx::CuPrimaryContext)
    @ CUDA ~/.julia/packages/CUDA/x8d2s/lib/cudadrv/context.jl:197
  [5] context(dev::CuDevice)
    @ CUDA ~/.julia/packages/CUDA/x8d2s/lib/cudadrv/state.jl:238
  [6] TaskLocalState (repeats 2 times)
    @ ~/.julia/packages/CUDA/x8d2s/lib/cudadrv/state.jl:50 [inlined]
  [7] task_local_state!()
    @ CUDA ~/.julia/packages/CUDA/x8d2s/lib/cudadrv/state.jl:79
  [8] device
    @ ~/.julia/packages/CUDA/x8d2s/lib/cudadrv/state.jl:189 [inlined]
  [9] CuArray{Float32, 1, CUDA.DeviceMemory}(::UndefInitializer, dims::Tuple{Int64})
    @ CUDA ~/.julia/packages/CUDA/x8d2s/src/array.jl:91
 [10] CuArray
    @ ~/.julia/packages/CUDA/x8d2s/src/array.jl:437 [inlined]
 [11] adapt_storage(::CUDA.CuArrayKernelAdaptor{CUDA.DeviceMemory}, xs::Vector{Float64})
    @ CUDA ~/.julia/packages/CUDA/x8d2s/src/array.jl:753
 [12] adapt_structure
    @ ~/.julia/packages/Adapt/2UZ81/src/Adapt.jl:57 [inlined]
 [13] adapt
    @ ~/.julia/packages/Adapt/2UZ81/src/Adapt.jl:40 [inlined]
 [14] #cu#1174
    @ ~/.julia/packages/CUDA/x8d2s/src/array.jl:818 [inlined]
 [15] cu(xs::Vector{Float64})
    @ CUDA ~/.julia/packages/CUDA/x8d2s/src/array.jl:805
 [16] top-level scope
    @ REPL[5]:1

Expected behavior

cu(rand(100)) should return a CuArray on the GPU without raising an error.

Version info

Details on Julia:

julia> versioninfo()
Julia Version 1.11.8
Commit cf1da5e20e3 (2025-11-06 17:49 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 80 × Intel(R) Xeon(R) Gold 6248 CPU @ 2.50GHz
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, cascadelake)
Threads: 2 default, 0 interactive, 1 GC (on 80 virtual cores)
Environment:
  JULIA_CUDA_USE_BINARYBUILDER = false
  LD_LIBRARY_PATH = /usr/local/pkg/cuda/cuda-12.6/lib64:/usr/local/pkg/cuda/cuda-12.6/cuda/lib64
  JULIA_REVISE_POLL = 1

Details on CUDA:

julia> CUDA.versioninfo()
CUDA toolchain: 
- runtime 12.6, local installation
- driver 580.95.5 for 13.1
- compiler 12.9

CUDA libraries: 
- CUBLAS: 12.6.0
- CURAND: 10.3.7
- CUFFT: 11.2.6
- CUSOLVER: 11.6.4
- CUSPARSE: 12.5.2
- CUPTI: 2024.3.0 (API 12.6.0)
- NVML: 13.0.0+580.95.5

Julia packages: 
- CUDA: 5.9.5
- CUDA_Driver_jll: 13.1.0+0
- CUDA_Compiler_jll: 0.3.0+1
- CUDA_Runtime_jll: 0.19.2+0
- CUDA_Runtime_Discovery: 1.0.0

Toolchain:
- Julia: 1.11.8
- LLVM: 16.0.6

Environment:
- JULIA_CUDA_USE_BINARYBUILDER: false

Preferences:
- CUDA_Runtime_jll.version: 12.6
- CUDA_Runtime_jll.local: true

1 device:
  0: Tesla V100-PCIE-32GB (sm_70, 31.729 GiB / 32.000 GiB available)

Labels: bug
