CUBLAS_STATUS_EXECUTION_FAILED #124

Open
GiacomoDG96 opened this issue Mar 15, 2024 · 3 comments
Comments

@GiacomoDG96

Hi, I am trying to reproduce the example https://github.com/pyscf/gpu4pyscf/blob/master/examples/00-h2o.py with a benzene molecule instead of water. I get the same error when I run the https://github.com/pyscf/gpu4pyscf/blob/master/examples/07-transition_state.py example with the molecule defined in that file.

The error that I obtain is:
#########################################################################################
Traceback (most recent call last):
File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/df/df_jk.py", line 63, in init_workflow
rks.initialize_grids(mf, mf.mol, dm0)
File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/dft/rks.py", line 83, in initialize_grids
ks.grids = prune_small_rho_grids_(ks, ks.mol, dm, ks.grids)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/dft/rks.py", line 39, in prune_small_rho_grids_
rho = ks._numint.get_rho(mol, dm, grids, ks.max_memory)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/dft/numint.py", line 721, in get_rho
rho[p0:p1] = eval_rho2(mol, ao_mask, mo_coeff_mask, mo_occ, None, 'LDA', with_lapl)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/dft/numint.py", line 200, in eval_rho2
c0 = _dot_ao_dm(mol, ao, cpos, non0tab, shls_slice, ao_loc)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/dft/numint.py", line 1476, in _dot_ao_dm
return cupy.dot(dm.T, ao)
^^^^^^^^^^^^^^^^^^
File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/cupy/linalg/_product.py", line 63, in dot
return a.dot(b, out)
^^^^^^^^^^^^^
File "cupy/_core/core.pyx", line 1757, in cupy._core.core._ndarray_base.dot
File "cupy/_core/_routines_linalg.pyx", line 536, in cupy._core._routines_linalg.dot
File "cupy/_core/_routines_linalg.pyx", line 626, in cupy._core._routines_linalg.tensordot_core
File "cupy/_core/_routines_linalg.pyx", line 763, in cupy._core._routines_linalg.tensordot_core_v11
File "cupy_backends/cuda/libs/cublas.pyx", line 1426, in cupy_backends.cuda.libs.cublas.gemmEx
File "cupy_backends/cuda/libs/cublas.pyx", line 1454, in cupy_backends.cuda.libs.cublas.gemmEx
File "cupy_backends/cuda/libs/cublas.pyx", line 438, in cupy_backends.cuda.libs.cublas.check_status
cupy_backends.cuda.libs.cublas.CUBLASError: CUBLAS_STATUS_NOT_INITIALIZED

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/pyscf/lib/misc.py", line 1104, in __exit__
handler.result()
File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/concurrent/futures/_base.py", line 456, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/df/df_jk.py", line 43, in build_df
mf.with_df.build()
File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/df/df.py", line 90, in build
self._cderi = cholesky_eri_gpu(intopt, mol, auxmol, self.cd_low, omega=omega)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/df/df.py", line 265, in cholesky_eri_gpu
cderi_block = solve_triangular(cd_low, ints_slices, lower=True, overwrite_b=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/cupyx/scipy/linalg/_solve_triangular.py", line 97, in solve_triangular
trsm(
File "cupy_backends/cuda/libs/cublas.pyx", line 1109, in cupy_backends.cuda.libs.cublas.dtrsm
File "cupy_backends/cuda/libs/cublas.pyx", line 1119, in cupy_backends.cuda.libs.cublas.dtrsm
File "cupy_backends/cuda/libs/cublas.pyx", line 438, in cupy_backends.cuda.libs.cublas.check_status
cupy_backends.cuda.libs.cublas.CUBLASError: CUBLAS_STATUS_EXECUTION_FAILED

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/soralakers96/CODE/gpu4pyscf/gpu4pyscf/examples/07-transition_state.py", line 68, in <module>
mf_GPU.kernel()
File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/scf/hf.py", line 588, in scf
_kernel(mf, mf.conv_tol, mf.conv_tol_grad,
File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/scf/hf.py", line 404, in _kernel
mf.init_workflow(dm0=dm)
File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/gpu4pyscf/df/df_jk.py", line 56, in init_workflow
with lib.call_in_background(build_df) as build:
File "/home/soralakers96/anaconda3/envs/trail_actmol/lib/python3.12/site-packages/pyscf/lib/misc.py", line 1106, in __exit__
raise ThreadRuntimeError('Error on thread %s:\n%s' % (self, e))
pyscf.lib.misc.ThreadRuntimeError: Error on thread <pyscf.lib.misc.call_in_background object at 0x7f5772b63dd0>:
CUBLAS_STATUS_EXECUTION_FAILED
########################################################################################

I am using an NVIDIA L40 with the pre-compiled package, installed with pip3 install gpu4pyscf-cuda12x.

@wxj6000
Collaborator

wxj6000 commented Mar 15, 2024

It seems that CuPy can't find cuBLAS. Can you make sure the CUDA Toolkit is installed on your system? If it is, please check whether cupy.dot works properly.
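A minimal sketch of such a cupy.dot check, for anyone following along. The function name check_cupy_dot and the matrix size are illustrative assumptions, not part of gpu4pyscf or CuPy. It multiplies a scaled identity by a known matrix on the GPU, so the expected result is known without a CPU reference, and it reports the failure instead of raising.

```python
def check_cupy_dot(n=256):
    """Return (ok, message) describing whether cupy.dot works on this machine.

    Hypothetical helper: the name and the default size are assumptions.
    """
    try:
        import cupy
    except Exception as exc:  # CuPy missing or its CUDA libraries not loadable
        return False, f"CuPy could not be imported: {type(exc).__name__}: {exc}"

    try:
        # (2 * I) @ B must equal 2 * B, so no CPU reference is needed.
        a = 2.0 * cupy.eye(n, dtype=cupy.float64)
        b = cupy.arange(n * n, dtype=cupy.float64).reshape(n, n)
        c = cupy.dot(a, b)  # float64 matrix product, dispatched to cuBLAS
        ok = bool(cupy.allclose(c, 2.0 * b))
    except Exception as exc:  # e.g. CUBLASError or an out-of-memory error
        return False, f"cupy.dot failed: {type(exc).__name__}: {exc}"

    if ok:
        return True, "cupy.dot returned the expected result"
    return False, "cupy.dot ran but returned an unexpected result"


if __name__ == "__main__":
    ok, message = check_cupy_dot()
    print("OK" if ok else "FAILED", "-", message)
```

A passing check only shows that cuBLAS can be loaded and run in a fresh process; it does not rule out a failure that appears once most of the GPU memory is already in use.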

@GiacomoDG96
Author

The CUDA Toolkit is installed.
Running nvcc --version gives:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Nov__3_17:16:49_PDT_2023
Cuda compilation tools, release 12.3, V12.3.103
Build cuda_12.3.r12.3/compiler.33492891_0

I have also tried cupy.dot with a toy example and it works.
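nvcc reports the toolkit found on the PATH, which is not necessarily the CUDA runtime that CuPy loads. As a hedged sketch, the versions CuPy itself sees can be queried through cupy.cuda.runtime. The helper names decode_cuda_version and cupy_cuda_versions are illustrative assumptions.

```python
def decode_cuda_version(version):
    """Split CUDA's integer version (1000 * major + 10 * minor) into (major, minor)."""
    return version // 1000, (version % 1000) // 10


def cupy_cuda_versions():
    """Return the CUDA runtime and driver versions seen by CuPy, or None.

    Hypothetical helper; returns None if CuPy or a usable driver is missing.
    """
    try:
        import cupy
        runtime = cupy.cuda.runtime.runtimeGetVersion()
        driver = cupy.cuda.runtime.driverGetVersion()
    except Exception as exc:
        print(f"Could not query CUDA versions: {type(exc).__name__}: {exc}")
        return None
    return {
        "runtime": decode_cuda_version(runtime),
        "driver": decode_cuda_version(driver),
    }


if __name__ == "__main__":
    print(cupy_cuda_versions())
```

For the toolkit reported above, release 12.3 corresponds to the integer 12030, which decode_cuda_version turns into (12, 3).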

@wxj6000
Collaborator

wxj6000 commented Mar 18, 2024

@GiacomoDG96 OK, great.
Possibly the GPU doesn't have enough free memory left for the cuBLAS handle. Can you try limiting the CuPy memory pool?
https://docs.cupy.dev/en/stable/user_guide/memory.html#limiting-gpu-memory-usage
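A minimal sketch of the suggestion above, following the linked CuPy memory guide. The helper name limit_cupy_memory_pool and the 0.8 fraction are assumptions chosen for illustration, and whether this resolves the error in this issue has not been confirmed. The idea is to cap the default memory pool before the calculation starts, so that some device memory stays outside the pool.

```python
def limit_cupy_memory_pool(fraction=0.8):
    """Cap CuPy's default memory pool at a fraction of total device memory.

    Hypothetical helper; 0.8 is an arbitrary starting point, not a
    recommended value. Returns None if CuPy or a GPU is unavailable.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    try:
        import cupy
        # Free and total memory on the current device, in bytes.
        free_bytes, total_bytes = cupy.cuda.runtime.memGetInfo()
        pool = cupy.get_default_memory_pool()
        pool.set_limit(fraction=fraction)
        limit_bytes = pool.get_limit()
    except Exception as exc:
        print(f"Could not set the pool limit: {type(exc).__name__}: {exc}")
        return None
    return {
        "free_bytes": free_bytes,
        "total_bytes": total_bytes,
        "limit_bytes": limit_bytes,
    }


if __name__ == "__main__":
    # Call this before building the mean-field object and running kernel().
    print(limit_cupy_memory_pool(0.8))
```

The same guide also describes the CUPY_GPU_MEMORY_LIMIT environment variable (for example "50%"), which has to be set before CuPy is imported.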
