Avoid referencing the current device in LaunchConfig #704

@leofang

Description

This is strictly speaking not 100% safe:

    # FIXME: Calling Device() strictly speaking is not quite right; we should instead
    # look up the device from stream. We probably need to defer the checks related to
    # device compute capability or attributes.
    # thread block clusters are supported starting H100
    if self.cluster is not None:
        if not _use_ex:
            err, drvers = driver.cuDriverGetVersion()
            drvers_fmt = f" (got driver version {drvers})" if err == driver.CUresult.CUDA_SUCCESS else ""
            raise CUDAError(f"thread block clusters require cuda.bindings & driver 11.8+{drvers_fmt}")
        cc = Device().compute_capability
        if cc < (9, 0):
            raise CUDAError(
                f"thread block clusters are not supported on devices with compute capability < 9.0 (got {cc})"
            )
        self.cluster = cast_to_3_tuple("LaunchConfig.cluster", self.cluster)
    if self.shmem_size is None:
        self.shmem_size = 0
    if self.cooperative_launch and not Device().properties.cooperative_launch:
        raise CUDAError("cooperative kernels are not supported on this device")

because the current device can change after a launch config is created and before it is used to launch a kernel (possibly on another device). In that case the checks above would have been run against the wrong device.
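One possible direction, following the FIXME's suggestion to look up the device from the stream, is to keep only device-independent work in the config and run the device-dependent checks at launch time. The following is a minimal, self-contained sketch of that idea. It does not use cuda.core: `FakeDevice`, `FakeStream`, `DeferredLaunchConfig`, `check_device`, and `launch` are all hypothetical names made up for illustration, and `RuntimeError` stands in for `CUDAError`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FakeDevice:
    """Hypothetical stand-in for a GPU; NOT the cuda.core Device class."""
    device_id: int
    compute_capability: Tuple[int, int]
    cooperative_launch: bool


@dataclass(frozen=True)
class FakeStream:
    """Hypothetical stream that remembers the device it belongs to."""
    device: FakeDevice


@dataclass
class DeferredLaunchConfig:
    """Hypothetical config whose constructor never touches the current device."""
    grid: Tuple[int, int, int]
    block: Tuple[int, int, int]
    cluster: Optional[Tuple[int, int, int]] = None
    shmem_size: Optional[int] = None
    cooperative_launch: bool = False

    def __post_init__(self):
        # Only device-independent normalization happens at construction time.
        if self.shmem_size is None:
            self.shmem_size = 0

    def check_device(self, device: FakeDevice) -> None:
        """Run the device-dependent checks against an explicitly given device."""
        if self.cluster is not None and device.compute_capability < (9, 0):
            raise RuntimeError(
                "thread block clusters are not supported on devices with "
                f"compute capability < 9.0 (got {device.compute_capability})"
            )
        if self.cooperative_launch and not device.cooperative_launch:
            raise RuntimeError("cooperative kernels are not supported on this device")


def launch(stream: FakeStream, config: DeferredLaunchConfig) -> int:
    """Hypothetical launch entry point: validate against the stream's device."""
    config.check_device(stream.device)
    # Placeholder for the real kernel launch; return the device that was used.
    return stream.device.device_id


# Usage: the same config is accepted or rejected depending on the launch device.
h100_like = FakeDevice(device_id=0, compute_capability=(9, 0), cooperative_launch=True)
a100_like = FakeDevice(device_id=1, compute_capability=(8, 0), cooperative_launch=True)
config = DeferredLaunchConfig(grid=(4, 1, 1), block=(128, 1, 1), cluster=(2, 1, 1))
launch(FakeStream(h100_like), config)  # passes the cluster check
```

With this shape, creating the config has no dependence on whichever device is current at that moment. The trade-off is that an unsupported configuration is reported at launch rather than at construction.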

Metadata

    Labels

    P1 (Medium priority - Should do), cuda.core (Everything related to the cuda.core module), enhancement (Any code-related improvements)

    Projects

    Status

    Todo
