EPIC: Support TMA descriptor #199

@leofang

Description

Initializing a TMA descriptor through the driver APIs (https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__TENSOR__MEMORY.html) is really tedious and error-prone. We need a way to abstract it, which aligns well with the mission of cuda.core. It would also make it easier for JIT compilers to consume TMA descriptors and incorporate them into their compilation pipelines.

In my understanding there are two (implicit?) requirements for this to be useful:

  1. Creating/initializing a TMA object on host
  2. Passing the object to the cuda.core.launch() API as a kernel arg
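
To make the two requirements a bit more concrete, here is a rough host-side sketch in plain Python. Everything in it is hypothetical: `TiledTensorMapSpec` and `OpaqueTensorMap` are illustrative names, not existing cuda.core classes, and nothing here calls the driver. It only mirrors the 128-byte opaque layout of `CUtensorMap` and checks a subset of the argument constraints that, as I read the driver docs, `cuTensorMapEncodeTiled` imposes (rank at most 5, 16-byte-aligned address, box dimensions at most 256, strides in multiples of 16 bytes). A dense, innermost-first layout is assumed.

```python
import ctypes
from dataclasses import dataclass
from typing import Tuple


class OpaqueTensorMap(ctypes.Structure):
    """Hypothetical host-side mirror of CUtensorMap: 16 opaque 64-bit words.

    The driver additionally requires 64-byte alignment, which this plain
    ctypes structure does not enforce.
    """
    _fields_ = [("opaque", ctypes.c_uint64 * 16)]


@dataclass(frozen=True)
class TiledTensorMapSpec:
    """Hypothetical parameter bundle, validated before any driver call."""
    address: int            # device pointer of the tensor
    shape: Tuple[int, ...]  # global extent, innermost dimension first
    box: Tuple[int, ...]    # tile extent, innermost dimension first
    itemsize: int           # bytes per element

    def __post_init__(self):
        rank = len(self.shape)
        if not 1 <= rank <= 5:
            raise ValueError("rank must be between 1 and 5")
        if len(self.box) != rank:
            raise ValueError("box must have one entry per dimension")
        if any(extent < 1 for extent in self.shape):
            raise ValueError("each global dimension must be positive")
        if self.address % 16:
            raise ValueError("global address must be 16-byte aligned")
        if any(not 1 <= b <= 256 for b in self.box):
            raise ValueError("each box dimension must be in [1, 256]")
        if any(s % 16 for s in self.byte_strides()):
            raise ValueError("each global stride must be a multiple of 16 bytes")

    def byte_strides(self) -> Tuple[int, ...]:
        """Byte strides of dimensions 1..rank-1 for a densely packed tensor."""
        strides, step = [], self.itemsize
        for extent in self.shape[:-1]:
            step *= extent
            strides.append(step)
        return tuple(strides)


# A 64 x 32 float32 tensor (innermost extent 64) tiled into 32 x 8 boxes.
spec = TiledTensorMapSpec(address=0x1000, shape=(64, 32), box=(32, 8), itemsize=4)
print(spec.byte_strides(), ctypes.sizeof(OpaqueTensorMap))
```

The idea is that a real implementation would do this kind of validation up front, call the driver to fill in the opaque words, and then hand the resulting 128-byte object to cuda.core.launch() by value; how launch() should recognize and marshal such an argument is exactly the open question in requirement 2.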


Labels: P1 (Medium priority - Should do), cuda.core (Everything related to the cuda.core module), feature (New feature or request)
