Skip to content

E: Load and store callbacks for GPU plans #8

@davebayer

Description

@davebayer

Enable load and store callbacks for GPU plans. It shall be done using real-time compilation of given source code. The source shall contain a specially named function with predefined signature copying the concept of cuFFT callbacks and optionally a afft_CallbackData structure with user's data.

Example of potential load and store callback signatures for CUDA gpu backend:

extern "C"
{
  struct afft_CallbackData // optional, may be undefined
  {
    // ...
  };

  __device__ cuComplex afft_loadComplexF32Interleaved(const cuComplex*   src,
                                                      size_t             offset,
                                                      afft_CallbackData* callbackData,
                                                      void*              sharedMem)
  {
    return src[offset];
  }

  __device__ cuComplex afft_loadComplexF32Planar(const float*       srcReal,
                                                 const float*       srcImag,
                                                 size_t             offset,
                                                 afft_CallbackData* callbackData,
                                                 void*              sharedMem)
  {
    cuComplex value;

    value.x = srcReal[offset];
    value.y = srcImag[offset];

    return value;
  }

  __device__ void afft_storeComplexF32Interleaved(cuComplex*         dst,
                                                  size_t             offset,
                                                  cuComplex          value,
                                                  afft_CallbackData* callbackData,
                                                  void*              sharedMem)
  {
    dst[offset] = value;
  }

  __device__ void afft_storeComplexF32Planar(float*             dstReal,
                                             float*             dstImag,
                                             size_t             offset,
                                             cuComplex          value,
                                             afft_CallbackData* callbackData,
                                             void*              sharedMem)
  {
    dstReal[offset] = value.x;
    dstReal[offset] = value.y;
  }
}

The callback code shall be passed to the plan inside afft::[distrib]::gpu::Parameters like:

namespace afft::spst::gpu
{
  struct Parameters
  {
    // ...

    afft::gpu::Callback loadCallback{};
    afft::gpu::Callback storeCallback{};
  };
} // namespace afft::spst::gpu

via afft::gpu::Callback structure:

namespace afft::gpu
{
  struct Callback
  {
    std::string_view code{};
    void*            callbackData{};
    std::size_t      sharedMemSize{};
  };
} // namespace afft::gpu

Metadata

Metadata

Assignees

Labels

enhancementNew feature or request

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions