Removing alloca on GPU #618

DiamonDinoia · 2025-01-30T20:53:28Z

@paquiteau and @Lenoush have noticed that alloca made things slower in their benchmarks while greatly reducing memory consumption.

Details are in #570 and mind-inria/mri-nufft-benchmark#5

Instead of using opts.gpu_* to switch back to the old implementation, it is better to use kernel dispatch and have pre-compiled kernels for the various scenarios, as the CPU code does. That is one less parameter for the user to worry about, and it can achieve both higher performance and low memory consumption at the same time.
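To make the idea concrete, here is a minimal host-side C++17 sketch of width dispatch. It is not the code in this branch: `sum_weights`, `sum_weights_runtime`, `dispatch_width`, the width range 2..16, and the placeholder weight formula are all hypothetical names and values chosen for illustration. The point is only that once the kernel width is a template parameter, the per-point buffer becomes a fixed-size array, so no alloca-style runtime-sized allocation is needed.

```cpp
#include <array>
#include <stdexcept>
#include <vector>

constexpr int kMinWidth = 2;   // assumed smallest supported kernel width
constexpr int kMaxWidth = 16;  // assumed largest supported kernel width

// Compile-time width: the weight buffer is a fixed-size array, so the
// compiler knows its size and no runtime-sized stack allocation is needed.
template <int W>
float sum_weights(float x) {
    std::array<float, W> ker{};
    for (int i = 0; i < W; ++i) {
        ker[i] = x * static_cast<float>(i + 1);  // placeholder weight, not the real kernel
    }
    float acc = 0.0f;
    for (float v : ker) acc += v;
    return acc;
}

// Runtime-width reference, standing in for the alloca-style variant.
float sum_weights_runtime(int w, float x) {
    std::vector<float> ker(w);
    for (int i = 0; i < w; ++i) {
        ker[i] = x * static_cast<float>(i + 1);
    }
    float acc = 0.0f;
    for (float v : ker) acc += v;
    return acc;
}

// Walk the supported widths at compile time and call the instantiation
// that matches the runtime value; every variant is pre-compiled.
template <int W = kMinWidth>
float dispatch_width(int w, float x) {
    if constexpr (W > kMaxWidth) {
        throw std::invalid_argument("unsupported kernel width");
    } else {
        if (w == W) return sum_weights<W>(x);
        return dispatch_width<W + 1>(w, x);
    }
}
```

Calling `dispatch_width(4, 0.5f)` selects `sum_weights<4>` and returns the same value as `sum_weights_runtime(4, 0.5f)`; a width outside the assumed range throws `std::invalid_argument`. On the GPU the same pattern would wrap a templated kernel launch, at the cost of compiling one kernel per supported width.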

@paquiteau, @Lenoush, can you benchmark this branch and let us know how it fares? I could not measure a meaningful difference with my custom code.

paquiteau · 2025-01-31T08:58:01Z

Hello @DiamonDinoia! Interesting stuff :)
I will have a look in the coming days, with @chaithyagr as well.

PS: @Lenoush's contract ended, so she does not work on NUFFTs anymore.

DiamonDinoia added 4 commits January 30, 2025 14:39

removing alloca in 1D

ba53244

Applied to 2D and 3D

7ea80c5

re-added comments

51c8f0a

using templates in kernel

d555578

DiamonDinoia requested a review from blackwer January 30, 2025 20:53

DiamonDinoia added 2 commits January 30, 2025 15:59

cleanup

00fa73e

updated changelog

d7f1f85

DiamonDinoia changed the title ~~Removing alloca to GPU~~ Removing alloca on GPU Jan 30, 2025

DiamonDinoia requested a review from janden January 30, 2025 21:02

shmem should be set once

5dc2058
