Bug Report: Off-by-One Indexing in cufinufft GPU Kernel #672
DiamonDinoia started this conversation in General
Issue
A long-standing bug in the `interval` function caused out-of-bounds access to CUDA kernel arrays during spreading/interpolation. It was very hard to trigger: it required a non-uniform point to fall on a grid point to floating-point accuracy, i.e., `x + ns/2` exactly an integer. In that case, the computed support window became `ns + 1` wide instead of `ns`, resulting in an illegal access to `ker[ns]`.
Before summer 2024, this was masked by kernel arrays padded to `MAX_NSPREAD = 16`, so out-of-bounds reads rarely affected anything unless `ns = 16`. After kernel arrays were resized to exactly `ns`, the bug became real, with invalid memory reads introducing small but nonzero garbage into the output. This mainly affected GPU `type 3` transforms at tighter tolerances (`eps = 1e-8`), especially with `upsampfac = 1.25`.
Cause
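To make the failure mode concrete, here is a minimal host-side sketch of the window arithmetic. It assumes the two ends of the window were rounded independently, which is one way to obtain the `ns + 1` width described above; the post does not show the original code, so `Window`, `interval_ceil_floor`, and `reads_past_array` are hypothetical names for illustration only.

```cpp
#include <cassert>
#include <cmath>

// Illustrative sketch only; names are hypothetical, not the cufinufft source.

// Inclusive range of grid indices [start, end] covered by one non-uniform point.
struct Window {
    int start;
    int end;
    int width() const { return end - start + 1; }
};

// Assumed form of the flawed logic: both ends are rounded independently,
// start = ceil(x - ns/2) and end = floor(x + ns/2).
inline Window interval_ceil_floor(int ns, double x) {
    assert(ns > 0);
    const double half = 0.5 * ns;
    return {static_cast<int>(std::ceil(x - half)),
            static_cast<int>(std::floor(x + half))};
}

// True when a loop over the window would index past a kernel array holding
// `capacity` values, i.e. when the largest offset (width - 1) is >= capacity.
inline bool reads_past_array(int ns, double x, int capacity) {
    return interval_ceil_floor(ns, x).width() - 1 >= capacity;
}
```

With `ns = 8`, a point at `x = 10.3` yields 8 indices, while `x = 10.0` yields 9. Against a 16-entry padded array the extra read stays in bounds, which matches the masking described above; against an array of exactly 8 entries, or with `ns = 16`, it does not.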
- `interval()` returns a window of length `ns + 1` when `x + ns/2` is an integer (float precision)
- Access to `ker[ns]`, out of bounds for `T ker[ns]`
- `ker3[0]`)
Fix
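One way to enforce a window of exactly `ns` points is to round only the left edge and derive the right edge from it. The sketch below is an assumption about how such a constraint could look, not the change made in the PR; `Window` and `interval_fixed_width` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// Illustrative sketch only; names are hypothetical, not the cufinufft source.

// Inclusive range of grid indices [start, end] covered by one non-uniform point.
struct Window {
    int start;
    int end;
    int width() const { return end - start + 1; }
};

// Assumed remedy: round only the left edge, then pin the right edge so the
// window always holds exactly ns indices.
inline Window interval_fixed_width(int ns, double x) {
    assert(ns > 0);
    const int start = static_cast<int>(std::ceil(x - 0.5 * ns));
    return {start, start + ns - 1};
}
```

For a point off the grid the result matches two-sided rounding (`x = 10.3`, `ns = 8` gives `[7, 14]`). On a grid point (`x = 10.0`) the window is `[6, 13]`, so the largest kernel offset is `ns - 1 = 7` and `ker[ns]` is never touched.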
- `interval()` logic to enforce window size = `ns`
- `ker[ns]` access
- `ceil()` fix for `nf` computation on GPU
Validation
- `type 3` transform errors resolved
- `kerevalmeth=1`: `1e-15` to `1e-10`
- `2.31e-10`
PR