Fix policy ordering for CUDA and HIP policies #64
base: develop
@rchen20 @MrBurmark Can you help figure out what these policies should look like? There is definitely a copy-paste error making all policies behave the same, but @michaelmckinsey1 wasn't sure he was fixing this correctly. This also seems to be different from the kernel in RAJAPerf.
This does look like a copy-paste that was never fixed up. Looking at other policies, I can see that the layout is supposed to give the order of the loops, but that doesn't include the moment loop. Due to the tensor contraction nature of the kernel, I would think that the moment loop always has to use a sequential policy for correctness. @rchen20 do you know how these are supposed to map to the GPU?
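To illustrate the point about loop ordering and the contracted index, here is a minimal sketch in plain C++ (no RAJA). It assumes a generic contraction `out(i,g,z) += coeff(i,j) * in(j,g,z)`; the names `Dims`, `contract_igz`, `contract_zgi`, and the flat array layout are hypothetical and not taken from Kripke. The sketch makes no claim about which Kripke index plays the role of the summed index `j`; it only shows that the independent loops can be reordered freely as long as the summed loop stays sequential.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical problem sizes: i = output index, j = summed (contracted) index,
// g and z = indices that pass straight through the contraction.
struct Dims { std::size_t ni, nj, ng, nz; };

// Loop order i, g, z with the summed index j innermost.
std::vector<double> contract_igz(const Dims& n,
                                 const std::vector<double>& coeff,
                                 const std::vector<double>& in)
{
  std::vector<double> out(n.ni * n.ng * n.nz, 0.0);
  for (std::size_t i = 0; i < n.ni; ++i)
    for (std::size_t g = 0; g < n.ng; ++g)
      for (std::size_t z = 0; z < n.nz; ++z)
        for (std::size_t j = 0; j < n.nj; ++j)  // every j writes the same out element
          out[(i * n.ng + g) * n.nz + z] +=
              coeff[i * n.nj + j] * in[(j * n.ng + g) * n.nz + z];
  return out;
}

// Same contraction with the independent loops reordered to z, g, i.
// Only the traversal order of the independent indices changes.
std::vector<double> contract_zgi(const Dims& n,
                                 const std::vector<double>& coeff,
                                 const std::vector<double>& in)
{
  std::vector<double> out(n.ni * n.ng * n.nz, 0.0);
  for (std::size_t z = 0; z < n.nz; ++z)
    for (std::size_t g = 0; g < n.ng; ++g)
      for (std::size_t i = 0; i < n.ni; ++i)
        for (std::size_t j = 0; j < n.nj; ++j)  // still sequential per (i, g, z)
          out[(i * n.ng + g) * n.nz + z] +=
              coeff[i * n.nj + j] * in[(j * n.ng + g) * n.nz + z];
  return out;
}
```

In this sketch, mapping `i`, `g`, and `z` to parallel resources would be safe because each combination owns one output element, whereas running the `j` loop in parallel would need atomics or a reduction.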
I am not sure... let me show this to John Loffeld; he would know.
these changes are for testing
After discussion with John, there was a correct ordering some time before we added the HIP backend, and the error likely occurred as HIP policies were added. When I have some time, I'll go through commits and try to find the right one.
We noticed that some of the `CUDA` and `HIP` policies appear to be identical, despite attempting to define different layouts. It appears that all of these layouts currently execute the `GZD` policy.

Should the `ArgumentId` nesting match the ordering of the sequential policy, or does the `ExecPolicy` nesting also need to change?

There appears to be a similar issue in `LPlusTimes.h`, which I have not corrected here, and I have not checked the policies in other files.
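One way to catch this kind of copy-paste error at compile time is to assert that the policy types for different layouts are actually different types. The sketch below is only an illustration of that idea: `DemoLayout_DGZ`, `DemoLayout_GZD`, `LoopOrder`, `DemoPolicy`, and `distinct_policies` are hypothetical stand-ins, not Kripke's or RAJA's real types, and a real check would compare the actual `ExecPolicy` aliases.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical layout tags and loop-index tags (illustration only).
struct DemoLayout_DGZ {};
struct DemoLayout_GZD {};
struct Dir {};
struct Grp {};
struct Zone {};

// Stand-in for a nested execution policy: just records a loop order as a type.
template <typename... Loops>
struct LoopOrder {};

// One specialization per layout, as in a policy header.
template <typename Layout>
struct DemoPolicy;

template <>
struct DemoPolicy<DemoLayout_DGZ> {
  using ExecPolicy = LoopOrder<Dir, Grp, Zone>;
};

template <>
struct DemoPolicy<DemoLayout_GZD> {
  using ExecPolicy = LoopOrder<Grp, Zone, Dir>;
};

// True when two layouts map to different ExecPolicy types.
template <typename A, typename B>
constexpr bool distinct_policies() {
  return !std::is_same<typename DemoPolicy<A>::ExecPolicy,
                       typename DemoPolicy<B>::ExecPolicy>::value;
}

// Fails to compile if both specializations were pasted with the same body.
static_assert(distinct_policies<DemoLayout_DGZ, DemoLayout_GZD>(),
              "DGZ and GZD layouts must not share an ExecPolicy");
```

A check like this only detects identical types; it would not tell us which ordering is the correct one for each layout.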