Draft: Matrix Transpose Lesson 11 for Kernel #40
    RAJA::View<double, RAJA::Layout<DIM>> A_t(a_t, M, N);

    RAJA::TypedRangeSegment<int> row_range(0, N);
    RAJA::TypedRangeSegment<int> col_range(0, M);
Where is the TODO part of the exercise?
    RAJA::KernelPolicy<
      RAJA::statement::CudaKernel<
        RAJA::statement::For<1, RAJA::cuda_thread_y_loop,
          RAJA::statement::For<0, RAJA::cuda_thread_x_loop,
We should coordinate with @johnbowen42 to make sure we use the same policies.
I also think it would be useful to explain the basic differences between 'loop' and 'direct' policies. We've had issues in the past when users apply loop policies following our basic examples and see lower performance than expected. I recommend not using loop policies in an introduction.
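To make the 'loop' versus 'direct' distinction concrete, here is a minimal host-side sketch in plain C++ (no RAJA or CUDA required). It assumes, as I understand the RAJA policy descriptions, that a 'direct' thread policy assigns at most one iteration to each thread, while a 'loop' thread policy has each thread stride through the range by the block size. The function names and the single-block setup are hypothetical and only for illustration.

```cpp
#include <vector>

// Emulates one thread block of `block_size` threads covering N iterations.
// Each function returns how many times every iteration index was visited.

// 'direct'-style mapping (assumed): thread t handles iteration t only,
// so iterations at or beyond block_size are never visited.
std::vector<int> visits_direct(int N, int block_size) {
  std::vector<int> visits(N, 0);
  for (int t = 0; t < block_size; ++t) {
    if (t < N) visits[t] += 1;
  }
  return visits;
}

// 'loop'-style mapping (assumed): thread t handles t, t + block_size, ...
// All iterations are visited, but each thread may do several serially.
std::vector<int> visits_loop(int N, int block_size) {
  std::vector<int> visits(N, 0);
  for (int t = 0; t < block_size; ++t) {
    for (int i = t; i < N; i += block_size) visits[i] += 1;
  }
  return visits;
}

// Largest number of iterations a single thread performs under the
// 'loop'-style mapping: ceil(N / block_size).
int max_iterations_per_thread_loop(int N, int block_size) {
  return (N + block_size - 1) / block_size;
}
```

Under these assumptions, a thread 'loop' policy stays correct for any range length, but the extra work is serialized within each thread of a single block, which may be one source of the lower-than-expected performance mentioned above.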
I think we should use the global policies here, and part of the slides could be an overview of the different options. I was planning to include that in my slide deck.
I like that idea.
We need a companion README in this PR.
    RAJA::KernelPolicy<
      RAJA::statement::CudaKernel<
        RAJA::statement::For<1, RAJA::cuda_global_size_y_loop<8>,
          RAJA::statement::For<0, RAJA::cuda_global_size_x_direct<32>,
Do you think we will have to explain the block sizes at the tutorial?
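If we do explain the block sizes, a small arithmetic sketch might help. It assumes the template arguments `<32>` and `<8>` are the per-dimension block sizes, and that a 'direct' global policy needs enough blocks to give every iteration its own thread. The struct, the function names, and the `extent_x` parameter are hypothetical; how the grid is sized in the 'loop' dimension is left out because I am not certain of it.

```cpp
// Hypothetical summary of the launch shape implied by the block sizes.
struct LaunchShape {
  int threads_x;          // block size in x (32 in the policy above)
  int threads_y;          // block size in y (8 in the policy above)
  int threads_per_block;  // threads_x * threads_y
  int blocks_x;           // blocks needed in x for a 'direct' mapping
};

// Integer ceiling division for positive values.
inline int ceil_div(int n, int d) { return (n + d - 1) / d; }

// extent_x is the length of the range mapped to the x dimension (assumed).
inline LaunchShape launch_shape(int extent_x, int block_x = 32, int block_y = 8) {
  LaunchShape s;
  s.threads_x = block_x;
  s.threads_y = block_y;
  s.threads_per_block = block_x * block_y;
  s.blocks_x = ceil_div(extent_x, block_x);
  return s;
}
```

With the sizes in the policy above this gives 256 threads per block, and a range of 1000 iterations in x would need 32 blocks in that dimension under these assumptions.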
I think we can close this PR; we have decided to leave this lesson out of the intro tutorial.
No longer needed.