Draft: Matrix Transpose Lesson 11 for Kernel #40

rchen20 · 2025-06-30T22:58:53Z

No description provided.

artv3 · 2025-07-01T00:09:39Z

Intro_Tutorial/lessons/11_raja_device_kernel/11_raja_transpose_kernel.cpp

+  RAJA::View<double, RAJA::Layout<DIM>> A_t(a_t, M, N);
+
+  RAJA::TypedRangeSegment<int> row_range(0, N);
+  RAJA::TypedRangeSegment<int> col_range(0, M);


Where is the TODO part of the exercise?

artv3 · 2025-07-01T16:16:36Z

Intro_Tutorial/lessons/11_raja_device_kernel/solution/11_raja_transpose_kernel_solution.cpp

+      RAJA::KernelPolicy<
+        RAJA::statement::CudaKernel<
+          RAJA::statement::For<1, RAJA::cuda_thread_y_loop,
+	          RAJA::statement::For<0, RAJA::cuda_thread_x_loop,


We should coordinate with @johnbowen42 to make sure we use the same policies.

rhornung67 · 2025-07-01

I also think that it would be useful to explain the basic differences between 'loop' and 'direct' policies. We've had issues in the past when users apply loop policies following our basic examples and see performance that is less than expected. I would recommend not using loop policies for an introduction.

artv3 · 2025-07-01

I think we should use the global policies here, and part of the slides could be an overview of the different options. I was planning to include that in my slide deck.

rhornung67 · 2025-07-01

I like that idea.
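For readers following the 'loop' versus 'direct' discussion, here is a minimal sketch of the distinction as it is commonly described: a direct mapping gives each thread at most one iteration, while a loop mapping lets each thread stride through the range. It is a plain C++ emulation, not RAJA code, and every function name in it is hypothetical. It also hints at one plausible contributor to the lower-than-expected performance (an assumption, not something stated in this thread): with few threads, a loop mapping makes each thread run many iterations one after another.

```cpp
#include <cassert>
#include <vector>

// Plain C++ emulation; no RAJA or CUDA is used. All function names are
// hypothetical and exist only for this illustration.

// "direct"-style mapping: thread t handles index t and nothing else.
// If n exceeds num_threads, the trailing indices are never visited.
std::vector<int> visits_direct(int n, int num_threads) {
  assert(n >= 0 && num_threads > 0);
  std::vector<int> visits(n, 0);
  for (int t = 0; t < num_threads; ++t) {
    if (t < n) {
      visits[t] += 1;
    }
  }
  return visits;
}

// "loop"-style mapping: thread t starts at index t and advances by
// num_threads, so every index is visited once for any thread count.
std::vector<int> visits_loop(int n, int num_threads) {
  assert(n >= 0 && num_threads > 0);
  std::vector<int> visits(n, 0);
  for (int t = 0; t < num_threads; ++t) {
    for (int i = t; i < n; i += num_threads) {
      visits[i] += 1;
    }
  }
  return visits;
}

// Iterations executed one after another by the busiest thread (thread 0)
// under the loop-style mapping.
int serial_iterations_loop(int n, int num_threads) {
  assert(n >= 0 && num_threads > 0);
  return (n + num_threads - 1) / num_threads;
}
```

In this emulation the direct mapping only covers the range when there are at least as many threads as iterations, which is why it is usually paired with a launch sized to the range. The loop mapping is always correct but trades parallelism for per-thread serial work when the thread count is small.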

artv3 · 2025-07-01T16:17:00Z

We need a companion README in this PR.

artv3 · 2025-07-07T20:16:34Z

Intro_Tutorial/lessons/11_raja_device_kernel/11_raja_transpose_kernel.cpp

+      RAJA::KernelPolicy<
+        RAJA::statement::CudaKernel<
+          RAJA::statement::For<1, RAJA::cuda_global_size_y_loop<8>,
+	          RAJA::statement::For<0, RAJA::cuda_global_size_x_direct<32>,


Do you think we will have to explain the block sizes at the tutorial?
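As background for the block-size question, the template arguments `32` and `8` in the snippet above appear to be compile-time block sizes in x and y. Under that reading, one block holds 32 x 8 = 256 threads, and the number of blocks follows from a ceiling division of each extent by its block size. The sketch below is plain C++ with hypothetical names and made-up extents; it assumes a direct mapping in both dimensions, whereas the y policy in the snippet is a loop variant, for which the grid size may be chosen differently.

```cpp
#include <cassert>

// Hypothetical helper types and functions for illustration only; they are
// not part of RAJA or CUDA.
struct Dim2 {
  int x;
  int y;
};

// Smallest number of blocks of size `block` that covers `extent` items.
constexpr int ceil_div(int extent, int block) {
  return (extent + block - 1) / block;
}

// Grid dimensions needed so that block_x-by-block_y thread blocks cover
// an extent_x-by-extent_y iteration space with one thread per iteration.
Dim2 grid_for(int extent_x, int extent_y, int block_x, int block_y) {
  assert(extent_x >= 0 && extent_y >= 0);
  assert(block_x > 0 && block_y > 0);
  return Dim2{ceil_div(extent_x, block_x), ceil_div(extent_y, block_y)};
}

// Total threads in one block.
constexpr int threads_per_block(int block_x, int block_y) {
  return block_x * block_y;
}
```

For example, a hypothetical 1000 x 500 iteration space with 32 x 8 blocks would need `grid_for(1000, 500, 32, 8)` blocks. The last block in each dimension is only partly used, which is why direct mappings guard each thread with a bounds check.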

artv3 · 2025-07-12T20:07:10Z

I think we can close this PR; we have decided to leave this lesson out of the intro tutorial.

artv3 · 2025-07-16T23:08:56Z

No longer needed

Preliminary kernel transpose for lesson 11.

f76c892

rchen20 requested review from artv3, johnbowen42 and kab163 June 30, 2025 22:58

artv3 reviewed Jul 1, 2025

Use global policies.

7198498

artv3 reviewed Jul 7, 2025

artv3 closed this Jul 16, 2025

4 participants
