RFC: Implement access::components higher-order range mapper #193
RFC, based on #182.
A common parallelization pattern, e.g. in dense matrix-vector products, is to map 1D thread-ids to the rows of a 2D matrix while iterating over all columns on each item. This currently requires a custom range mapper:
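As a rough illustration of what such a custom mapper computes, here is a minimal self-contained sketch. The `extent`, `chunk` and `subrange` types and the function name `rows_all_columns` are hypothetical stand-ins written for this example, not the runtime's actual definitions; the only assumption carried over from the text is that buffer dimension 0 holds the rows and dimension 1 the columns.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in types for illustration only (not the runtime's real ones).
template <int Dims>
using extent = std::array<std::size_t, Dims>;

template <int Dims>
struct chunk {
    extent<Dims> offset;
    extent<Dims> range;
};

template <int Dims>
struct subrange {
    extent<Dims> offset;
    extent<Dims> range;
};

// Custom mapper sketch: the rows follow the 1D kernel chunk, while the
// columns always cover the full second dimension of the buffer.
inline subrange<2> rows_all_columns(const chunk<1>& chnk, const extent<2>& buffer_size) {
    subrange<2> sr{};
    sr.offset = {chnk.offset[0], 0};
    sr.range = {chnk.range[0], buffer_size[1]};
    return sr;
}
```

With a kernel chunk covering work items 4 to 11 and a 16x32 buffer, this sketch yields rows 4 to 11 and all 32 columns.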
While we could just have `access::rows` and `access::columns` convenience range mappers, with all the confusion about row-major vs column-major vs fastest-dimensions that would entail, I have come up with a generic solution for the entire class of these "component mappings".

Let me introduce the first higher-order range mapper, `access::components`. It constructs a `(chunk<KernelDims>, range<BufferDims>) -> subrange<BufferDims>` range mapper from `BufferDims` individual mappers of type `(chunk<KernelDim>, range<1>) -> subrange<1>`.

Together with the new, straight-forward mapper `access::kernel_dim` that creates a `subrange<1>` from a single kernel dimension, this allows us to re-write the custom range mapper above like so:

Note that any other range mapper that produces `subrange<1>` can be used for each component, such as `fixed` or `neighborhood`.
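To make the composition idea concrete, the following C++17 sketch shows one way a component-wise higher-order mapper could be assembled from per-dimension mappers. Everything in it is a hypothetical stand-in written for this example: the `extent`, `chunk` and `subrange` types, the local `components` and `kernel_dim` classes (which only mimic the shape described above, not the actual `access::` implementations), and the helper `all_of_dim`, which is assumed here as a 1D mapper selecting a whole buffer dimension.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>

// Hypothetical stand-in types for illustration only.
template <int Dims>
using extent = std::array<std::size_t, Dims>;

template <int Dims>
struct chunk { extent<Dims> offset; extent<Dims> range; };

template <int Dims>
struct subrange { extent<Dims> offset; extent<Dims> range; };

// 1D mapper sketch: forwards a single kernel dimension as a subrange<1>.
struct kernel_dim {
    std::size_t dim;
    template <int KernelDims>
    subrange<1> operator()(const chunk<KernelDims>& chnk, const extent<1>&) const {
        subrange<1> sr{};
        sr.offset = {chnk.offset[dim]};
        sr.range = {chnk.range[dim]};
        return sr;
    }
};

// 1D mapper sketch (assumed helper): selects the entire buffer dimension.
struct all_of_dim {
    template <int KernelDims>
    subrange<1> operator()(const chunk<KernelDims>&, const extent<1>& buffer_size) const {
        subrange<1> sr{};
        sr.offset = {0};
        sr.range = buffer_size;
        return sr;
    }
};

// Higher-order mapper sketch: one 1D mapper per buffer dimension.
template <typename... Mappers>
class components {
  public:
    static constexpr int buffer_dims = static_cast<int>(sizeof...(Mappers));

    explicit components(Mappers... mappers) : m_mappers(std::move(mappers)...) {}

    template <int KernelDims>
    subrange<buffer_dims> operator()(const chunk<KernelDims>& chnk, const extent<buffer_dims>& buffer_size) const {
        return map(chnk, buffer_size, std::index_sequence_for<Mappers...>{});
    }

  private:
    // Applies mapper I to buffer dimension I and writes the result into component I.
    template <int KernelDims, std::size_t... Is>
    subrange<buffer_dims> map(const chunk<KernelDims>& chnk, const extent<buffer_dims>& buffer_size, std::index_sequence<Is...>) const {
        subrange<buffer_dims> result{};
        (set_component(result, Is, std::get<Is>(m_mappers)(chnk, extent<1>{buffer_size[Is]})), ...);
        return result;
    }

    static void set_component(subrange<buffer_dims>& out, std::size_t dim, const subrange<1>& in) {
        out.offset[dim] = in.offset[0];
        out.range[dim] = in.range[0];
    }

    std::tuple<Mappers...> m_mappers;
};

// Rows follow kernel dimension 0, columns cover the whole buffer dimension 1.
inline subrange<2> map_rows_example(const chunk<1>& chnk, const extent<2>& buffer_size) {
    const components mapper{kernel_dim{0}, all_of_dim{}};
    return mapper(chnk, buffer_size);
}
```

In this sketch each component mapper only ever sees the extent of its own buffer dimension, which is what allows any mapper producing a `subrange<1>` to be slotted in per component.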