Just realized a pad may be scheduled in a redundant way with the resize scheduler. For example:
Here, it just manually reproduces what the resize scheduler would do by propagating the resize of the pad. The generated kernel would look like:
Here, the important part is the predicate of the T0 read:
This is because T1 is scheduled as:
T1_l_float[iS11{4}, iS13{64}]
logical domain : (iS1{128})
contiguity: t
Outer split: iS1{128} by factor 4 -> iS11{4}, iS12{32}
Resize: iS12{32} by 0 and 32 -> iS13{64}
loop domain : (iS11{4}, iS13{64})
The loop indices of i0 and i2 correspond to the loop IDs of iS11 and iS13, respectively. We predicate the logical domain of T1 by generating the corresponding index for iS1.
While this should have no issue with the correctness, half of the accesses would be redundant as they would be masked out by the actual pad expression:
This is because of the split of the reshape. Due to the split, indices that exceed the resize input extent of iS12 may not be predicated out. This is quite similar to the non-divisible split, but because of the resize it can happen even with divisible splits like this case.
I don't think this is a correctness issue, but it's likely a performance issue unless nvrtc is smart enough to figure out the accesses are indeed redundant and eliminate the redundant accesses, which is unlikely.
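To make the redundancy concrete, here is a small enumeration of the loop domain. It is only a sketch under stated assumptions, not nvFuser code: it assumes the index for iS1 is recovered as i0 * 32 + i2 (undoing the resize with a left expansion of 0, then the outer split), that the predicate is only the bounds check on iS1, and that the pad keeps only elements whose iS13 index falls inside the original extent of iS12. All function names are hypothetical.

```python
# Extents taken from the T1 domain listing in this issue.
LOGICAL_EXTENT = 128  # iS1
OUTER_EXTENT = 4      # iS11, produced by the outer split by factor 4
INNER_EXTENT = 32     # iS12, the resize input
LEFT_EXPAND = 0       # left expansion of the resize
PADDED_EXTENT = 64    # iS13, the resize output (0 + 32 + 32)


def logical_index(i0: int, i2: int) -> int:
    """Assumed mapping from loop indices (i0, i2) back to an iS1 index."""
    i12 = i2 - LEFT_EXPAND          # undo the resize
    return i0 * INNER_EXTENT + i12  # undo the outer split


def logical_predicate(i0: int, i2: int) -> bool:
    """Bounds check on the logical domain only (assumed predicate shape)."""
    return 0 <= logical_index(i0, i2) < LOGICAL_EXTENT


def pad_keeps(i2: int) -> bool:
    """True when the pad expression would keep the value instead of the pad fill."""
    return LEFT_EXPAND <= i2 < LEFT_EXPAND + INNER_EXTENT


def count_accesses() -> tuple[int, int]:
    """Return (reads issued under the predicate, reads the pad actually uses)."""
    issued = useful = 0
    for i0 in range(OUTER_EXTENT):
        for i2 in range(PADDED_EXTENT):
            if logical_predicate(i0, i2):
                issued += 1
                if pad_keeps(i2):
                    useful += 1
    return issued, useful


if __name__ == "__main__":
    issued, useful = count_accesses()
    print(f"issued={issued} useful={useful} redundant={issued - useful}")
```

In this model, 224 of the 256 loop iterations pass the logical-domain predicate, but only 128 of them survive the pad mask, so 96 reads of T0 are issued and then discarded. For example, (i0, i2) = (0, 40) maps to iS1 index 40, which is in bounds, even though i2 exceeds the extent of iS12. That is roughly in line with the "half of the accesses" estimate above, and it suggests that an extra bounds check on the iS12 index would leave only the 128 useful reads.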