The idea is to allow codegen to exploit specific known constants. It
seems that LLVM by itself will not split a loop into specialized
versions based on an input argument that is loop-invariant, whereas we
know from PNG that some constants yield much better code. In this case:
coef[i] = coef[i] + coef[i - n]
This has a loop-carried dependency chain for `n = 1`, where every
element depends on the one just written. It should never be called
with `n = 0`, and it is almost embarrassingly parallel in SIMD if
`n >= 8`, where we can increase the amount of data loaded at once for
each independent loop iteration.
By splitting the loop we make the compiler apply independent
optimization passes to each case, then fall back to a generic loop for
the values we did not cover with a vectorizable specialization.
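
A minimal sketch of the split, not the code from this commit. The
function names, the `u8` element type with wrapping addition, and the
set of specialized values (1, 2, 3, 4, 6, 8) are assumptions made for
illustration. Each match arm inlines the same loop body with `n` as a
compile-time constant, so each arm can be optimized on its own.
Whether a given arm actually vectorizes depends on the compiler
version and target.

```rust
/// Generic loop: `n` is only known at run time (fallback path).
/// Assumption: elements are bytes and the addition wraps.
#[inline(always)]
fn accumulate_generic(coef: &mut [u8], n: usize) {
    for i in n..coef.len() {
        coef[i] = coef[i].wrapping_add(coef[i - n]);
    }
}

/// Same loop body, but the distance `N` is a compile-time constant.
/// After inlining, the optimizer sees a literal distance in each copy.
#[inline(always)]
fn accumulate_const<const N: usize>(coef: &mut [u8]) {
    accumulate_generic(coef, N);
}

/// Hypothetical entry point: dispatch once, outside the loop, on `n`.
pub fn accumulate(coef: &mut [u8], n: usize) {
    // n = 0 would add each element to itself, which is never intended.
    assert!(n != 0, "n must be non-zero");
    match n {
        1 => accumulate_const::<1>(coef), // serial dependency chain
        2 => accumulate_const::<2>(coef),
        3 => accumulate_const::<3>(coef),
        4 => accumulate_const::<4>(coef),
        6 => accumulate_const::<6>(coef),
        8 => accumulate_const::<8>(coef), // widest independent lanes here
        // Anything not covered above takes the generic loop.
        _ => accumulate_generic(coef, n),
    }
}
```

Callers only use `accumulate(&mut buf, n)`. The specialized arms and
the fallback compute the same result, so the split changes only the
generated code, not the behavior.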