Description
The point of simd_gather/scatter is that I have a vector of pointers and a mask, and only the pointers that are "enabled" will actually be used. The others may point to garbage memory, or to memory that is being concurrently read or written by other threads; they must not be touched.
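To make the expected semantics concrete, here is a minimal scalar sketch of a masked gather. The name masked_gather, the use of plain arrays instead of SIMD vectors, and the bool mask are all assumptions made for illustration; this is not the intrinsic's actual signature or the backend's code.

```rust
/// Scalar reference model of a masked gather (hypothetical helper,
/// not the real intrinsic). Arrays stand in for SIMD vectors and
/// `bool` stands in for a mask lane.
///
/// # Safety
/// Every pointer whose mask lane is `true` must be valid for reads.
/// Pointers in disabled lanes may be null or dangling; they are
/// never dereferenced.
unsafe fn masked_gather<const N: usize>(
    default: [i32; N],
    ptrs: [*const i32; N],
    mask: [bool; N],
) -> [i32; N] {
    let mut out = default;
    for i in 0..N {
        if mask[i] {
            // Only enabled lanes touch memory.
            out[i] = unsafe { *ptrs[i] };
        }
    }
    out
}

fn main() {
    let data = [10, 20, 30];
    // Lanes 1 and 3 hold null pointers, which is fine as long as
    // their mask lanes are disabled.
    let ptrs = [
        &data[0] as *const i32,
        std::ptr::null(),
        &data[2] as *const i32,
        std::ptr::null(),
    ];
    let mask = [true, false, true, false];
    let out = unsafe { masked_gather([-1; 4], ptrs, mask) };
    assert_eq!(out, [10, -1, 30, -1]);
}
```

An implementation that loads from all four pointers first and selects afterwards would dereference the null pointers in this example.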
If I understand the implementations of these intrinsics in compiler/rustc_codegen_gcc/src/intrinsic/simd.rs correctly, they currently always read and write all the pointers, and use shuffle to decide which values to keep. Apart from the fact that this does a bunch of sequential loads and stores (completely losing the SIMD effect, which makes me wonder why a shuffle is used when a basic if-then-else would be a lot simpler), this will make programs that use simd_gather/scatter misbehave in quite subtle ways.
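For comparison, this is roughly what the "basic if-then-else" approach could look like for the scatter side. It is only a sketch under assumptions: masked_scatter is a hypothetical name, arrays stand in for SIMD vectors, and nothing here is taken from the actual backend code.

```rust
/// Scalar model of a masked scatter with a per-lane branch
/// (hypothetical helper, not the real intrinsic).
///
/// # Safety
/// Every pointer whose mask lane is `true` must be valid for writes.
/// Pointers in disabled lanes are never read or written.
unsafe fn masked_scatter<const N: usize>(
    values: [i32; N],
    ptrs: [*mut i32; N],
    mask: [bool; N],
) {
    for i in 0..N {
        if mask[i] {
            // The branch guarantees disabled lanes are not touched,
            // so there is no "read old value and write it back" step
            // that could race with another thread.
            unsafe { ptrs[i].write(values[i]) };
        }
    }
}

fn main() {
    let mut a = 1;
    let mut c = 3;
    // Lane 1 is a null pointer and lane 2 is valid but disabled.
    let ptrs = [&mut a as *mut i32, std::ptr::null_mut(), &mut c as *mut i32];
    unsafe { masked_scatter([100, 200, 300], ptrs, [true, false, false]) };
    assert_eq!((a, c), (100, 3));
}
```

Even a variant that writes the old value back to disabled lanes would be wrong: the pointer may be invalid, and a write of the "same" value can still overwrite a concurrent store from another thread.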
As a comparatively minor issue that is not worth tracking separately, the implementation also seems to assume that the length of the vector is a power of 2.