iirc x86_64 without AVX-512 is the only major ISA that has gather and uses full masks (one mask element per lane, as wide as the data element) instead of bit-masks, so we should match it.
https://www.felixcloutier.com/x86/vgatherdps:vgatherqps#vgatherqps--vex-128-version-iirc
vgatherqps gathers f32 values using i32 mask elements and u64 indexes/addresses, so the mask element width matches the gathered values (32 bits), not the indexes (64 bits).
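
A rough scalar model of that behavior, as a sketch only: `gather_f32_full_mask` is a hypothetical helper (not an intrinsic or an existing API), memory is modeled as an `f32` slice indexed by element rather than by scaled byte address, and the instruction's side effect of zeroing the mask register is left out. It uses 2 lanes to mirror the VEX.128 form.

```rust
/// Scalar model of a masked gather with full-width mask elements.
/// Hypothetical helper for illustration; not a real intrinsic or library API.
fn gather_f32_full_mask(
    src: [f32; 2],     // values kept for lanes whose mask is off
    memory: &[f32],    // assumption: memory modeled as a slice of f32
    indexes: [u64; 2], // 64-bit indexes, one per lane
    mask: [i32; 2],    // 32-bit mask elements, same width as the f32 values
) -> [f32; 2] {
    let mut out = src;
    for lane in 0..2 {
        // Only the sign (top) bit of each mask element selects the lane.
        if mask[lane] < 0 {
            out[lane] = memory[indexes[lane] as usize];
        }
    }
    out
}
```

Note the mask array has the element width of the values (`i32` for `f32`), even though the indexes are `u64`.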
originally mentioned here:
#322 (comment)