Add scalable vector type to JIT and HFA type for Vector<T> #121114
+1,069
−613
This branch introduces a new type (`TYP_SIMDSV`) to the JIT for supporting scalable vectors, registers whose size is determined by hardware at runtime but remains constant for the duration of a process. For ARM64, this means we have vectors sized in powers of 2 from 128 bits up to 2048 bits depending on hardware implementation, with an instruction available to query this size for compiler use. I've also adjusted the implementation of `TYP_MASK` to scale with the vector length on ARM64 in a similar manner.
This builds on and borrows much of Kunal's work in: #115948.
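To make the sizing rule described above concrete, here is a minimal C++ sketch. It is not code from this PR: `isValidScalableVectorBits` and `maskSizeBytes` are hypothetical helper names. The only facts it assumes are the 128 to 2048 bit power-of-2 range stated above and the SVE rule that a predicate register holds one bit per byte of vector register.

```cpp
#include <cassert>

// Hypothetical helper (not a JIT API): true if `bits` is a vector length
// permitted by the description above, i.e. a power of 2 in [128, 2048].
inline bool isValidScalableVectorBits(unsigned bits)
{
    const bool isPowerOfTwo = (bits != 0) && ((bits & (bits - 1)) == 0);
    return isPowerOfTwo && (bits >= 128) && (bits <= 2048);
}

// Hypothetical helper: size in bytes of an SVE predicate (mask) register for
// a given vector length. A predicate holds one bit per vector byte, so the
// mask is one eighth the size of the vector.
inline unsigned maskSizeBytes(unsigned vectorBits)
{
    assert(isValidScalableVectorBits(vectorBits));
    const unsigned vectorBytes = vectorBits / 8;
    return vectorBytes / 8;
}
```

Under these assumptions a 128-bit vector pairs with a 2-byte mask and a 2048-bit vector with a 32-byte mask, which is why a mask type has to scale together with the vector type.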
This PR focuses on enabling scalable type awareness as a foundation for future vector length agnostic code generation. I've refactored existing systems so that they can retrieve the size of a type with access to the compiler's runtime state. This mainly involves refactoring areas that depend on `genTypeSize` to call a new instance method, `Compiler::getSizeOfType`. This allows the JIT to emit SVE register moves, loads and stores for `Vector<T>` etc., but doesn't change the implementation of the `Vector<T>` API surface. It still emits NEON for arithmetic operations, logical operations, floating point operations and so on. The codegen is functionally equivalent while the vector length is set to 128 bits.
With this type being passed around, we can begin implementing vector length agnostic code in subsequent work, as we can now test for a `TYP_SIMDSV` and distinguish it from a fixed size `TYP_SIMD16`.
Testing
Importing `Vector<T>` as the new type is gated behind `DOTNET_JitUseScalableVectorT`, meaning `TYP_SIMDSV` will not appear in compilation unless that variable is set. Likewise, the VM will not report use of the HFA type `CORINFO_HFA_ELEM_VECTORT` unless this variable is set. With the variable set, testing validates the implementation of the new type. Without the variable set, testing validates that `TYP_SIMD16` behavior remains stable under these changes.
SuperPMI method contexts are currently out of sync with the updated JIT-EE interface, which causes mismatches in return values between the JIT and EE. I don't expect this to stop until `DOTNET_JitUseScalableVectorT` is removed and standardized, so this feature will need to be tested with some specially generated MCH files while being developed.
Future
We will be able to remove `DOTNET_JitUseScalableVectorT` once `Vector<T>` is working in AOT compilation. This will require all phases to be aware that `TYP_SIMDSV` is dynamically sized and to decide, based on this, whether they can run.
For a transitional approach, we could allow specifying a fixed target vector length for AOT compilation while broader vector-length agnostic support is implemented. In some cases we might be able to take advantage of knowing the vector size at compilation time, so having both approaches available might be advantageous for JIT mode.
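One way to picture the phase gating described above is the following C++ sketch. It is an illustration under assumptions, not the JIT's implementation: `SketchCompiler`, `SketchType`, `tryGetSizeOfType`, and `canRunSizeDependentPhase` are hypothetical names, and using 0 to model "vector length unknown at compile time" is a choice made for this example only.

```cpp
#include <cassert>

// Hypothetical stand-ins for JIT types; only the two SIMD kinds matter here.
enum class SketchType { Simd16, SimdScalable };

// Hypothetical compiler state. scalableVectorBytes == 0 models AOT compilation
// with no fixed target vector length. A non-zero value models JIT mode (length
// queried from hardware) or AOT with a fixed target length.
struct SketchCompiler
{
    unsigned scalableVectorBytes;

    // Writes the size of `type` to *sizeBytes and returns true, or returns
    // false when the size is only known at run time.
    bool tryGetSizeOfType(SketchType type, unsigned* sizeBytes) const
    {
        assert(sizeBytes != nullptr);
        switch (type)
        {
            case SketchType::Simd16:
                *sizeBytes = 16; // fixed-size types ignore compiler state
                return true;
            case SketchType::SimdScalable:
                if (scalableVectorBytes == 0)
                {
                    return false; // dynamically sized, nothing to report
                }
                *sizeBytes = scalableVectorBytes;
                return true;
        }
        return false;
    }

    // A phase that needs concrete sizes can only run when the size is known.
    bool canRunSizeDependentPhase(SketchType type) const
    {
        unsigned ignored = 0;
        return tryGetSizeOfType(type, &ignored);
    }
};
```

In this model, a compiler constructed with a non-zero length lets every size-dependent phase run, while one constructed with 0 lets such phases run only for fixed-size types, which mirrors the two approaches discussed above.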
Code Example
With `DOTNET_JitUseScalableVectorT=1`, the main vector loop body compiles to:
Contributing towards #120599