[midend/lib/Conversion] Add a vectorization pass for the linalg.batch_matmul_transpose_b op, with examples. #477

FloatingcloudKnight · 2025-03-11T11:22:00Z

Add a vectorization pass for the linalg.batch_matmul_transpose_b op, along with examples.
The vectorization pass is in midend/lib/Conversion/MatMulOptimization/
Examples are in examples/BuddyMatmul/
The IR after vectorization is:

%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%c2 = arith.constant 2 : index
%vl_step = arith.constant 32 : index
%c0_f32 = arith.constant 0.000000e+00 : f32
%v0 = vector.splat %c0_f32 : vector<32xf32>
%dim = memref.dim %arg0, %c0 : memref<?x?x?xf32>
%dim_0 = memref.dim %arg0, %c1 : memref<?x?x?xf32>
%dim_1 = memref.dim %arg0, %c2 : memref<?x?x?xf32>
%dim_2 = memref.dim %arg1, %c1 : memref<?x?x?xf32>

%dim_1_upbound_tmp = arith.subi %dim_1, %vl_step : index
%dim_1_upbound = arith.addi %dim_1_upbound_tmp, %c1 : index

affine.for %arg3 = %c0 to %dim {
  affine.for %arg4 = %c0 to %dim_0 {
    affine.for %arg5 = %c0 to %dim_2 {
      %2 = affine.load %arg2[%arg3, %arg4, %arg5] : memref<?x?x?xf32>
      %iter_idx, %iter_value = scf.for %arg6 = %c0 to %dim_1_upbound 
          step %vl_step iter_args(%iter_init = %c0, %iter_value0 = %2) -> (index, f32){
        %0 = vector.load %arg0[%arg3, %arg4, %arg6] : memref<?x?x?xf32>, vector<32xf32>
        %1 = vector.load %arg1[%arg3, %arg5, %arg6] : memref<?x?x?xf32>, vector<32xf32>
        %3 = arith.mulf %0, %1 : vector<32xf32>
        %4 = vector.reduction <add>, %3, %iter_value0 fastmath<reassoc> : vector<32xf32> into f32
        %dim_1_next = arith.addi %arg6, %vl_step : index
        scf.yield %dim_1_next, %4 : index, f32
      }
      %tail_size = arith.subi %dim_1, %iter_idx : index
      %mask = vector.create_mask %tail_size : vector<32xi1>
      %0 = vector.maskedload %arg0[%arg3, %arg4, %iter_idx], %mask, %v0 : memref<?x?x?xf32>, vector<32xi1>, vector<32xf32> into vector<32xf32>
      %10 = vector.maskedload %arg1[%arg3, %arg5, %iter_idx], %mask, %v0 : memref<?x?x?xf32>, vector<32xi1>, vector<32xf32> into vector<32xf32>
      %3 = arith.mulf %0, %10 : vector<32xf32>
      %4 = vector.reduction <add>, %3, %iter_value fastmath<reassoc> : vector<32xf32> into f32
      affine.store %4, %arg2[%arg3, %arg4, %arg5] : memref<?x?x?xf32>
    }
  }
}
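
The loop structure above (a main loop over full 32-element chunks of the reduction dimension, then one masked tail step) can be sketched in plain C++. This is only an illustrative scalar model, not code from this PR: the function name `batchMatmulTransposeB`, the flat row-major layout, and the `vl` parameter are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of buddy-mlir. Computes
//   C[b][m][n] += sum_k A[b][m][k] * Bt[b][n][k]
// using the same structure as the vectorized IR: full chunks of `vl`
// elements first, then a single masked tail chunk.
// Assumed layout: A is batch x M x K, Bt is batch x N x K, C is batch x M x N,
// all stored row-major in flat vectors.
void batchMatmulTransposeB(const std::vector<float> &A,
                           const std::vector<float> &Bt,
                           std::vector<float> &C, std::size_t batch,
                           std::size_t M, std::size_t N, std::size_t K,
                           std::size_t vl = 32) {
  assert(A.size() == batch * M * K);
  assert(Bt.size() == batch * N * K);
  assert(C.size() == batch * M * N);
  for (std::size_t b = 0; b < batch; ++b) {
    for (std::size_t m = 0; m < M; ++m) {
      for (std::size_t n = 0; n < N; ++n) {
        const float *a = A.data() + (b * M + m) * K;
        const float *bt = Bt.data() + (b * N + n) * K;
        float acc = C[(b * M + m) * N + n]; // like the affine.load of C
        std::size_t k = 0;
        // Main loop: runs while a full chunk fits, which matches iterating
        // from 0 to K - vl + 1 with step vl.
        for (; k + vl <= K; k += vl) {
          float partial = 0.0f;
          for (std::size_t lane = 0; lane < vl; ++lane)
            partial += a[k + lane] * bt[k + lane];
          acc += partial; // models vector.reduction <add> with accumulator
        }
        // Tail: lanes at or beyond `tail` are masked off and read as 0,
        // mirroring vector.create_mask plus vector.maskedload.
        std::size_t tail = K - k;
        float partial = 0.0f;
        for (std::size_t lane = 0; lane < vl; ++lane) {
          bool active = lane < tail;
          float x = active ? a[k + lane] : 0.0f;
          float y = active ? bt[k + lane] : 0.0f;
          partial += x * y;
        }
        acc += partial;
        C[(b * M + m) * N + n] = acc; // like the affine.store to C
      }
    }
  }
}
```

With `K = 5` and `vl = 4`, for instance, the main loop runs once over elements 0 to 3 and the tail step handles the one remaining element. When `K` is an exact multiple of `vl`, the tail mask is all false and the tail step adds zero.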

The benchmark results from buddy-benchmark are:

-----------------------------------------------------------------------------------------------------------
Benchmark                                                       Time             CPU   Iterations
------------------------------------------------------------------------------------------------------------
DL_OPS_BATCH_MATMUL_TRANSPOSE_B/Scalar_O0/iterations:1        249 ms          249 ms            1
DL_OPS_BATCH_MATMUL_TRANSPOSE_B/Scalar_O3/iterations:1       67.7 ms         67.7 ms            1
DL_OPS_BATCH_MATMUL_TRANSPOSE_B/Vec/iterations:1             8.51 ms         8.51 ms            1
---------- Verification -------------------------------------------------------------------------------------
Scalar_O3 PASS
Vec PASS
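
To put the timings above in relative terms, the speedup is the baseline time divided by the vectorized time. The helper below is a hypothetical illustration (the name `speedup` is not from the benchmark suite). Because each benchmark was run for a single iteration, the ratios are best read as rough indications.

```cpp
// Hypothetical helper: ratio of a baseline time to an optimized time,
// both given in the same unit (milliseconds here).
double speedup(double baselineMs, double optimizedMs) {
  return baselineMs / optimizedMs;
}
```

Calling `speedup(67.7, 8.51)` on the reported numbers gives a factor of about 8 over Scalar_O3, and `speedup(249.0, 8.51)` gives about 29 over Scalar_O0.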

FloatingcloudKnight added 3 commits March 10, 2025 12:25

f658cf9 [midend/lib/Conversion] Add the vectorization pass of the linalg.batchmatmul_transpsoe_bop and examples.
2182e39 Merge branch 'main' of https://github.com/buddy-compiler/buddy-mlir into fusion
9448fb9 Fixing the file format
