Add vscale_range attribute to functions with sve #58433

Open: wants to merge 1 commit into base: master

gbaraldi
Member

For some reason LLVM expects frontends to do this instead of querying the TTI (TargetTransformInfo), so just add a default when it makes sense. (I wonder if this should be done elsewhere, but this pass seems like it can handle this fine and everything is already there.)
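For context on the default being proposed: on AArch64 SVE, `vscale` is the vector register length divided by 128 bits, so a `vscale_range(1, 16)` attribute covers registers from 128 to 2048 bits. The sketch below only does that arithmetic. The helper names (`sve_bits_for_vscale_range`, `scalable_lanes`) are made up for illustration and are not LLVM or Julia APIs.

```cpp
#include <cassert>
#include <cstdint>

// One unit of vscale corresponds to 128 bits of SVE register width.
constexpr std::uint64_t kSveGranuleBits = 128;

struct VectorBitRange {
    std::uint64_t min_bits;
    std::uint64_t max_bits;
};

// Hypothetical helper (not an LLVM API): register widths implied by
// vscale_range(vscale_min, vscale_max) on an SVE target.
inline VectorBitRange sve_bits_for_vscale_range(std::uint64_t vscale_min,
                                                std::uint64_t vscale_max) {
    assert(vscale_min >= 1 && vscale_min <= vscale_max);
    return {vscale_min * kSveGranuleBits, vscale_max * kSveGranuleBits};
}

// Hypothetical helper: number of elements held by a <vscale x n x T> value
// once the hardware's vscale is known.
inline std::uint64_t scalable_lanes(std::uint64_t vscale, std::uint64_t n) {
    return vscale * n;
}
```

For example, `sve_bits_for_vscale_range(1, 16)` gives 128 and 2048 bits, and `scalable_lanes(4, 2)` says a `<vscale x 2 x double>` holds 8 doubles on a 512-bit implementation.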

FS.split(Features, ',');
for (StringRef Feature : Features) {
    if (Feature == "sve")
        F.addFnAttr(llvm::Attribute::getWithVScaleRangeArgs(M.getContext(), 1, 16)); // Hardcode for now
Contributor

This is relevant for RISC-V, too. I now have access to a chip with scalable vector registers (RVV extension); if only we could compile Julia for that architecture: #57569 😇

Member Author

@gbaraldi gbaraldi May 16, 2025

Don't look at how that function is implemented, you will find something similar :)

@giordano
Contributor

julia> function axpy!(a, x, y)
           @simd for idx in eachindex(x, y)
               @inbounds y[idx] = muladd(a, x[idx], y[idx])
           end
       end
axpy! (generic function with 1 method)

julia> open("axpy.ll", "w") do file
           code_llvm(file, axpy!, (Float64, Vector{Float64}, Vector{Float64}); raw=true, dump_module=true)
       end
; ModuleID = 'axpy!'
source_filename = "axpy!"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128-Fn32"
target triple = "aarch64-unknown-linux-gnu"

@jl_nothing = external local_unnamed_addr constant ptr

@"jl_global#3608.jit" = private alias ptr, inttoptr (i64 281472983889952 to ptr)

; Function Signature: axpy!(Float64, Array{Float64, 1}, Array{Float64, 1})
;  @ REPL[7]:1 within `axpy!`
define swiftcc void @"julia_axpy!_3603"(ptr nonnull swiftself %pgcstack_arg, double %"a::Float64", ptr noundef nonnull align 8 dereferenceable(24) %"x::Array", ptr noundef nonnull align 8 dereferenceable(24) %"y::Array") local_unnamed_addr #0 !dbg !4 {
top:
  %"new::OneTo" = alloca [1 x i64], align 8
  %"new::OneTo1" = alloca [1 x i64], align 8
    #dbg_value(double %"a::Float64", !19, !DIExpression(), !22)
    #dbg_declare(ptr %"x::Array", !20, !DIExpression(), !22)
    #dbg_declare(ptr %"y::Array", !21, !DIExpression(), !22)
  %ptls_field = getelementptr inbounds nuw i8, ptr %pgcstack_arg, i64 16
  %ptls_load = load ptr, ptr %ptls_field, align 8, !tbaa !23
  %0 = getelementptr inbounds nuw i8, ptr %ptls_load, i64 16
  %safepoint = load ptr, ptr %0, align 8, !tbaa !27, !invariant.load !9
  fence syncscope("singlethread") seq_cst
  %1 = load volatile i64, ptr %safepoint, align 8, !dbg !22
  fence syncscope("singlethread") seq_cst
;  @ REPL[7]:2 within `axpy!`
; ┌ @ simdloop.jl:69 within `macro expansion`
; │┌ @ abstractarray.jl:384 within `eachindex` @ abstractarray.jl:394 @ abstractarray.jl:391
; ││┌ @ abstractarray.jl:137 within `axes1`
; │││┌ @ abstractarray.jl:98 within `axes`
; ││││┌ @ array.jl:194 within `size`
       %"x::Array.size_ptr" = getelementptr inbounds nuw i8, ptr %"x::Array", i64 16, !dbg !29
       %"x::Array.size.0.copyload" = load i64, ptr %"x::Array.size_ptr", align 8, !dbg !29, !tbaa !46, !alias.scope !47, !noalias !51
; ││││└
; ││││┌ @ tuple.jl:358 within `map`
; │││││┌ @ range.jl:486 within `unchecked_oneto`
        store i64 %"x::Array.size.0.copyload", ptr %"new::OneTo", align 8, !dbg !55, !tbaa !61, !alias.scope !63, !noalias !64
; ││└└└└
; ││ @ abstractarray.jl:384 within `eachindex` @ abstractarray.jl:395
; ││┌ @ tuple.jl:358 within `map`
; │││┌ @ abstractarray.jl:395 within `#eachindex##0`
; ││││┌ @ abstractarray.jl:391 within `eachindex`
; │││││┌ @ abstractarray.jl:137 within `axes1`
; ││││││┌ @ abstractarray.jl:98 within `axes`
; │││││││┌ @ array.jl:194 within `size`
          %"y::Array.size_ptr" = getelementptr inbounds nuw i8, ptr %"y::Array", i64 16, !dbg !65
          %"y::Array.size.0.copyload" = load i64, ptr %"y::Array.size_ptr", align 8, !dbg !65, !tbaa !46, !alias.scope !47, !noalias !51
; │││││││└
; │││││││┌ @ tuple.jl:358 within `map`
; ││││││││┌ @ range.jl:486 within `unchecked_oneto`
           store i64 %"y::Array.size.0.copyload", ptr %"new::OneTo1", align 8, !dbg !73, !tbaa !61, !alias.scope !63, !noalias !64
; ││└└└└└└└
; ││ @ abstractarray.jl:384 within `eachindex` @ abstractarray.jl:396
; ││┌ @ anyall.jl:240 within `all`
; │││┌ @ anyall.jl:211 within `_all`
; ││││┌ @ operators.jl:1195 within `Fix`
; │││││┌ @ operators.jl:1195 within `#_#59`
; ││││││┌ @ range.jl:1145 within `==` @ promotion.jl:632
         %.not.not = icmp eq i64 %"y::Array.size.0.copyload", %"x::Array.size.0.copyload", !dbg !75
; ││└└└└└
    br i1 %.not.not, label %L24, label %L21, !dbg !90

L21:                                              ; preds = %top
    call swiftcc void @j_throw_eachindex_mismatch_indices_3607(ptr nonnull swiftself %pgcstack_arg, ptr nonnull @"jl_global#3608.jit", ptr nocapture nonnull readonly %"new::OneTo", ptr nocapture nonnull readonly %"new::OneTo1") #7, !dbg !90
    unreachable, !dbg !90

L24:                                              ; preds = %top
; │└
; │ @ simdloop.jl:72 within `macro expansion`
; │┌ @ int.jl:83 within `<`
    %2 = icmp slt i64 %"x::Array.size.0.copyload", 1, !dbg !91
; │└
   br i1 %2, label %L103, label %iter.check, !dbg !94

iter.check:                                       ; preds = %L24
   %memoryref_data = load ptr, ptr %"x::Array", align 8, !tbaa !95, !alias.scope !98, !noalias !99
   %memoryref_data12 = load ptr, ptr %"y::Array", align 8, !tbaa !95, !alias.scope !98, !noalias !99
; │ @ simdloop.jl:75 within `macro expansion`
   %min.iters.check = icmp eq i64 %"x::Array.size.0.copyload", 1, !dbg !100
   br i1 %min.iters.check, label %L31.preheader, label %vector.memcheck, !dbg !100

vector.memcheck:                                  ; preds = %iter.check
   %3 = shl i64 %"x::Array.size.0.copyload", 3, !dbg !100
   %scevgep = getelementptr i8, ptr %memoryref_data12, i64 %3, !dbg !100
   %scevgep44 = getelementptr i8, ptr %memoryref_data, i64 %3, !dbg !100
   %bound0 = icmp ult ptr %memoryref_data12, %scevgep44, !dbg !100
   %bound1 = icmp ult ptr %memoryref_data, %scevgep, !dbg !100
   %found.conflict = and i1 %bound0, %bound1, !dbg !100
   br i1 %found.conflict, label %L31.preheader, label %vector.main.loop.iter.check, !dbg !100

vector.main.loop.iter.check:                      ; preds = %vector.memcheck
   %min.iters.check45 = icmp samesign ult i64 %"x::Array.size.0.copyload", 8, !dbg !100
   br i1 %min.iters.check45, label %vector.main.loop.iter.check.vec.epilog.ph_crit_edge, label %vector.ph, !dbg !100

vector.main.loop.iter.check.vec.epilog.ph_crit_edge: ; preds = %vector.main.loop.iter.check
   %.pre = insertelement <2 x double> poison, double %"a::Float64", i64 0, !dbg !100
   %.pre1 = shufflevector <2 x double> %.pre, <2 x double> poison, <2 x i32> zeroinitializer, !dbg !100
   br label %vec.epilog.ph, !dbg !100

vector.ph:                                        ; preds = %vector.main.loop.iter.check
   %n.mod.vf = and i64 %"x::Array.size.0.copyload", 6, !dbg !100
   %n.vec = and i64 %"x::Array.size.0.copyload", 9223372036854775800, !dbg !100
   %broadcast.splatinsert = insertelement <2 x double> poison, double %"a::Float64", i64 0, !dbg !100
   %broadcast.splat = shufflevector <2 x double> %broadcast.splatinsert, <2 x double> poison, <2 x i32> zeroinitializer, !dbg !100
   br label %vector.body, !dbg !100

vector.body:                                      ; preds = %vector.body, %vector.ph
; │ @ simdloop.jl:76 within `macro expansion`
; │┌ @ simdloop.jl:54 within `simd_index`
; ││┌ @ int.jl:87 within `+`
     %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ], !dbg !101
; │└└
; │ @ simdloop.jl:77 within `macro expansion` @ REPL[7]:3
; │┌ @ essentials.jl:953 within `getindex`
    %4 = getelementptr inbounds nuw double, ptr %memoryref_data, i64 %index, !dbg !106
    %5 = getelementptr inbounds nuw i8, ptr %4, i64 16, !dbg !106
    %6 = getelementptr inbounds nuw i8, ptr %4, i64 32, !dbg !106
    %7 = getelementptr inbounds nuw i8, ptr %4, i64 48, !dbg !106
    %wide.load = load <2 x double>, ptr %4, align 8, !dbg !106, !tbaa !112, !alias.scope !115, !noalias !118
    %wide.load46 = load <2 x double>, ptr %5, align 8, !dbg !106, !tbaa !112, !alias.scope !115, !noalias !118
    %wide.load47 = load <2 x double>, ptr %6, align 8, !dbg !106, !tbaa !112, !alias.scope !115, !noalias !118
    %wide.load48 = load <2 x double>, ptr %7, align 8, !dbg !106, !tbaa !112, !alias.scope !115, !noalias !118
    %8 = getelementptr inbounds nuw double, ptr %memoryref_data12, i64 %index, !dbg !106
    %9 = getelementptr inbounds nuw i8, ptr %8, i64 16, !dbg !106
    %10 = getelementptr inbounds nuw i8, ptr %8, i64 32, !dbg !106
    %11 = getelementptr inbounds nuw i8, ptr %8, i64 48, !dbg !106
    %wide.load49 = load <2 x double>, ptr %8, align 8, !dbg !106, !tbaa !112, !alias.scope !119, !noalias !121
    %wide.load50 = load <2 x double>, ptr %9, align 8, !dbg !106, !tbaa !112, !alias.scope !119, !noalias !121
    %wide.load51 = load <2 x double>, ptr %10, align 8, !dbg !106, !tbaa !112, !alias.scope !119, !noalias !121
    %wide.load52 = load <2 x double>, ptr %11, align 8, !dbg !106, !tbaa !112, !alias.scope !119, !noalias !121
; │└
; │┌ @ float.jl:497 within `muladd`
    %12 = fmul contract <2 x double> %broadcast.splat, %wide.load, !dbg !122
    %13 = fmul contract <2 x double> %broadcast.splat, %wide.load46, !dbg !122
    %14 = fmul contract <2 x double> %broadcast.splat, %wide.load47, !dbg !122
    %15 = fmul contract <2 x double> %broadcast.splat, %wide.load48, !dbg !122
    %16 = fadd contract <2 x double> %12, %wide.load49, !dbg !122
    %17 = fadd contract <2 x double> %13, %wide.load50, !dbg !122
    %18 = fadd contract <2 x double> %14, %wide.load51, !dbg !122
    %19 = fadd contract <2 x double> %15, %wide.load52, !dbg !122
; │└
; │┌ @ array.jl:986 within `setindex!`
; ││┌ @ array.jl:991 within `_setindex!`
     store <2 x double> %16, ptr %8, align 8, !dbg !125, !tbaa !112, !alias.scope !119, !noalias !121
     store <2 x double> %17, ptr %9, align 8, !dbg !125, !tbaa !112, !alias.scope !119, !noalias !121
     store <2 x double> %18, ptr %10, align 8, !dbg !125, !tbaa !112, !alias.scope !119, !noalias !121
     store <2 x double> %19, ptr %11, align 8, !dbg !125, !tbaa !112, !alias.scope !119, !noalias !121
; │└└
; │ @ simdloop.jl:76 within `macro expansion`
; │┌ @ simdloop.jl:54 within `simd_index`
; ││┌ @ int.jl:87 within `+`
     %index.next = add nuw i64 %index, 8, !dbg !101
     %20 = icmp eq i64 %index.next, %n.vec, !dbg !101
     br i1 %20, label %middle.block, label %vector.body, !dbg !101, !llvm.loop !129

middle.block:                                     ; preds = %vector.body
; │└└
; │ @ simdloop.jl:75 within `macro expansion`
   %cmp.n = icmp eq i64 %"x::Array.size.0.copyload", %n.vec, !dbg !100
   br i1 %cmp.n, label %L103, label %vec.epilog.iter.check, !dbg !100

vec.epilog.iter.check:                            ; preds = %middle.block
   %min.epilog.iters.check = icmp eq i64 %n.mod.vf, 0, !dbg !100
   br i1 %min.epilog.iters.check, label %L31.preheader, label %vec.epilog.ph, !dbg !100

vec.epilog.ph:                                    ; preds = %vector.main.loop.iter.check.vec.epilog.ph_crit_edge, %vec.epilog.iter.check
   %broadcast.splat59.pre-phi = phi <2 x double> [ %.pre1, %vector.main.loop.iter.check.vec.epilog.ph_crit_edge ], [ %broadcast.splat, %vec.epilog.iter.check ], !dbg !100
   %vec.epilog.resume.val = phi i64 [ 0, %vector.main.loop.iter.check.vec.epilog.ph_crit_edge ], [ %n.vec, %vec.epilog.iter.check ]
   %n.vec54 = and i64 %"x::Array.size.0.copyload", 9223372036854775806, !dbg !100
   br label %vec.epilog.vector.body, !dbg !100

vec.epilog.vector.body:                           ; preds = %vec.epilog.vector.body, %vec.epilog.ph
; │ @ simdloop.jl:76 within `macro expansion`
; │┌ @ simdloop.jl:54 within `simd_index`
; ││┌ @ int.jl:87 within `+`
     %index55 = phi i64 [ %vec.epilog.resume.val, %vec.epilog.ph ], [ %index.next60, %vec.epilog.vector.body ], !dbg !101
; │└└
; │ @ simdloop.jl:77 within `macro expansion` @ REPL[7]:3
; │┌ @ essentials.jl:953 within `getindex`
    %gep = getelementptr double, ptr %memoryref_data, i64 %index55, !dbg !106
    %wide.load56 = load <2 x double>, ptr %gep, align 8, !dbg !106, !tbaa !112, !alias.scope !132, !noalias !118
    %gep63 = getelementptr double, ptr %memoryref_data12, i64 %index55, !dbg !106
    %wide.load57 = load <2 x double>, ptr %gep63, align 8, !dbg !106, !tbaa !112, !alias.scope !135, !noalias !137
; │└
; │┌ @ float.jl:497 within `muladd`
    %21 = fmul contract <2 x double> %broadcast.splat59.pre-phi, %wide.load56, !dbg !122
    %22 = fadd contract <2 x double> %21, %wide.load57, !dbg !122
; │└
; │┌ @ array.jl:986 within `setindex!`
; ││┌ @ array.jl:991 within `_setindex!`
     store <2 x double> %22, ptr %gep63, align 8, !dbg !125, !tbaa !112, !alias.scope !135, !noalias !137
; │└└
; │ @ simdloop.jl:76 within `macro expansion`
; │┌ @ simdloop.jl:54 within `simd_index`
; ││┌ @ int.jl:87 within `+`
     %index.next60 = add nuw i64 %index55, 2, !dbg !101
     %23 = icmp eq i64 %index.next60, %n.vec54, !dbg !101
     br i1 %23, label %vec.epilog.middle.block, label %vec.epilog.vector.body, !dbg !101, !llvm.loop !138

vec.epilog.middle.block:                          ; preds = %vec.epilog.vector.body
; │└└
; │ @ simdloop.jl:75 within `macro expansion`
   %cmp.n61 = icmp eq i64 %"x::Array.size.0.copyload", %n.vec54, !dbg !100
   br i1 %cmp.n61, label %L103, label %L31.preheader, !dbg !100

L31.preheader:                                    ; preds = %vec.epilog.middle.block, %vec.epilog.iter.check, %vector.memcheck, %iter.check
   %value_phi343.ph = phi i64 [ %n.vec, %vec.epilog.iter.check ], [ 0, %iter.check ], [ 0, %vector.memcheck ], [ %n.vec54, %vec.epilog.middle.block ]
   %.neg = or disjoint i64 %value_phi343.ph, 1, !dbg !100
   %xtraiter = and i64 %"x::Array.size.0.copyload", 1, !dbg !100
   %lcmp.mod.not = icmp eq i64 %xtraiter, 0, !dbg !100
   br i1 %lcmp.mod.not, label %L31.prol.loopexit, label %L31.prol.preheader, !dbg !100

L31.prol.preheader:                               ; preds = %L31.preheader
; │ @ simdloop.jl:77 within `macro expansion` @ REPL[7]:3
; │┌ @ essentials.jl:953 within `getindex`
    %memoryref_data8.prol = getelementptr inbounds nuw double, ptr %memoryref_data, i64 %value_phi343.ph, !dbg !106
    %24 = load double, ptr %memoryref_data8.prol, align 8, !dbg !106, !tbaa !112, !alias.scope !139, !noalias !118
    %memoryref_data19.prol = getelementptr inbounds nuw double, ptr %memoryref_data12, i64 %value_phi343.ph, !dbg !106
    %25 = load double, ptr %memoryref_data19.prol, align 8, !dbg !106, !tbaa !112, !alias.scope !139, !noalias !118
; │└
; │┌ @ float.jl:497 within `muladd`
    %26 = fmul contract double %"a::Float64", %24, !dbg !122
    %27 = fadd contract double %26, %25, !dbg !122
; │└
; │┌ @ array.jl:986 within `setindex!`
; ││┌ @ array.jl:991 within `_setindex!`
     store double %27, ptr %memoryref_data19.prol, align 8, !dbg !125, !tbaa !112, !alias.scope !139, !noalias !118
; │└└
; │ @ simdloop.jl:75 within `macro expansion`
   br label %L31.prol.loopexit, !dbg !100

L31.prol.loopexit:                                ; preds = %L31.prol.preheader, %L31.preheader
   %value_phi343.unr = phi i64 [ %value_phi343.ph, %L31.preheader ], [ %.neg, %L31.prol.preheader ]
   %28 = icmp eq i64 %"x::Array.size.0.copyload", %.neg, !dbg !100
   br i1 %28, label %L103, label %L31, !dbg !100

L31:                                              ; preds = %L31.prol.loopexit, %L31
   %value_phi343 = phi i64 [ %34, %L31 ], [ %value_phi343.unr, %L31.prol.loopexit ]
; │ @ simdloop.jl:76 within `macro expansion`
; │┌ @ simdloop.jl:54 within `simd_index`
; ││┌ @ int.jl:87 within `+`
     %29 = add nuw nsw i64 %value_phi343, 1, !dbg !101
; │└└
; │ @ simdloop.jl:77 within `macro expansion` @ REPL[7]:3
; │┌ @ essentials.jl:953 within `getindex`
    %memoryref_data8 = getelementptr inbounds nuw double, ptr %memoryref_data, i64 %value_phi343, !dbg !106
    %30 = load double, ptr %memoryref_data8, align 8, !dbg !106, !tbaa !112, !alias.scope !139, !noalias !118
    %memoryref_data19 = getelementptr inbounds nuw double, ptr %memoryref_data12, i64 %value_phi343, !dbg !106
    %31 = load double, ptr %memoryref_data19, align 8, !dbg !106, !tbaa !112, !alias.scope !139, !noalias !118
; │└
; │┌ @ float.jl:497 within `muladd`
    %32 = fmul contract double %"a::Float64", %30, !dbg !122
    %33 = fadd contract double %32, %31, !dbg !122
; │└
; │┌ @ array.jl:986 within `setindex!`
; ││┌ @ array.jl:991 within `_setindex!`
     store double %33, ptr %memoryref_data19, align 8, !dbg !125, !tbaa !112, !alias.scope !139, !noalias !118
; │└└
; │ @ simdloop.jl:76 within `macro expansion`
; │┌ @ simdloop.jl:54 within `simd_index`
; ││┌ @ int.jl:87 within `+`
     %34 = add nuw nsw i64 %value_phi343, 2, !dbg !101
; │└└
; │ @ simdloop.jl:77 within `macro expansion` @ REPL[7]:3
; │┌ @ essentials.jl:953 within `getindex`
    %memoryref_data8.1 = getelementptr inbounds nuw double, ptr %memoryref_data, i64 %29, !dbg !106
    %35 = load double, ptr %memoryref_data8.1, align 8, !dbg !106, !tbaa !112, !alias.scope !139, !noalias !118
    %memoryref_data19.1 = getelementptr inbounds nuw double, ptr %memoryref_data12, i64 %29, !dbg !106
    %36 = load double, ptr %memoryref_data19.1, align 8, !dbg !106, !tbaa !112, !alias.scope !139, !noalias !118
; │└
; │┌ @ float.jl:497 within `muladd`
    %37 = fmul contract double %"a::Float64", %35, !dbg !122
    %38 = fadd contract double %37, %36, !dbg !122
; │└
; │┌ @ array.jl:986 within `setindex!`
; ││┌ @ array.jl:991 within `_setindex!`
     store double %38, ptr %memoryref_data19.1, align 8, !dbg !125, !tbaa !112, !alias.scope !139, !noalias !118
; │└└
; │ @ simdloop.jl:75 within `macro expansion`
; │┌ @ int.jl:83 within `<`
    %exitcond.not.1 = icmp eq i64 %34, %"x::Array.size.0.copyload", !dbg !140
; │└
   br i1 %exitcond.not.1, label %L103, label %L31, !dbg !100, !llvm.loop !141

L103:                                             ; preds = %L31.prol.loopexit, %L31, %vec.epilog.middle.block, %middle.block, %L24
; │ @ simdloop.jl:76 within `macro expansion`
; │┌ @ simdloop.jl:54 within `simd_index`
; ││┌ @ array.jl:3149 within `getindex`
; │││┌ @ range.jl:942 within `_getindex`
; ││││┌ @ abstractarray.jl:697 within `checkbounds`
       ret void, !dbg !142
; └└└└└
}

; Function Attrs: noinline optnone
define nonnull ptr @"jfptr_axpy!_3604"(ptr %"function::Core.Function", ptr noalias nocapture noundef readonly %"args::Any[]", i32 %"nargs::UInt32") local_unnamed_addr #1 {
top:
  %thread_ptr = call ptr asm "mrs $0, tpidr_el0", "=r"()
  %tls_ppgcstack = getelementptr inbounds i8, ptr %thread_ptr, i64 16
  %tls_pgcstack = load ptr, ptr %tls_ppgcstack, align 8
  %0 = getelementptr inbounds i8, ptr %"args::Any[]", i32 0
  %1 = load ptr, ptr %0, align 8, !tbaa !27, !invariant.load !9, !alias.scope !148, !noalias !149, !nonnull !9, !dereferenceable !150, !align !150
  %2 = getelementptr inbounds i8, ptr %"args::Any[]", i32 8
  %3 = load ptr, ptr %2, align 8, !tbaa !27, !invariant.load !9, !alias.scope !148, !noalias !149, !nonnull !9, !dereferenceable !151, !align !150
  %4 = getelementptr inbounds i8, ptr %"args::Any[]", i32 16
  %5 = load ptr, ptr %4, align 8, !tbaa !27, !invariant.load !9, !alias.scope !148, !noalias !149, !nonnull !9, !dereferenceable !151, !align !150
  %.unbox = load double, ptr %1, align 8, !tbaa !152, !alias.scope !139, !noalias !118
  call swiftcc void @"julia_axpy!_3603"(ptr nonnull swiftself %tls_pgcstack, double %.unbox, ptr %3, ptr %5)
  %jl_nothing = load ptr, ptr @jl_nothing, align 8, !tbaa !27, !invariant.load !9, !alias.scope !148, !noalias !149, !nonnull !9
  ret ptr %jl_nothing
}

; Function Attrs: memory(argmem: readwrite, inaccessiblemem: readwrite)
declare void @julia.safepoint(ptr) local_unnamed_addr #2

; Function Attrs: mustprogress nofree norecurse nosync nounwind speculatable willreturn memory(none)
declare noundef nonnull ptr @julia.gc_loaded(ptr nocapture noundef nonnull readnone, ptr noundef nonnull readnone) local_unnamed_addr #3

; Function Signature: throw_eachindex_mismatch_indices(String, Base.OneTo{Int64}, Base.OneTo{Int64})
; Function Attrs: noreturn
declare swiftcc void @j_throw_eachindex_mismatch_indices_3607(ptr nonnull swiftself, ptr, ptr nocapture readonly, ptr nocapture readonly) local_unnamed_addr #4

; Function Attrs: nounwind willreturn allockind("alloc") allocsize(2) memory(argmem: read, inaccessiblemem: readwrite)
declare noalias nonnull ptr @ijl_gc_small_alloc(ptr, i32, i32, i64) #5

; Function Attrs: memory(argmem: readwrite, inaccessiblemem: readwrite)
declare void @ijl_gc_queue_root(ptr) #2

; Function Attrs: nounwind willreturn allockind("alloc") allocsize(1) memory(argmem: read, inaccessiblemem: readwrite)
declare noalias nonnull ptr @ijl_gc_big_alloc(ptr, i64, i64) #6

; Function Attrs: nounwind willreturn allockind("alloc") allocsize(1) memory(argmem: read, inaccessiblemem: readwrite)
declare noalias nonnull ptr @ijl_gc_alloc_typed(ptr, i64, i64) #6

attributes #0 = { "frame-pointer"="all" "julia.fsig"="axpy!(Float64, Array{Float64, 1}, Array{Float64, 1})" "probe-stack"="inline-asm" }
attributes #1 = { noinline optnone "frame-pointer"="all" "probe-stack"="inline-asm" }
attributes #2 = { memory(argmem: readwrite, inaccessiblemem: readwrite) }
attributes #3 = { mustprogress nofree norecurse nosync nounwind speculatable willreturn memory(none) }
attributes #4 = { noreturn "frame-pointer"="all" "julia.fsig"="throw_eachindex_mismatch_indices(String, Base.OneTo{Int64}, Base.OneTo{Int64})" "probe-stack"="inline-asm" }
attributes #5 = { nounwind willreturn allockind("alloc") allocsize(2) memory(argmem: read, inaccessiblemem: readwrite) }
attributes #6 = { nounwind willreturn allockind("alloc") allocsize(1) memory(argmem: read, inaccessiblemem: readwrite) }
attributes #7 = { noreturn }

!llvm.module.flags = !{!0, !1}
!llvm.dbg.cu = !{!2}

!0 = !{i32 2, !"Dwarf Version", i32 4}
!1 = !{i32 2, !"Debug Info Version", i32 3}
!2 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !3, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: NoDebug, nameTableKind: GNU)
!3 = !DIFile(filename: "julia", directory: ".")
!4 = distinct !DISubprogram(name: "axpy!", linkageName: "julia_axpy!_3603", scope: null, file: !5, line: 1, type: !6, scopeLine: 1, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !17)
!5 = !DIFile(filename: "REPL[7]", directory: ".")
!6 = !DISubroutineType(types: !7)
!7 = !{!8, !10, !11, !12, !12}
!8 = !DICompositeType(tag: DW_TAG_structure_type, name: "Nothing", align: 8, elements: !9, runtimeLang: DW_LANG_Julia, identifier: "281472950670688")
!9 = !{}
!10 = !DICompositeType(tag: DW_TAG_structure_type, name: "#axpy!", align: 8, elements: !9, runtimeLang: DW_LANG_Julia, identifier: "281472803406096")
!11 = !DIBasicType(name: "Float64", size: 64, encoding: DW_ATE_unsigned)
!12 = !DIDerivedType(tag: DW_TAG_typedef, name: "Array", baseType: !13)
!13 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !14, size: 64, align: 64)
!14 = !DICompositeType(tag: DW_TAG_structure_type, name: "jl_value_t", file: !15, line: 71, align: 64, elements: !16)
!15 = !DIFile(filename: "julia.h", directory: "")
!16 = !{!13}
!17 = !{!18, !19, !20, !21}
!18 = !DILocalVariable(name: "#self#", arg: 1, scope: !4, file: !5, line: 1, type: !10)
!19 = !DILocalVariable(name: "a", arg: 2, scope: !4, file: !5, line: 1, type: !11)
!20 = !DILocalVariable(name: "x", arg: 3, scope: !4, file: !5, line: 1, type: !12)
!21 = !DILocalVariable(name: "y", arg: 4, scope: !4, file: !5, line: 1, type: !12)
!22 = !DILocation(line: 1, scope: !4)
!23 = !{!24, !24, i64 0}
!24 = !{!"jtbaa_gcframe", !25, i64 0}
!25 = !{!"jtbaa", !26, i64 0}
!26 = !{!"jtbaa"}
!27 = !{!28, !28, i64 0}
!28 = !{!"jtbaa_const", !25, i64 0}
!29 = !DILocation(line: 194, scope: !30, inlinedAt: !33)
!30 = distinct !DISubprogram(name: "size;", linkageName: "size", scope: !31, file: !31, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!31 = !DIFile(filename: "array.jl", directory: ".")
!32 = !DISubroutineType(types: !9)
!33 = !DILocation(line: 98, scope: !34, inlinedAt: !36)
!34 = distinct !DISubprogram(name: "axes;", linkageName: "axes", scope: !35, file: !35, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!35 = !DIFile(filename: "abstractarray.jl", directory: ".")
!36 = !DILocation(line: 137, scope: !37, inlinedAt: !38)
!37 = distinct !DISubprogram(name: "axes1;", linkageName: "axes1", scope: !35, file: !35, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!38 = !DILocation(line: 391, scope: !39, inlinedAt: !40)
!39 = distinct !DISubprogram(name: "eachindex;", linkageName: "eachindex", scope: !35, file: !35, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!40 = !DILocation(line: 394, scope: !39, inlinedAt: !41)
!41 = !DILocation(line: 384, scope: !39, inlinedAt: !42)
!42 = !DILocation(line: 69, scope: !43, inlinedAt: !45)
!43 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !44, file: !44, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!44 = !DIFile(filename: "simdloop.jl", directory: ".")
!45 = !DILocation(line: 2, scope: !4)
!46 = !{!25, !25, i64 0}
!47 = !{!48, !50}
!48 = !{!"jnoalias_typemd", !49}
!49 = !{!"jnoalias"}
!50 = !{!"jnoalias_stack", !49}
!51 = !{!52, !53, !54}
!52 = !{!"jnoalias_gcframe", !49}
!53 = !{!"jnoalias_data", !49}
!54 = !{!"jnoalias_const", !49}
!55 = !DILocation(line: 486, scope: !56, inlinedAt: !58)
!56 = distinct !DISubprogram(name: "unchecked_oneto;", linkageName: "unchecked_oneto", scope: !57, file: !57, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!57 = !DIFile(filename: "range.jl", directory: ".")
!58 = !DILocation(line: 358, scope: !59, inlinedAt: !33)
!59 = distinct !DISubprogram(name: "map;", linkageName: "map", scope: !60, file: !60, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!60 = !DIFile(filename: "tuple.jl", directory: ".")
!61 = !{!62, !62, i64 0}
!62 = !{!"jtbaa_stack", !25, i64 0}
!63 = !{!50}
!64 = !{!52, !53, !48, !54}
!65 = !DILocation(line: 194, scope: !30, inlinedAt: !66)
!66 = !DILocation(line: 98, scope: !34, inlinedAt: !67)
!67 = !DILocation(line: 137, scope: !37, inlinedAt: !68)
!68 = !DILocation(line: 391, scope: !39, inlinedAt: !69)
!69 = !DILocation(line: 395, scope: !70, inlinedAt: !71)
!70 = distinct !DISubprogram(name: "#eachindex##0;", linkageName: "#eachindex##0", scope: !35, file: !35, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!71 = !DILocation(line: 358, scope: !59, inlinedAt: !72)
!72 = !DILocation(line: 395, scope: !39, inlinedAt: !41)
!73 = !DILocation(line: 486, scope: !56, inlinedAt: !74)
!74 = !DILocation(line: 358, scope: !59, inlinedAt: !66)
!75 = !DILocation(line: 632, scope: !76, inlinedAt: !78)
!76 = distinct !DISubprogram(name: "==;", linkageName: "==", scope: !77, file: !77, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!77 = !DIFile(filename: "promotion.jl", directory: ".")
!78 = !DILocation(line: 1145, scope: !79, inlinedAt: !80)
!79 = distinct !DISubprogram(name: "==;", linkageName: "==", scope: !57, file: !57, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!80 = !DILocation(line: 1195, scope: !81, inlinedAt: !83)
!81 = distinct !DISubprogram(name: "#_#59;", linkageName: "#_#59", scope: !82, file: !82, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!82 = !DIFile(filename: "operators.jl", directory: ".")
!83 = !DILocation(line: 1195, scope: !84, inlinedAt: !85)
!84 = distinct !DISubprogram(name: "Fix;", linkageName: "Fix", scope: !82, file: !82, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!85 = !DILocation(line: 211, scope: !86, inlinedAt: !88)
!86 = distinct !DISubprogram(name: "_all;", linkageName: "_all", scope: !87, file: !87, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!87 = !DIFile(filename: "anyall.jl", directory: ".")
!88 = !DILocation(line: 240, scope: !89, inlinedAt: !90)
!89 = distinct !DISubprogram(name: "all;", linkageName: "all", scope: !87, file: !87, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!90 = !DILocation(line: 396, scope: !39, inlinedAt: !41)
!91 = !DILocation(line: 83, scope: !92, inlinedAt: !94)
!92 = distinct !DISubprogram(name: "<;", linkageName: "<", scope: !93, file: !93, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!93 = !DIFile(filename: "int.jl", directory: ".")
!94 = !DILocation(line: 72, scope: !43, inlinedAt: !45)
!95 = !{!96, !96, i64 0}
!96 = !{!"jtbaa_arrayptr", !97, i64 0}
!97 = !{!"jtbaa_array", !25, i64 0}
!98 = !{!48}
!99 = !{!52, !50, !53, !54}
!100 = !DILocation(line: 75, scope: !43, inlinedAt: !45)
!101 = !DILocation(line: 87, scope: !102, inlinedAt: !103)
!102 = distinct !DISubprogram(name: "+;", linkageName: "+", scope: !93, file: !93, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!103 = !DILocation(line: 54, scope: !104, inlinedAt: !105)
!104 = distinct !DISubprogram(name: "simd_index;", linkageName: "simd_index", scope: !44, file: !44, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!105 = !DILocation(line: 76, scope: !43, inlinedAt: !45)
!106 = !DILocation(line: 953, scope: !107, inlinedAt: !109)
!107 = distinct !DISubprogram(name: "getindex;", linkageName: "getindex", scope: !108, file: !108, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!108 = !DIFile(filename: "essentials.jl", directory: ".")
!109 = !DILocation(line: 3, scope: !110, inlinedAt: !111)
!110 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !5, file: !5, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!111 = !DILocation(line: 77, scope: !43, inlinedAt: !45)
!112 = !{!113, !113, i64 0}
!113 = !{!"jtbaa_arraybuf", !114, i64 0}
!114 = !{!"jtbaa_data", !25, i64 0}
!115 = !{!53, !116}
!116 = distinct !{!116, !117}
!117 = distinct !{!117, !"LVerDomain"}
!118 = !{!52, !50, !48, !54}
!119 = !{!53, !120}
!120 = distinct !{!120, !117}
!121 = !{!52, !50, !48, !54, !116}
!122 = !DILocation(line: 497, scope: !123, inlinedAt: !109)
!123 = distinct !DISubprogram(name: "muladd;", linkageName: "muladd", scope: !124, file: !124, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!124 = !DIFile(filename: "float.jl", directory: ".")
!125 = !DILocation(line: 991, scope: !126, inlinedAt: !127)
!126 = distinct !DISubprogram(name: "_setindex!;", linkageName: "_setindex!", scope: !31, file: !31, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!127 = !DILocation(line: 986, scope: !128, inlinedAt: !109)
!128 = distinct !DISubprogram(name: "setindex!;", linkageName: "setindex!", scope: !31, file: !31, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!129 = distinct !{!129, !130, !131}
!130 = !{!"llvm.loop.isvectorized", i32 1}
!131 = !{!"llvm.loop.unroll.runtime.disable"}
!132 = !{!53, !133}
!133 = distinct !{!133, !134}
!134 = distinct !{!134, !"LVerDomain"}
!135 = !{!53, !136}
!136 = distinct !{!136, !134}
!137 = !{!52, !50, !48, !54, !133}
!138 = distinct !{!138, !130, !131}
!139 = !{!53}
!140 = !DILocation(line: 83, scope: !92, inlinedAt: !100)
!141 = distinct !{!141, !130}
!142 = !DILocation(line: 697, scope: !143, inlinedAt: !144)
!143 = distinct !DISubprogram(name: "checkbounds;", linkageName: "checkbounds", scope: !35, file: !35, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!144 = !DILocation(line: 942, scope: !145, inlinedAt: !146)
!145 = distinct !DISubprogram(name: "_getindex;", linkageName: "_getindex", scope: !57, file: !57, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!146 = !DILocation(line: 3149, scope: !147, inlinedAt: !103)
!147 = distinct !DISubprogram(name: "getindex;", linkageName: "getindex", scope: !31, file: !31, type: !32, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2)
!148 = !{!54}
!149 = !{!52, !50, !53, !48}
!150 = !{i64 8}
!151 = !{i64 24}
!152 = !{!153, !153, i64 0}
!153 = !{!"jtbaa_immut", !154, i64 0}
!154 = !{!"jtbaa_value", !114, i64 0}

Still no vscale.
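One way to check a dump like the one above mechanically is to look for scalable vector types, which LLVM prints as `<vscale x N x T>`, rather than fixed-width ones such as `<2 x double>`. Below is a minimal sketch using plain string search; the helper names are hypothetical. It searches for the type spelling `<vscale x ` on purpose, so that a `vscale_range(...)` attribute alone does not count as a hit.

```cpp
#include <cstddef>
#include <string_view>

// Count non-overlapping occurrences of `needle` in `text`.
inline std::size_t count_occurrences(std::string_view text, std::string_view needle) {
    if (needle.empty()) return 0;
    std::size_t count = 0;
    for (std::size_t pos = text.find(needle);
         pos != std::string_view::npos;
         pos = text.find(needle, pos + needle.size())) {
        ++count;
    }
    return count;
}

// Hypothetical helper: true if the textual IR mentions any scalable vector
// type. Attribute text such as vscale_range(1,16) does not match.
inline bool uses_scalable_vectors(std::string_view ir) {
    return count_occurrences(ir, "<vscale x ") > 0;
}
```

Applied to the vector body above, which only contains `<2 x double>` operations, such a check would report no scalable vectors.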

@gbaraldi
Member Author

The attribute didn't get added to the function :\

SmallVector<StringRef, 128> Features;
FS.split(Features, ',');
for (StringRef Feature : Features) {
    if (Feature == "sve")
Contributor

Suggested change
- if (Feature == "sve")
+ if (Feature == "+sve")

With this change, the function gets the vscale_range attribute:

; Function Attrs: vscale_range(1,16)
define swiftcc void @"julia_axpy!_763"(ptr nonnull swiftself %pgcstack_arg, double %"a::Float64", ptr noundef nonnull align 8 dereferenceable(24) %"x::Array", ptr noundef nonnull align 8 dereferenceable(24) %"y::Array") local_unnamed_addr #0 !dbg !4 {
; [...]
attributes #0 = { vscale_range(1,16) "frame-pointer"="all" "julia.fsig"="axpy!(Float64, Array{Float64, 1}, Array{Float64, 1})" "probe-stack"="inline-asm" }
attributes #1 = { noinline optnone vscale_range(1,16) "frame-pointer"="all" "probe-stack"="inline-asm" }
attributes #2 = { memory(argmem: readwrite, inaccessiblemem: readwrite) vscale_range(1,16) }
attributes #3 = { mustprogress nofree norecurse nosync nounwind speculatable willreturn memory(none) vscale_range(1,16) }
attributes #4 = { noreturn vscale_range(1,16) "frame-pointer"="all" "julia.fsig"="throw_eachindex_mismatch_indices(String, Base.OneTo{Int64}, Base.OneTo{Int64})" "probe-stack"="inline-asm" }
attributes #5 = { nounwind willreturn allockind("alloc") allocsize(2) memory(argmem: read, inaccessiblemem: readwrite) }
attributes #6 = { memory(argmem: readwrite, inaccessiblemem: readwrite) }
attributes #7 = { nounwind willreturn allockind("alloc") allocsize(1) memory(argmem: read, inaccessiblemem: readwrite) }
attributes #8 = { noreturn }

but there's still no vscale in the vector body.
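To illustrate why the original comparison never matched: LLVM target feature strings are comma-separated tokens that each carry a `+` or `-` prefix, so after splitting on `,` the token is `+sve`, not `sve`. Here is a standalone sketch of that matching logic using `std::string_view` in place of `StringRef`. The function names and the sample feature strings are assumptions for illustration, not the code in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Split a comma-separated feature string into its raw tokens
// (simplified stand-in for FS.split(Features, ',')).
inline std::vector<std::string_view> split_features(std::string_view fs) {
    std::vector<std::string_view> tokens;
    while (true) {
        const std::size_t comma = fs.find(',');
        tokens.push_back(fs.substr(0, comma));
        if (comma == std::string_view::npos) break;
        fs.remove_prefix(comma + 1);
    }
    return tokens;
}

// Exact token comparison, as in `Feature == "sve"` or `Feature == "+sve"`.
inline bool has_exact_token(std::string_view fs, std::string_view token) {
    for (std::string_view t : split_features(fs)) {
        if (t == token) return true;
    }
    return false;
}

// Hypothetical alternative: match an enabled feature by name, i.e. "+<name>".
inline bool has_enabled_feature(std::string_view fs, std::string_view name) {
    for (std::string_view t : split_features(fs)) {
        if (t.size() == name.size() + 1 && t.front() == '+' && t.substr(1) == name)
            return true;
    }
    return false;
}
```

With an illustrative feature string such as `+neon,+sve`, `has_exact_token(fs, "sve")` is false while `has_exact_token(fs, "+sve")` is true, which matches the behaviour seen before and after the suggested change.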
