@yonghong-song yonghong-song commented Oct 27, 2025

Add a new pass, EmitChangedFuncDebugInfo, which emits additional
DWARF for functions whose signatures are changed during compiler
transformations.

The original motivation is BPF-based Linux kernel tracing.
Function signatures are available in the vmlinux BTF that pahole
generates from DWARF, and that DWARF describes signatures
at the source level. This is not ideal, because the compiler
may change the signature of some functions. A user who relies
on the source-level signature of such a function may get
incorrect results and may need extra effort to work around the issue.

So we want to encode the true signature (which may differ
from the source-level one) in DWARF. With this additional information,
DWARF consumers can identify the functions whose signatures changed.
For example, pahole can then process these signature-changed
functions and encode them into vmlinux BTF properly.

History of multiple attempts

I have made a few earlier attempts ([1], [2] and [3]).
Initially I tried to modify debuginfo in passes like
ArgPromotion and DeadArgElim, but it was later suggested
to have a central place to handle new signatures ([1]).

Later, I had another version of the patch similar to this
one, but the recommendation was to modify debuginfo to
encode the new signature within the same function,
either through inlinedAt or by overwriting the old
signature with the new one. This seemed to work, but it had a
side effect on lldb: some lldb output (e.g. backtraces)
differed from before. The recommendation
was to avoid any behavior change for lldb ([2] and [3]).

So now I have come back to the solution discussed at the
end of [1]. Basically, a special DWARF entry is generated
to encode the new signature. The new signature has
a reference to the old source-level signature,
so tools can inspect the DWARF to retrieve the related
info.

Examples and dwarf output

Below, a few examples show how changed signatures are
represented in DWARF:

Example 1

Source:

  $ cat test.c
  struct t { int a; };
  char *tar(struct t *a, struct t *d);
  __attribute__((noinline)) static char * foo(struct t *a, int b, struct t *d)
  {
    return tar(a, d);
  }
  char *bar(struct t *a, struct t *d)
  {
    return foo(a, 1, d);
  }

Compile and dump the DWARF with:

  $ clang -O2 -c -g test.c -mllvm -enable-changed-func-dbinfo
  $ llvm-dwarfdump test.o
  0x0000000c: DW_TAG_compile_unit
                ...
  0x0000005c:   DW_TAG_subprogram
                  DW_AT_low_pc    (0x0000000000000010)
                  DW_AT_high_pc   (0x0000000000000015)
                  DW_AT_frame_base        (DW_OP_reg7 RSP)
                  DW_AT_call_all_calls    (true)
                  DW_AT_name      ("foo")
                  DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line (3)
                  DW_AT_prototyped        (true)
                  DW_AT_calling_convention        (DW_CC_nocall)
                  DW_AT_type      (0x000000b1 "char *")

  0x0000006c:     DW_TAG_formal_parameter
                    DW_AT_location        (DW_OP_reg5 RDI)
                    DW_AT_name    ("a")
                    DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                    DW_AT_decl_line       (3)
                    DW_AT_type    (0x000000ba "t *")

  0x00000076:     DW_TAG_formal_parameter
                    DW_AT_name    ("b")
                    DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                    DW_AT_decl_line       (3)
                    DW_AT_type    (0x000000ce "int")

  0x0000007e:     DW_TAG_formal_parameter
                    DW_AT_location        (DW_OP_reg4 RSI)
                    DW_AT_name    ("d")
                    DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                    DW_AT_decl_line       (3)
                    DW_AT_type    (0x000000ba "t *")

  0x00000088:     DW_TAG_call_site
                    ...

  0x0000009d:     NULL
                  ...
  0x000000d2:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("foo")
                  DW_AT_type      (0x000000b1 "char *")
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x0000005c "foo")

  0x000000dc:     DW_TAG_formal_parameter
                    DW_AT_name    ("a")
                    DW_AT_type    (0x000000ba "t *")

  0x000000e2:     DW_TAG_formal_parameter
                    DW_AT_name    ("d")
                    DW_AT_type    (0x000000ba "t *")

  0x000000e8:     NULL

In the above, the DW_TAG_subprogram for 'foo' has the original signature, but
since parameter 'b' does not have a DW_AT_location, it is clear that the
parameter is not used. The actual function signature is represented
in the DW_TAG_inlined_subroutine.
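
At the C level, the change in Example 1 can be pictured roughly as follows. This is only an illustrative sketch: foo_src, foo_opt and the stub body of tar are hypothetical, and the compiler performs the rewrite on IR, not on source.

```c
#include <assert.h>

struct t { int a; };

/* Stub standing in for the external tar() of Example 1 (assumption). */
static char *tar(struct t *a, struct t *d) {
  static char buf[2];
  assert(a && d);
  buf[0] = (char)(a->a + d->a);
  return buf;
}

/* Source-level signature: 'b' is never read. */
static char *foo_src(struct t *a, int b, struct t *d) {
  (void)b;
  return tar(a, d);
}

/* Shape after dead-argument elimination: 'b' is gone, matching the two
 * DW_TAG_formal_parameter entries under DW_TAG_inlined_subroutine. */
static char *foo_opt(struct t *a, struct t *d) {
  return tar(a, d);
}
```

Both versions forward the same two pointers to tar, which is why dropping 'b' does not change behavior.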

For this case, DW_TAG_inlined_subroutine may not look
necessary. The examples below show cases where it is.

Example 2

Source:

  $ cat test.c
  struct t { long a; long b;};
  __attribute__((noinline)) static long foo(struct t arg) {
    return arg.b * 5;
  }
  long bar(struct t arg) {
    return foo(arg);
  }

Compile and dump the DWARF with:

  $ clang -O2 -c -g test.c -mllvm -enable-changed-func-dbinfo
  $ llvm-dwarfdump test.o
  ...
  0x0000004e:   DW_TAG_subprogram
                  DW_AT_low_pc    (0x0000000000000010)
                  DW_AT_high_pc   (0x0000000000000015)
                  DW_AT_frame_base        (DW_OP_reg7 RSP)
                  DW_AT_call_all_calls    (true)
                  DW_AT_name      ("foo")
                  DW_AT_decl_file ("/home/yhs/tests/sig-change/struct/test.c")
                  DW_AT_decl_line (2)
                  DW_AT_prototyped        (true)
                  DW_AT_calling_convention        (DW_CC_nocall)
                  DW_AT_type      (0x0000006d "long")

  0x0000005e:     DW_TAG_formal_parameter
                    DW_AT_location        (DW_OP_piece 0x8, DW_OP_reg5 RDI, DW_OP_piece 0x8)
                    DW_AT_name    ("arg")
                    DW_AT_decl_file       ("/home/yhs/tests/sig-change/struct/test.c")
                    DW_AT_decl_line       (2)
                    DW_AT_type    (0x00000099 "t")

  0x0000006c:     NULL
  ...
  0x00000088:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("foo")
                  DW_AT_type      (0x0000006d "long")
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x0000004e "foo")

  0x00000092:     DW_TAG_formal_parameter
                    DW_AT_name    ("arg")
                    DW_AT_type    (0x0000006d "long")

  0x00000098:     NULL

In the above case, the original argument of foo() is 'struct t',
but the actual argument is a 'long'. DW_TAG_inlined_subroutine
represents the signature type directly, instead of relying on a
DW_AT_location expression.

One remaining problem is that it is not clear which part of the original
parameter the formal parameter 'arg' corresponds to. If necessary, the compiler
could rename 'arg' to e.g. 'arg_offset_8' to indicate that it is at an 8-byte
offset in the original struct.
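
A rough C-level picture of Example 2, assuming an LP64 target such as x86_64. The names foo_src and foo_opt are hypothetical, and 'arg_offset_8' is only the naming idea floated above, not what the patch emits today.

```c
#include <assert.h>
#include <stddef.h>

struct t { long a; long b; };

/* Source-level function from Example 2. */
static long foo_src(struct t arg) { return arg.b * 5; }

/* Shape after optimization: only member 'b' is passed, as a plain long. */
static long foo_opt(long arg_offset_8) { return arg_offset_8 * 5; }

/* What the caller effectively does after the transformation. */
static long bar(struct t arg) {
  /* 'b' directly follows 'a', so its offset is sizeof(long),
   * which is 8 on LP64 targets. */
  assert(offsetof(struct t, b) == sizeof(long));
  return foo_opt(arg.b);
}
```

The offset check is what a name like 'arg_offset_8' would communicate to a DWARF consumer.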

Example 3

Source:

  $ cat test2.c
  struct t { long a; long b; long c;};
  __attribute__((noinline)) long foo(struct t arg) {
    return arg.a * arg.c;
  }
  long bar(struct t arg) {
    return foo(arg);
  }

Compile and dump the DWARF with:

  $ clang -O2 -c -g test2.c -mllvm -enable-changed-func-dbinfo
  $ llvm-dwarfdump test2.o
  ...
  0x0000003e:   DW_TAG_subprogram
                  DW_AT_low_pc    (0x0000000000000010)
                  DW_AT_high_pc   (0x0000000000000015)
                  DW_AT_frame_base        (DW_OP_reg7 RSP)
                  DW_AT_call_all_calls    (true)
                  DW_AT_name      ("bar")
                  DW_AT_decl_file ("/home/yhs/tests/sig-change/struct/test2.c")
                  DW_AT_decl_line (5)
                  DW_AT_prototyped        (true)
                  DW_AT_type      (0x0000005f "long")
                  DW_AT_external  (true)

  0x0000004d:     DW_TAG_formal_parameter
                    DW_AT_location        (DW_OP_fbreg +8)
                    DW_AT_name    ("arg")
                    DW_AT_decl_file       ("/home/yhs/tests/sig-change/struct/test2.c")
                    DW_AT_decl_line       (5)
                    DW_AT_type    (0x00000079 "t")

  0x00000058:     DW_TAG_call_site
                    DW_AT_call_origin     (0x00000023 "foo")
                    DW_AT_call_tail_call  (true)
                    DW_AT_call_pc (0x0000000000000010)

  0x0000005e:     NULL
                ...
  0x00000063:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("foo")
                  DW_AT_type      (0x0000005f "long")
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x00000023 "foo")

  0x0000006d:     DW_TAG_formal_parameter
                    DW_AT_name    ("arg")
                    DW_AT_type    (0x00000074 "t *")

  0x00000073:     NULL

In the above example, it is not clear from DW_TAG_subprogram what
type the parameter actually has, but DW_TAG_inlined_subroutine
shows it clearly. Again, the name could be changed
to e.g. 'arg_ptr' if desired.
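
Example 3 can be sketched in C the same way. The DWARF above reports "t *" for the parameter, so the sketch models the optimized function as taking a pointer. foo_src, foo_opt and the 'arg_ptr' name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct t { long a; long b; long c; };

/* Source-level function from Example 3: the struct is passed by value. */
static long foo_src(struct t arg) { return arg.a * arg.c; }

/* Shape reported by DW_TAG_inlined_subroutine: the parameter is "t *". */
static long foo_opt(const struct t *arg_ptr) {
  assert(arg_ptr != NULL);
  return arg_ptr->a * arg_ptr->c;
}
```

A tracer that assumed a by-value 'struct t' here would read the pointer value as if it were member 'a'.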

Example 4

Source:

  $ cat test.c
  __attribute__((noinline)) static int callee(const int *p) { return *p + 42; }
  int caller(void) {
    int x = 100;
    return callee(&x);
  }

Compile and dump the DWARF with:

  $ clang -O3 -c -g test.c -mllvm -enable-changed-func-dbinfo
  $ llvm-dwarfdump test.o
  ...
  0x0000004a:   DW_TAG_subprogram
                  DW_AT_low_pc    (0x0000000000000010)
                  DW_AT_high_pc   (0x0000000000000014)
                  DW_AT_frame_base        (DW_OP_reg7 RSP)
                  DW_AT_call_all_calls    (true)
                  DW_AT_name      ("callee")
                  DW_AT_decl_file ("/home/yhs/tests/sig-change/prom/test.c")
                  DW_AT_decl_line (1)
                  DW_AT_prototyped        (true)
                  DW_AT_calling_convention        (DW_CC_nocall)
                  DW_AT_type      (0x00000063 "int")

  0x0000005a:     DW_TAG_formal_parameter
                    DW_AT_name    ("p")
                    DW_AT_decl_file       ("/home/yhs/tests/sig-change/prom/test.c")
                    DW_AT_decl_line       (1)
                    DW_AT_type    (0x00000078 "const int *")

  0x00000062:     NULL
                ...
  0x00000067:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("callee")
                  DW_AT_type      (0x00000063 "int")
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x0000004a "callee")

  0x00000071:     DW_TAG_formal_parameter
                    DW_AT_name    ("__0")
                    DW_AT_type    (0x00000063 "int")

  0x00000077:     NULL

In the above, the function

  static int callee(const int *p) { return *p + 42; }

is transformed to

  static int callee(int p) { return p + 42; }

But the new signature is not reflected in DW_TAG_subprogram.
The DW_TAG_inlined_subroutine captures it precisely.
Note that the parameter name is "__0", where "0" denotes
the first argument. This comes from the following IR:

  define internal ... i32 @callee(i32 %0) unnamed_addr #1 !dbg !23 {
      #dbg_value(ptr poison, !29, !DIExpression(), !30)
    %2 = add nsw i32 %0, 42, !dbg !31
    ret i32 %2, !dbg !32
  }
  ...
  !29 = !DILocalVariable(name: "p", arg: 1, scope: !23, file: !1, line: 1, type: !26)

The name cannot be taken from the debug info for 'p', because
'ptr poison' means the debug value should not be used any more,
and the IR argument itself is unnamed ('%0'). This is also why
the DW_TAG_subprogram above has no location information for 'p'.
DW_TAG_inlined_subroutine still provides the correct signature.

If we compile like below:

  clang -O3 -c -g test.c -fno-discard-value-names -mllvm -enable-changed-func-dbinfo

The function argument name will be preserved

  ... i32 @callee(i32 %p.0.val) ...

and in such cases,
the DW_TAG_inlined_subroutine looks like below:

  0x00000067:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("callee")
                  DW_AT_type      (0x00000063 "int")
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x0000004a "callee")

  0x00000071:     DW_TAG_formal_parameter
                    DW_AT_name    ("p__0__val")
                    DW_AT_type    (0x00000063 "int")

  0x00000077:     NULL

Note that '.' in the IR argument name is replaced with "__"
so that the argument name is a valid C identifier.
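
The naming rule described above can be sketched as a small C helper. This is not the code from the patch: ir_arg_name_to_c is a hypothetical function that only mirrors the two cases shown, an unnamed argument becoming "__<index>" and '.' being replaced with "__".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turn an IR argument name into a C identifier.
 * Returns 0 on success, -1 if 'dst' is too small. */
static int ir_arg_name_to_c(const char *ir_name, unsigned arg_index,
                            char *dst, size_t dst_size) {
  size_t out = 0;
  const char *p;

  assert(ir_name && dst);
  if (ir_name[0] == '\0') {
    /* Unnamed IR argument: fall back to "__<arg_index>". */
    int n = snprintf(dst, dst_size, "__%u", arg_index);
    return (n < 0 || (size_t)n >= dst_size) ? -1 : 0;
  }
  for (p = ir_name; *p; ++p) {
    if (*p == '.') {
      /* Need room for two underscores plus the final NUL. */
      if (out + 2 >= dst_size) return -1;
      dst[out++] = '_';
      dst[out++] = '_';
    } else {
      /* Need room for one character plus the final NUL. */
      if (out + 1 >= dst_size) return -1;
      dst[out++] = *p;
    }
  }
  dst[out] = '\0';
  return 0;
}
```

With the IR name "p.0.val" from the example, the helper produces "p__0__val"; with an empty name and index 0 it produces "__0".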

Non-LTO vs. LTO

In ThinLTO mode, we often see kernel symbols like

  p9_req_cache.llvm.13472271643223911678

If this symbol has the same signature as the source-level p9_req_cache,
then no special DW_TAG_inlined_subroutine is generated.

But if a symbol "<foo>.llvm.<hash>" has a different signature
than the source-level "<foo>", then a special DW_TAG_inlined_subroutine
is generated, like below:

  0x10f0793f:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("flow_offload_fill_route")
                  DW_AT_linkage_name      ("flow_offload_fill_route.llvm.14555965973926298225")
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x10ee9e54 "flow_offload_fill_route")

  0x10f07949:     DW_TAG_formal_parameter
                    DW_AT_name    ("flow")
                    DW_AT_type    (0x10ee837a "flow_offload *")

  0x10f07951:     DW_TAG_formal_parameter
                    DW_AT_name    ("route")
                    DW_AT_type    (0x10eea4ef "nf_flow_route *")

  0x10f07959:     DW_TAG_formal_parameter
                    DW_AT_name    ("dir")
                    DW_AT_type    (0x10ecef15 "int")

  0x10f07961:     NULL

In the above, function "flow_offload_fill_route" has return type
"int" at the source level, but optimization eventually changed the return
type to "void", so the entry has no DW_AT_type.

Note that one source symbol may have multiple linkage
names due to (possibly repeated) cloning in LLVM. In such
cases, there may be multiple DW_TAG_inlined_subroutine instances.
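
A consumer that wants to map such a linkage name back to the source-level name could strip the ".llvm.<hash>" suffix. The helper below is a hypothetical sketch of that idea, not part of this patch, pahole, or libbpf.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: length of the source-level name inside a linkage
 * name, i.e. everything before the first ".llvm." (or the whole string
 * if there is no such suffix). */
static size_t source_name_len(const char *linkage_name) {
  const char *suffix;

  assert(linkage_name);
  suffix = strstr(linkage_name, ".llvm.");
  return suffix ? (size_t)(suffix - linkage_name) : strlen(linkage_name);
}
```

In practice a tool would more likely follow DW_AT_specification or read DW_AT_name directly, since both are present in the entry shown above; the string-based approach only helps when nothing but the symbol name is available.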

Some restrictions

There are some restrictions in the current implementation:

  • Only the C language is supported.
  • The BPF target is excluded, as one of the main goals of this pull request
    is to generate proper vmlinux BTF for architectures like x86_64 and arm64.
  • The function must not be an intrinsic, must not be declaration-only,
    must not return a value larger than the arch register size, and must
    not take variable arguments.
  • For arguments, only int/ptr types are supported.
  • Some union type arguments (e.g., 8B < union_size <= 16B) may
    have DIType issues, so some functions may be skipped.

Some statistics with linux kernel

I have tested this patch set by building the latest bpf-next Linux kernel.
For the non-LTO case:

  65341 original number of functions
  1054  signature changed functions with this patch

For thin-lto case:

  65595 original number of functions
  3150  signature changed functions with this patch
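
As a quick sanity check on these counts, the share of signature-changed functions can be computed with a trivial helper (changed_pct is a hypothetical name). With the numbers above it gives roughly 1.6% for the non-LTO build and roughly 4.8% for the ThinLTO build.

```c
#include <assert.h>

/* Share of functions with a changed signature, in percent. */
static double changed_pct(unsigned changed, unsigned total) {
  assert(total != 0);
  return 100.0 * (double)changed / (double)total;
}
```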

Next step

With this LLVM change, we will be able to do the follow-up work in pahole and libbpf.
For pahole, we currently see this warning:

  die__process_unit: DW_TAG_inlined_subroutine (0x1d) @ <0xf2db986> not handled in a c11 CU!

Basically, these DW_TAG_inlined_subroutine entries are not nested inside a
DW_TAG_subprogram, which pahole does not handle yet.

[1] #127855
[2] #157349
[3] https://discourse.llvm.org/t/rfc-identify-func-signature-change-in-llvm-compiled-kernel-image/82609

llvmbot commented Oct 27, 2025

@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-llvm-transforms

Author: None (yonghong-song)

Changes

Add a new pass EmitChangedFuncDebugInfo which will add dwarf for
additional functions whose signatures are changed during compiler
transformations.

The original intention is for bpf-based linux kernel tracing.
The function signature is available in vmlinux BTF generated
from pahole/dwarf. Such signature is generated from dwarf
at the source level. But this is not ideal since some function
may have signatures changed. If user still used the source
level signature, users may not get correct results and may
need some efforts to workaround the issue.

So we want to encode the true signature (which may differ
from the source-level one) in dwarf. With such additional information,
dwarf users can get these signature changed functions.
For example, pahole is able to process these signature
changed functions and encode them into vmlinux BTF properly.

History of multiple attempts

Previously I have attempted a few tries ([1], [2] and [3]).
Initially I tried to modify debuginfo in passes like
ArgPromotion and DeadArgElim, but later on it is suggested
to have a central place to handle new signatures ([1]).

Later, I have another version of patch similar to this
one, but the recommendation is to modify debuginfo to
encode new signature within the same function,
either through inlinedAt or new signature overwriting
the old one. This seems working but it has some
side effect on lldb, some lldb output (e.g. back trace)
will be different from the previous one. The recommendation
is to avoid any behavior change for lldb ([2] and [3]).

So now, I came back to the solution discussed at the
end of [1]. Basically a special dwarf entry will be generated
to encode the new signature. The new signature will have
a reference to the old source-level signature.
So the tool can inspect dwarf to retrieve the related
info.

Examples and dwarf output

In below, a few examples will show how changed signatures
represented in dwarf:

Example 1

Source:

  $ cat test.c
  struct t { int a; };
  char *tar(struct t *a, struct t *d);
  __attribute__((noinline)) static char * foo(struct t *a, int b, struct t *d)
  {
    return tar(a, d);
  }
  char *bar(struct t *a, struct t *d)
  {
    return foo(a, 1, d);
  }

Compiled and dump dwarf with:

  $ clang -O2 -c -g test.c
  $ llvm-dwarfdump test.o
  0x0000000c: DW_TAG_compile_unit
                ...
  0x0000005c:   DW_TAG_subprogram
                  DW_AT_low_pc    (0x0000000000000010)
                  DW_AT_high_pc   (0x0000000000000015)
                  DW_AT_frame_base        (DW_OP_reg7 RSP)
                  DW_AT_call_all_calls    (true)
                  DW_AT_name      ("foo")
                  DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line (3)
                  DW_AT_prototyped        (true)
                  DW_AT_calling_convention        (DW_CC_nocall)
                  DW_AT_type      (0x000000b1 "char *")

  0x0000006c:     DW_TAG_formal_parameter
                    DW_AT_location        (DW_OP_reg5 RDI)
                    DW_AT_name    ("a")
                    DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                    DW_AT_decl_line       (3)
                    DW_AT_type    (0x000000ba "t *")

  0x00000076:     DW_TAG_formal_parameter
                    DW_AT_name    ("b")
                    DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                    DW_AT_decl_line       (3)
                    DW_AT_type    (0x000000ce "int")

  0x0000007e:     DW_TAG_formal_parameter
                    DW_AT_location        (DW_OP_reg4 RSI)
                    DW_AT_name    ("d")
                    DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                    DW_AT_decl_line       (3)
                    DW_AT_type    (0x000000ba "t *")

  0x00000088:     DW_TAG_call_site
                    ...

  0x0000009d:     NULL
                  ...
  0x000000d2:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("foo")
                  DW_AT_type      (0x000000b1 "char *")
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x0000005c "foo")

  0x000000dc:     DW_TAG_formal_parameter
                    DW_AT_name    ("a")
                    DW_AT_type    (0x000000ba "t *")

  0x000000e2:     DW_TAG_formal_parameter
                    DW_AT_name    ("d")
                    DW_AT_type    (0x000000ba "t *")

  0x000000e8:     NULL

In the above, the DISubprogram 'foo' has the original signature but
since parameter 'b' does not have DW_AT_location, it is clear that
parameter will not be used. The actual function signature is represented
in DW_TAG_inlined_subroutine.

For the above case, it looks like DW_TAG_inlined_subroutine is not
necessary. Let us try a few other examples below.

Example 2

Source:

  $ cat test.c
  struct t { long a; long b;};
  __attribute__((noinline)) static long foo(struct t arg) {
    return arg.b * 5;
  }
  long bar(struct t arg) {
    return foo(arg);
  }

Compiled and dump dwarf with:

  $ clang -O2 -c -g test.c
  $ llvm-dwarfdump test.o
  ...
  0x0000004e:   DW_TAG_subprogram
                  DW_AT_low_pc    (0x0000000000000010)
                  DW_AT_high_pc   (0x0000000000000015)
                  DW_AT_frame_base        (DW_OP_reg7 RSP)
                  DW_AT_call_all_calls    (true)
                  DW_AT_name      ("foo")
                  DW_AT_decl_file ("/home/yhs/tests/sig-change/struct/test.c")
                  DW_AT_decl_line (2)
                  DW_AT_prototyped        (true)
                  DW_AT_calling_convention        (DW_CC_nocall)
                  DW_AT_type      (0x0000006d "long")

  0x0000005e:     DW_TAG_formal_parameter
                    DW_AT_location        (DW_OP_piece 0x8, DW_OP_reg5 RDI, DW_OP_piece 0x8)
                    DW_AT_name    ("arg")
                    DW_AT_decl_file       ("/home/yhs/tests/sig-change/struct/test.c")
                    DW_AT_decl_line       (2)
                    DW_AT_type    (0x00000099 "t")

  0x0000006c:     NULL
  ...
  0x00000088:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("foo")
                  DW_AT_type      (0x0000006d "long")
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x0000004e "foo")

  0x00000092:     DW_TAG_formal_parameter
                    DW_AT_name    ("arg")
                    DW_AT_type    (0x0000006d "long")

  0x00000098:     NULL

In the above case for function foo(), the original argument is 'struct t',
but the final actual argument is a 'long' type. DW_TAG_inlined_subroutine
can clearly represent the signature type instead of doing DW_AT_location
thing.

There is a problem in the above then, it is not clear what formal parameter
'arg' corresponds to the original parameter. If necessary, the compiler
could change 'arg' to e.g. 'arg_offset_8' to indicate it is 8 byte offset from
the original struct.

Example 3

Source:

  $ cat test2.c
  struct t { long a; long b; long c;};
  __attribute__((noinline)) long foo(struct t arg) {
    return arg.a * arg.c;
  }
  long bar(struct t arg) {
    return foo(arg);
  }

Compiled and dump dwarf with:

  $ clang -O2 -c -g test2.c
  $ llvm-dwarfdump test2.o
  ...
  0x0000003e:   DW_TAG_subprogram
                  DW_AT_low_pc    (0x0000000000000010)
                  DW_AT_high_pc   (0x0000000000000015)
                  DW_AT_frame_base        (DW_OP_reg7 RSP)
                  DW_AT_call_all_calls    (true)
                  DW_AT_name      ("bar")
                  DW_AT_decl_file ("/home/yhs/tests/sig-change/struct/test2.c")
                  DW_AT_decl_line (5)
                  DW_AT_prototyped        (true)
                  DW_AT_type      (0x0000005f "long")
                  DW_AT_external  (true)

  0x0000004d:     DW_TAG_formal_parameter
                    DW_AT_location        (DW_OP_fbreg +8)
                    DW_AT_name    ("arg")
                    DW_AT_decl_file       ("/home/yhs/tests/sig-change/struct/test2.c")
                    DW_AT_decl_line       (5)
                    DW_AT_type    (0x00000079 "t")

  0x00000058:     DW_TAG_call_site
                    DW_AT_call_origin     (0x00000023 "foo")
                    DW_AT_call_tail_call  (true)
                    DW_AT_call_pc (0x0000000000000010)

  0x0000005e:     NULL
                ...
  0x00000063:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("foo")
                  DW_AT_type      (0x0000005f "long")
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x00000023 "foo")

  0x0000006d:     DW_TAG_formal_parameter
                    DW_AT_name    ("arg")
                    DW_AT_type    (0x00000074 "t *")

  0x00000073:     NULL

In the above example, from DW_TAG_subprogram, it is not clear what kind
of type the parameter should be. But DW_TAG_inlined_subroutine can
clearly show what the type should be. Again, the name can be changed
e.g. 'arg_ptr' if desired.

Example 4

Source:

  $ cat test.c
  __attribute__((noinline)) static int callee(const int *p) { return *p + 42; }
  int caller(void) {
    int x = 100;
    return callee(&x);
  }

Compiled and dump dwarf with:

  $ clang -O3 -c -g test.c
  $ llvm-dwarfdump test.o
  ...
  0x0000004a:   DW_TAG_subprogram
                  DW_AT_low_pc    (0x0000000000000010)
                  DW_AT_high_pc   (0x0000000000000014)
                  DW_AT_frame_base        (DW_OP_reg7 RSP)
                  DW_AT_call_all_calls    (true)
                  DW_AT_name      ("callee")
                  DW_AT_decl_file ("/home/yhs/tests/sig-change/prom/test.c")
                  DW_AT_decl_line (1)
                  DW_AT_prototyped        (true)
                  DW_AT_calling_convention        (DW_CC_nocall)
                  DW_AT_type      (0x00000063 "int")

  0x0000005a:     DW_TAG_formal_parameter
                    DW_AT_name    ("p")
                    DW_AT_decl_file       ("/home/yhs/tests/sig-change/prom/test.c")
                    DW_AT_decl_line       (1)
                    DW_AT_type    (0x00000078 "const int *")

  0x00000062:     NULL
                ...
  0x00000067:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("callee")
                  DW_AT_type      (0x00000063 "int")
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x0000004a "callee")

  0x00000071:     DW_TAG_formal_parameter
                    DW_AT_name    ("__0")
                    DW_AT_type    (0x00000063 "int")

  0x00000077:     NULL

In the above, the function

  static int callee(const int *p) { return *p + 42; }

is transformed to

  static int callee(int p) { return p + 42; }

But the new signature is not reflected in DW_TAG_subprogram.
The DW_TAG_inlined_subroutine can precisely capture the
signature. Note that the parameter name is "__0" and "0" means
the first argument. The reason is due to the following IR:

  define internal ... i32 @callee(i32 %0) unnamed_addr #1 !dbg !23 {
      #dbg_value(ptr poison, !29, !DIExpression(), !30)
    %2 = add nsw i32 %0, 42, !dbg !31
    ret i32 %2, !dbg !32
  }
  ...
  !29 = !DILocalVariable(name: "p", arg: 1, scope: !23, file: !1, line: 1, type: !26)

The reason is due to 'ptr poison' as 'ptr poison' mean the debug
value should not be used any more. This is also the reason that
the above DW_TAG_subprogram does not have location information.
DW_TAG_inlined_subroutine can provide correct signature though.

If we compile like below:

  clang -O3 -c -g test.c -fno-discard-value-names

The function argument name will be preserved

  ... i32 @callee(i32 %p.0.val) ...

and in such cases,
the DW_TAG_inlined_subroutine looks like below:

  0x00000067:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("callee")
                  DW_AT_type      (0x00000063 "int")
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x0000004a "callee")

  0x00000071:     DW_TAG_formal_parameter
                    DW_AT_name    ("p__0__val")
                    DW_AT_type    (0x00000063 "int")

  0x00000077:     NULL

Note that the original argument name replaces '.' with "__"
so argument name has proper C standard.

Based on a run on the Linux kernel, names like "__<arg_index>" account for
roughly 2% of all signature-changed functions, so this is probably
okay for now.

Non-LTO vs. LTO

For thin-lto mode, we often see kernel symbols like

  p9_req_cache.llvm.13472271643223911678

If this symbol has identical source level signature with p9_req_cache,
then a special DW_TAG_inlined_subroutine will not be generated.

But if a symbol with "<foo>.llvm.<hash>" has different signatures
than the source level "<foo>", then a special DW_TAG_inlined_subroutine
will be generated like below:

  0x10f0793f:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("flow_offload_fill_route")
                  DW_AT_linkage_name      ("flow_offload_fill_route.llvm.14555965973926298225")
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x10ee9e54 "flow_offload_fill_route")

  0x10f07949:     DW_TAG_formal_parameter
                    DW_AT_name    ("flow")
                    DW_AT_type    (0x10ee837a "flow_offload *")

  0x10f07951:     DW_TAG_formal_parameter
                    DW_AT_name    ("route")
                    DW_AT_type    (0x10eea4ef "nf_flow_route *")

  0x10f07959:     DW_TAG_formal_parameter
                    DW_AT_name    ("dir")
                    DW_AT_type    (0x10ecef15 "int")

  0x10f07961:     NULL

In the above, function "flow_offload_fill_route" has return type
"int" at source level, but optimization eventually made the return
type as "void".

Note that it is possible one source symbol may have multiple linkage
name's due to potentially (more than one) cloning in llvm. In such
cases, multiple DW_TAG_inlined_subroutine instances might be possible.

Some restrictions

There are some restrictions in the current implementation:

  • Only C language is supported
  • BPF target is excluded as one of main goals for this pull request
    is to generate proper vmlinux BTF for arch's like x86_64/arm64 etc.
  • Function must not be a intrinsic, decl only, return value size more
    than arch register size and func with variable arguments.
  • For arguments, only int/float/ptr types are supported.

Some statistics with linux kernel

I have tested this patch set by building latest bpf-next linux kernel.
For no-lto case:

  65341 original number of functions
  1054  signature changed functions with this patch

For thin-lto case:

  65595 original number of functions
  1323  signature changed functions with this patch

Next step

With this llvm change, we will be able to do some work in pahole and libbpf.
For pahole, currently we will see the warning:

  die__process_unit: DW_TAG_inlined_subroutine (0x1d) @ <0xf2db986> not handled in a c11 CU!

Basically these DW_TAG_inlined_subroutine are not inside the DISubprogram.

[1] #127855
[2] #157349
[3] https://discourse.llvm.org/t/rfc-identify-func-signature-change-in-llvm-compiled-kernel-image/82609


Patch is 79.32 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/165310.diff

22 Files Affected:

  • (added) llvm/include/llvm/Transforms/Utils/EmitChangedFuncDebugInfo.h (+33)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (+65)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h (+2)
  • (modified) llvm/lib/Passes/PassBuilder.cpp (+1)
  • (modified) llvm/lib/Passes/PassBuilderPipelines.cpp (+6-2)
  • (modified) llvm/lib/Passes/PassRegistry.def (+1)
  • (modified) llvm/lib/Transforms/IPO/ArgumentPromotion.cpp (+11)
  • (modified) llvm/lib/Transforms/Utils/CMakeLists.txt (+1)
  • (added) llvm/lib/Transforms/Utils/EmitChangedFuncDebugInfo.cpp (+510)
  • (modified) llvm/test/Other/new-pm-defaults.ll (+2)
  • (modified) llvm/test/Other/new-pm-thinlto-postlink-defaults.ll (+1)
  • (modified) llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll (+1)
  • (modified) llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll (+1)
  • (modified) llvm/test/Transforms/ArgumentPromotion/dbg.ll (+5-1)
  • (added) llvm/test/Transforms/Util/changed-func-dbg-argpromotion-dwarf.ll (+81)
  • (added) llvm/test/Transforms/Util/changed-func-dbg-argpromotion.ll (+96)
  • (added) llvm/test/Transforms/Util/changed-func-dbg-deadarg-dwarf.ll (+106)
  • (added) llvm/test/Transforms/Util/changed-func-dbg-deadarg.ll (+126)
  • (added) llvm/test/Transforms/Util/changed-func-dbg-struct-16B-dwarf.ll (+72)
  • (added) llvm/test/Transforms/Util/changed-func-dbg-struct-16B.ll (+71)
  • (added) llvm/test/Transforms/Util/changed-func-dbg-struct-large-dwarf.ll (+79)
  • (added) llvm/test/Transforms/Util/changed-func-dbg-struct-large.ll (+94)
diff --git a/llvm/include/llvm/Transforms/Utils/EmitChangedFuncDebugInfo.h b/llvm/include/llvm/Transforms/Utils/EmitChangedFuncDebugInfo.h
new file mode 100644
index 0000000000000..8d569cd95d7f7
--- /dev/null
+++ b/llvm/include/llvm/Transforms/Utils/EmitChangedFuncDebugInfo.h
@@ -0,0 +1,33 @@
+//===- EmitChangedFuncDebugInfo.h - Emit Additional Debug Info -*- C++ --*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+/// \file
+/// Emit debug info for changed or new funcs.
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_TRANSFORMS_UTILS_EMITCHANGEDFUNCDEBUGINFO_H
+#define LLVM_TRANSFORMS_UTILS_EMITCHANGEDFUNCDEBUGINFO_H
+
+#include "llvm/IR/PassManager.h"
+
+namespace llvm {
+
+class Module;
+
+// Pass that emits late dwarf.
+class EmitChangedFuncDebugInfoPass
+    : public PassInfoMixin<EmitChangedFuncDebugInfoPass> {
+public:
+  EmitChangedFuncDebugInfoPass() = default;
+
+  PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);
+};
+
+} // end namespace llvm
+
+#endif // LLVM_TRANSFORMS_UTILS_EMITCHANGEDFUNCDEBUGINFO_H
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index 567acf75d1b8d..b10660d71b3a5 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -1280,11 +1280,76 @@ void DwarfDebug::finishSubprogramDefinitions() {
   }
 }
 
+void DwarfDebug::addChangedSubprograms() {
+  // Generate additional dwarf for functions with signature changed.
+  DICompileUnit *ExtraCU = nullptr;
+  for (DICompileUnit *CUNode : MMI->getModule()->debug_compile_units()) {
+    if (CUNode->getFile()->getFilename() == "<changed_signatures>") {
+      ExtraCU = CUNode;
+      break;
+    }
+  }
+  if (!ExtraCU)
+    return;
+
+  llvm::DebugInfoFinder DIF;
+  DIF.processModule(*MMI->getModule());
+  for (auto *ExtraSP : DIF.subprograms()) {
+    if (ExtraSP->getUnit() != ExtraCU)
+      continue;
+
+    DISubprogram *SP = cast<DISubprogram>(ExtraSP->getScope());
+    DwarfCompileUnit &Cu = getOrCreateDwarfCompileUnit(SP->getUnit());
+    DIE *ScopeDIE =
+        DIE::get(DIEValueAllocator, dwarf::DW_TAG_inlined_subroutine);
+    Cu.getUnitDie().addChild(ScopeDIE);
+
+    Cu.addString(*ScopeDIE, dwarf::DW_AT_name, ExtraSP->getName());
+    if (ExtraSP->getLinkageName() != ExtraSP->getName())
+      Cu.addString(*ScopeDIE, dwarf::DW_AT_linkage_name, ExtraSP->getLinkageName());
+
+    DITypeRefArray Args = ExtraSP->getType()->getTypeArray();
+
+    if (Args[0])
+      Cu.addType(*ScopeDIE, Args[0]);
+
+    if (ExtraSP->getType()->getCC() == llvm::dwarf::DW_CC_nocall) {
+      Cu.addUInt(*ScopeDIE, dwarf::DW_AT_calling_convention,
+                 dwarf::DW_FORM_data1, llvm::dwarf::DW_CC_nocall);
+    }
+
+    Cu.addFlag(*ScopeDIE, dwarf::DW_AT_artificial);
+
+    // dereference the DIE* for DIEEntry
+    DIE *OriginDIE = Cu.getOrCreateSubprogramDIE(SP, nullptr);
+    Cu.addDIEEntry(*ScopeDIE, dwarf::DW_AT_specification, DIEEntry(*OriginDIE));
+
+    SmallVector<const DILocalVariable *> ArgVars(Args.size());
+    for (const DINode *DN : ExtraSP->getRetainedNodes()) {
+      if (const auto *DV = dyn_cast<DILocalVariable>(DN)) {
+        uint32_t Arg = DV->getArg();
+        if (Arg)
+          ArgVars[Arg - 1] = DV;
+      }
+    }
+
+    for (unsigned i = 1, N = Args.size(); i < N; ++i) {
+      const DIType *Ty = Args[i];
+      DIE &Arg = Cu.createAndAddDIE(dwarf::DW_TAG_formal_parameter, *ScopeDIE);
+      const DILocalVariable *DV = ArgVars[i - 1];
+      Cu.addString(Arg, dwarf::DW_AT_name, DV->getName());
+      Cu.addType(Arg, Ty);
+    }
+  }
+}
+
 void DwarfDebug::finalizeModuleInfo() {
   const TargetLoweringObjectFile &TLOF = Asm->getObjFileLowering();
 
   finishSubprogramDefinitions();
 
+  addChangedSubprograms();
+
   finishEntityDefinitions();
 
   bool HasEmittedSplitCU = false;
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
index 1a1b28a6fc035..414abd4c7b8cf 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
@@ -565,6 +565,8 @@ class DwarfDebug : public DebugHandlerBase {
 
   void finishSubprogramDefinitions();
 
+  void addChangedSubprograms();
+
   /// Finish off debug information after all functions have been
   /// processed.
   void finalizeModuleInfo();
diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index 3c9a27ac24015..c43fc5a215468 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -351,6 +351,7 @@
 #include "llvm/Transforms/Utils/DXILUpgrade.h"
 #include "llvm/Transforms/Utils/Debugify.h"
 #include "llvm/Transforms/Utils/DeclareRuntimeLibcalls.h"
+#include "llvm/Transforms/Utils/EmitChangedFuncDebugInfo.h"
 #include "llvm/Transforms/Utils/EntryExitInstrumenter.h"
 #include "llvm/Transforms/Utils/FixIrreducible.h"
 #include "llvm/Transforms/Utils/HelloWorld.h"
diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp
index bd03ac090721c..0ee2efbd91541 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -135,6 +135,7 @@
 #include "llvm/Transforms/Utils/AssumeBundleBuilder.h"
 #include "llvm/Transforms/Utils/CanonicalizeAliases.h"
 #include "llvm/Transforms/Utils/CountVisits.h"
+#include "llvm/Transforms/Utils/EmitChangedFuncDebugInfo.h"
 #include "llvm/Transforms/Utils/EntryExitInstrumenter.h"
 #include "llvm/Transforms/Utils/ExtraPassManager.h"
 #include "llvm/Transforms/Utils/InjectTLIMappings.h"
@@ -1637,9 +1638,12 @@ PassBuilder::buildModuleOptimizationPipeline(OptimizationLevel Level,
   if (PTO.CallGraphProfile && !LTOPreLink)
     MPM.addPass(CGProfilePass(isLTOPostLink(LTOPhase)));
 
-  // RelLookupTableConverterPass runs later in LTO post-link pipeline.
-  if (!LTOPreLink)
+  // RelLookupTableConverterPass and EmitChangedFuncDebugInfoPass run later in
+  // LTO post-link pipeline.
+  if (!LTOPreLink) {
     MPM.addPass(RelLookupTableConverterPass());
+    MPM.addPass(EmitChangedFuncDebugInfoPass());
+  }
 
   return MPM;
 }
diff --git a/llvm/lib/Passes/PassRegistry.def b/llvm/lib/Passes/PassRegistry.def
index 1853cdd45d0ee..91aeab54be333 100644
--- a/llvm/lib/Passes/PassRegistry.def
+++ b/llvm/lib/Passes/PassRegistry.def
@@ -75,6 +75,7 @@ MODULE_PASS("dfsan", DataFlowSanitizerPass())
 MODULE_PASS("dot-callgraph", CallGraphDOTPrinterPass())
 MODULE_PASS("dxil-upgrade", DXILUpgradePass())
 MODULE_PASS("elim-avail-extern", EliminateAvailableExternallyPass())
+MODULE_PASS("emit-changed-func-debuginfo", EmitChangedFuncDebugInfoPass())
 MODULE_PASS("extract-blocks", BlockExtractorPass({}, false))
 MODULE_PASS("expand-variadics",
             ExpandVariadicsPass(ExpandVariadicsMode::Disable))
diff --git a/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp b/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
index 262c902d40d2d..87b0d069ec04e 100644
--- a/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
+++ b/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
@@ -50,6 +50,7 @@
 #include "llvm/IR/BasicBlock.h"
 #include "llvm/IR/CFG.h"
 #include "llvm/IR/Constants.h"
+#include "llvm/IR/DIBuilder.h"
 #include "llvm/IR/DataLayout.h"
 #include "llvm/IR/DerivedTypes.h"
 #include "llvm/IR/Dominators.h"
@@ -432,6 +433,16 @@ doPromotion(Function *F, FunctionAnalysisManager &FAM,
     PromoteMemToReg(Allocas, DT, &AC);
   }
 
+  // If argument(s) are dead (hence removed) or promoted, probably the function
+  // does not follow standard calling convention anymore. Add DW_CC_nocall to
+  // DISubroutineType to inform debugger that it may not be safe to call this
+  // function.
+  DISubprogram *SP = NF->getSubprogram();
+  if (SP) {
+    auto Temp = SP->getType()->cloneWithCC(llvm::dwarf::DW_CC_nocall);
+    SP->replaceType(MDNode::replaceWithPermanent(std::move(Temp)));
+  }
+
   return NF;
 }
 
diff --git a/llvm/lib/Transforms/Utils/CMakeLists.txt b/llvm/lib/Transforms/Utils/CMakeLists.txt
index f367ca2fdf56b..72291a0c7d8b0 100644
--- a/llvm/lib/Transforms/Utils/CMakeLists.txt
+++ b/llvm/lib/Transforms/Utils/CMakeLists.txt
@@ -23,6 +23,7 @@ add_llvm_component_library(LLVMTransformUtils
   DebugSSAUpdater.cpp
   DeclareRuntimeLibcalls.cpp
   DemoteRegToStack.cpp
+  EmitChangedFuncDebugInfo.cpp
   DXILUpgrade.cpp
   EntryExitInstrumenter.cpp
   EscapeEnumerator.cpp
diff --git a/llvm/lib/Transforms/Utils/EmitChangedFuncDebugInfo.cpp b/llvm/lib/Transforms/Utils/EmitChangedFuncDebugInfo.cpp
new file mode 100644
index 0000000000000..46ef9471e31ad
--- /dev/null
+++ b/llvm/lib/Transforms/Utils/EmitChangedFuncDebugInfo.cpp
@@ -0,0 +1,510 @@
+//==- EmitChangedFuncDebugInfoPass - Emit Additional Debug Info -*- C++ -*-==//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This pass synthesizes a "shadow" DISubprogram carrying a *possibly changed*
+// signature for certain optimized functions. The new subprogram lives in a
+// dedicated DICompileUnit whose file name is "<changed_signatures>", and is
+// attached to a dummy AvailableExternally function so that the metadata forms
+// a valid graph.
+//
+// When we can recover argument names/types from dbg records in the entry
+// block, we do so; otherwise we conservatively fall back to pointer- or
+// integer-typed parameters.
+//
+// We *only* run for C-family source languages, skip BPF targets (BTF is used
+// there), skip varargs originals, and skip functions whose return type is a
+// large by-value aggregate.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Transforms/Utils/EmitChangedFuncDebugInfo.h"
+
+#include "llvm/IR/DIBuilder.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/IntrinsicInst.h"
+#include "llvm/IR/Module.h"
+#include "llvm/Support/CommandLine.h"
+#include "llvm/TargetParser/Triple.h"
+
+using namespace llvm;
+
+/// Disable switch.
+static cl::opt<bool> DisableChangedFuncDBInfo(
+    "disable-changed-func-dbinfo", cl::Hidden, cl::init(false),
+    cl::desc("Disable debuginfo emission for changed func signatures"));
+
+/// Replace all '.' with "__" (stable with opaque-lifetime inputs).
+static std::string sanitizeDots(StringRef S) {
+  std::string Out = S.str();
+  for (size_t pos = 0; (pos = Out.find('.', pos)) != std::string::npos;
+       pos += 2)
+    Out.replace(pos, 1, "__");
+  return Out;
+}
+
+/// Return the "basename" (prefix before the first '.') of a name.
+static StringRef baseBeforeDot(StringRef S) {
+  return S.take_front(S.find('.'));
+}
+
+/// Ensure a variable name is unique among previously recorded parameters.
+/// If collision, append "__<Idx>".
+static std::string uniquifyParamName(StringRef Candidate,
+                                     ArrayRef<Metadata *> Existing,
+                                     unsigned Idx) {
+  for (unsigned i = 0; i < Existing.size(); ++i)
+    if (auto *LV = dyn_cast<DILocalVariable>(Existing[i]))
+      if (LV->getName() == Candidate)
+        return (Twine(Candidate) + "__" + Twine(Idx)).str();
+  return Candidate.str();
+}
+
+/// Walk backward in the current block to see whether LocV is exactly a
+/// zext/trunc of Arg (used by two separate match sites originally).
+static bool comesFromArgViaCast(Value *LocV, Argument *Arg, Instruction &At) {
+  if (!LocV)
+    return false;
+  for (Instruction *Prev = At.getPrevNode(); Prev; Prev = Prev->getPrevNode()) {
+    // FIXME: maybe some other insns need check as well.
+    if (auto *Z = dyn_cast<ZExtInst>(Prev))
+      if (Z->getOperand(0) == Arg && LocV == Prev)
+        return true;
+    if (auto *T = dyn_cast<TruncInst>(Prev))
+      if (T->getOperand(0) == Arg && LocV == Prev)
+        return true;
+  }
+  return false;
+}
+
+/// Strip qualifiers/typedefs until the first pointer-type (which we keep), or
+/// to the base non-derived type if no pointer is found.
+static DIType *stripToBaseOrFirstPointer(DIType *T) {
+  while (auto *DT = dyn_cast_or_null<DIDerivedType>(T)) {
+    if (DT->getTag() == dwarf::DW_TAG_pointer_type)
+      return DT;
+    T = DT->getBaseType();
+  }
+  return T;
+}
+
+static DIType *createBasicType(DIBuilder &DIB, uint64_t SizeInBits) {
+  switch (SizeInBits) {
+  case 8:
+    return DIB.createBasicType("char", 8, dwarf::DW_ATE_signed);
+  case 16:
+    return DIB.createBasicType("short", 16, dwarf::DW_ATE_signed);
+  case 32:
+    return DIB.createBasicType("int", 32, dwarf::DW_ATE_signed);
+  case 64:
+    return DIB.createBasicType("long long", 64, dwarf::DW_ATE_signed);
+  default:
+    return DIB.createBasicType("__int128", SizeInBits, dwarf::DW_ATE_signed);
+  }
+}
+
+static DIType *createFloatType(DIBuilder &DIB, uint64_t SizeInBits) {
+  if (SizeInBits == 32)
+    return DIB.createBasicType("float", 32, dwarf::DW_ATE_float);
+  if (SizeInBits == 64)
+    return DIB.createBasicType("double", 64, dwarf::DW_ATE_float);
+  return DIB.createBasicType("long double", SizeInBits, dwarf::DW_ATE_float);
+}
+
+static DIType *getIntTypeFromExpr(DIBuilder &DIB, DIExpression *Expr,
+                                  DICompositeType *DTy, unsigned W) {
+  for (auto Op : Expr->expr_ops()) {
+    if (Op.getOp() != dwarf::DW_OP_LLVM_fragment)
+      break;
+
+    const uint64_t BitOffset = Op.getArg(0);
+    const uint64_t BitSize = Op.getArg(1);
+    const uint64_t BitUpLimit = BitOffset + BitSize;
+
+    DINodeArray Elems = DTy->getElements();
+    unsigned N = Elems.size();
+
+    for (unsigned i = 0; i < N; ++i)
+      if (auto *Elem = dyn_cast<DIDerivedType>(Elems[i])) {
+        if (N >= 2 && i < N - 1) {
+          if (Elem->getOffsetInBits() <= BitOffset &&
+              BitUpLimit <= (Elem->getOffsetInBits() + Elem->getSizeInBits()))
+            return Elem->getBaseType();
+        } else {
+          if (Elem->getOffsetInBits() <= BitOffset &&
+              BitUpLimit <= DTy->getSizeInBits())
+            return Elem->getBaseType();
+        }
+      }
+
+    return createBasicType(DIB, BitSize);
+  }
+  return createBasicType(DIB, W);
+}
+
+static DIType *computeParamDIType(DIBuilder &DIB, Type *Ty, DIType *Orig,
+                                  unsigned PointerBitWidth,
+                                  DIExpression *Expr) {
+  DIType *Stripped = stripToBaseOrFirstPointer(Orig);
+
+  if (Ty->isIntegerTy()) {
+    unsigned W = cast<IntegerType>(Ty)->getBitWidth();
+    if (auto *Comp = dyn_cast_or_null<DICompositeType>(Stripped)) {
+      if (!Ty->isIntegerTy(Comp->getSizeInBits()))
+        return getIntTypeFromExpr(DIB, Expr, Comp, W);
+    }
+    return createBasicType(DIB, W);
+  }
+
+  if (Ty->isFloatingPointTy())
+    return createFloatType(DIB, Ty->getScalarSizeInBits());
+
+  // Ty->isPointerTy().
+  if (auto *Der = dyn_cast_or_null<DIDerivedType>(Stripped)) {
+    assert(Der->getTag() == dwarf::DW_TAG_pointer_type);
+    return Der;
+  }
+
+  auto *Comp = cast<DICompositeType>(Stripped);
+  return DIB.createPointerType(Comp, PointerBitWidth);
+}
+
+static bool isLargeByValueAggregate(DIType *T, unsigned PtrW) {
+  DIType *P = stripToBaseOrFirstPointer(T);
+  if (auto *Comp = dyn_cast_or_null<DICompositeType>(P))
+    return Comp->getSizeInBits() > PtrW;
+  return false;
+}
+
+static void pushParam(DIBuilder &DIB, DISubprogram *OldSP,
+                      SmallVectorImpl<Metadata *> &TypeList,
+                      SmallVectorImpl<Metadata *> &ArgList, DIType *Ty,
+                      StringRef VarName, unsigned Idx) {
+  TypeList.push_back(Ty);
+  ArgList.push_back(DIB.createParameterVariable(
+      OldSP, VarName, Idx + 1, OldSP->getFile(), OldSP->getLine(), Ty));
+}
+
+/// Argument collection.
+static bool getOneArgDI(unsigned Idx, BasicBlock &Entry, DIBuilder &DIB,
+                        Function *F, DISubprogram *OldSP,
+                        SmallVectorImpl<Metadata *> &TypeList,
+                        SmallVectorImpl<Metadata *> &ArgList,
+                        unsigned PointerBitWidth) {
+  Argument *Arg = F->getArg(Idx);
+  StringRef ArgName = Arg->getName();
+  Type *ArgTy = Arg->getType();
+
+  // If byval struct, remember its identified-name and kind to match via dbg.
+  StringRef ByValUserName;
+  bool IsByValStruct = true;
+  if (ArgTy->isPointerTy() && Arg->hasByValAttr()) {
+    if (Type *ByValTy = F->getParamByValType(Idx))
+      if (auto *ST = dyn_cast<StructType>(ByValTy)) {
+        auto [Kind, Name] = ST->getName().split('.');
+        ByValUserName = Name;
+        IsByValStruct = (Kind == "struct");
+      }
+  }
+
+  DILocalVariable *DIVar = nullptr;
+  DIExpression *DIExpr = nullptr;
+
+  // Scan the entry block for dbg records.
+  for (Instruction &I : Entry) {
+    bool Final = false;
+
+    for (DbgRecord &DR : I.getDbgRecordRange()) {
+      auto *DVR = dyn_cast<DbgVariableRecord>(&DR);
+      if (!DVR)
+        continue;
+
+      auto *VAM = dyn_cast_or_null<ValueAsMetadata>(DVR->getRawLocation());
+      if (!VAM)
+        continue;
+
+      Value *LocV = VAM->getValue();
+      auto *Var = DVR->getVariable();
+      if (!Var || !Var->getArg())
+        continue;
+
+      // Canonicalize through derived types stopping at first pointer.
+      DIType *DITy = Var->getType();
+      while (auto *DTy = dyn_cast<DIDerivedType>(DITy)) {
+        if (DTy->getTag() == dwarf::DW_TAG_pointer_type) {
+          DITy = DTy;
+          break;
+        }
+        DITy = DTy->getBaseType();
+      }
+
+      if (LocV == Arg) {
+        DIVar = Var;
+        DIExpr = DVR->getExpression();
+        Final = true;
+        break;
+      }
+
+      // Compare base names (before dot) in several cases.
+      StringRef ArgBase = baseBeforeDot(ArgName);
+      StringRef VarBase = baseBeforeDot(Var->getName());
+
+      if (ArgName.empty()) {
+        if (!ByValUserName.empty()) {
+          // Match by byval struct DI type’s name/kind.
+          DIType *Stripped = stripToBaseOrFirstPointer(Var->getType());
+          auto *Comp = dyn_cast<DICompositeType>(Stripped);
+          if (!Comp)
+            continue;
+          bool IsStruct = Comp->getTag() == dwarf::DW_TAG_structure_type;
+          if (Comp->getName() != ByValUserName || IsStruct != IsByValStruct)
+            continue;
+          DIVar = Var;
+          DIExpr = DVR->getExpression();
+          Final = true;
+          break;
+        }
+
+        // FIXME: more work is needed to find precise DILocalVariable.
+        if (isa<PoisonValue>(LocV) || isa<AllocaInst>(LocV))
+          continue;
+
+        if (comesFromArgViaCast(LocV, Arg, I)) {
+          DIVar = Var;
+          DIExpr = DVR->getExpression();
+          Final = true;
+          break;
+        }
+      } else {
+        // We do have an IR arg name.
+        if (isa<PoisonValue>(LocV)) {
+          if (Var->getName() != ArgBase)
+            continue;
+          DIVar = Var;
+          DIExpr = DVR->getExpression();
+          // Possibly we may find a non poison value later.
+        } else if (isa<AllocaInst>(LocV)) {
+          if (Var->getName() != ArgName)
+            continue;
+          DIVar = Var;
+          DIExpr = DVR->getExpression();
+          Final = true;
+          break;
+        } else if (ArgBase == VarBase) {
+          DIVar = Var;
+          DIExpr = DVR->getExpression();
+          Final = true;
+          break;
+        } else if (comesFromArgViaCast(LocV, Arg, I)) {
+          DIVar = Var;
+          DIExpr = DVR->getExpression();
+          Final = true;
+          break;
+        }
+      }
+    }
+
+    if (Final)
+      break;
+  }
+
+  // Fallback types if we failed to find a dbg match.
+  if (!DIVar) {
+    // Likely to be a unused parameter.
+    if (ArgTy->isIntegerTy()) {
+      auto *Ty = createBasicType(DIB, cast<IntegerType>(ArgTy)->getBitWidth());
+      pushParam(DIB, OldSP, TypeList, ArgList, Ty,
+                (Twine("__") + Twine(Idx)).str(), Idx);
+      return true;
+    }
+    // Pointer: use void *
+    // Returning false means the ...
[truncated]

@yonghong-song
Contributor Author

cc @dafaust @jemarch Please take a look as well. Thanks!

@github-actions

github-actions bot commented Oct 27, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Yonghong Song added 2 commits November 2, 2025 20:30
The ArgumentPromotion pass may change function signatures. If this happens
and debuginfo is enabled, add DW_CC_nocall to the debuginfo so it is
clear that the function signature has changed.
DeadArgumentElimination ([1]) has a similar implementation.

Also fix an ArgumentPromotion test affected by adding DW_CC_nocall to the
debuginfo.

  [1] llvm@340b0ca

Add a new pass, EmitChangedFuncDebugInfo, which emits additional dwarf for
functions whose signatures are changed by compiler transformations.

The original motivation is bpf-based linux kernel tracing.
Function signatures are available in the vmlinux BTF that pahole generates
from dwarf, and that dwarf describes signatures at the source level.
This is not ideal since some functions have their signatures changed
by the compiler. If users rely on the source-level signature for such
functions, they may not get correct results and may need some effort
to work around the issue.

So we want to encode the true signature (possibly different
from the source one) in dwarf. With this additional information,
dwarf users can identify the functions whose signatures changed.
For example, pahole can process these functions and
encode them into vmlinux BTF properly.

History of multiple attempts
============================

I have made a few previous attempts ([1], [2] and [3]).
Initially I tried to modify debuginfo in passes like
ArgPromotion and DeadArgElim, but it was later suggested
to have a central place to handle new signatures ([1]).

Later, I had another version of the patch similar to this
one, but the recommendation was to modify debuginfo to
encode the new signature within the same function,
either through inlinedAt or by having the new signature overwrite
the old one. This seemed to work, but it has a side effect
on lldb: some lldb output (e.g. back trace) differs from
the previous output. The recommendation
is to avoid any behavior change for lldb ([2] and [3]).

So now I have come back to the solution discussed at the
end of [1]. Basically, a special dwarf entry is generated
to encode the new signature. The new signature has
a reference to the old source-level signature,
so a tool can inspect dwarf to retrieve the related
info.

Examples and dwarf output
=========================

The examples below show how changed signatures are
represented in dwarf:

Example 1
---------

Source:
  $ cat test.c
  struct t { int a; };
  char *tar(struct t *a, struct t *d);
  __attribute__((noinline)) static char * foo(struct t *a, int b, struct t *d)
  {
    return tar(a, d);
  }
  char *bar(struct t *a, struct t *d)
  {
    return foo(a, 1, d);
  }

Compile and dump dwarf with:
  $ clang -O2 -c -g test.c
  $ llvm-dwarfdump test.o
  0x0000000c: DW_TAG_compile_unit
                ...
  0x0000005c:   DW_TAG_subprogram
                  DW_AT_low_pc    (0x0000000000000010)
                  DW_AT_high_pc   (0x0000000000000015)
                  DW_AT_frame_base        (DW_OP_reg7 RSP)
                  DW_AT_call_all_calls    (true)
                  DW_AT_name      ("foo")
                  DW_AT_decl_file ("/home/yhs/tests/sig-change/deadarg/test.c")
                  DW_AT_decl_line (3)
                  DW_AT_prototyped        (true)
                  DW_AT_calling_convention        (DW_CC_nocall)
                  DW_AT_type      (0x000000b1 "char *")

  0x0000006c:     DW_TAG_formal_parameter
                    DW_AT_location        (DW_OP_reg5 RDI)
                    DW_AT_name    ("a")
                    DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                    DW_AT_decl_line       (3)
                    DW_AT_type    (0x000000ba "t *")

  0x00000076:     DW_TAG_formal_parameter
                    DW_AT_name    ("b")
                    DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                    DW_AT_decl_line       (3)
                    DW_AT_type    (0x000000ce "int")

  0x0000007e:     DW_TAG_formal_parameter
                    DW_AT_location        (DW_OP_reg4 RSI)
                    DW_AT_name    ("d")
                    DW_AT_decl_file       ("/home/yhs/tests/sig-change/deadarg/test.c")
                    DW_AT_decl_line       (3)
                    DW_AT_type    (0x000000ba "t *")

  0x00000088:     DW_TAG_call_site
                    ...

  0x0000009d:     NULL
                  ...
  0x000000d2:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("foo")
                  DW_AT_type      (0x000000b1 "char *")
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x0000005c "foo")

  0x000000dc:     DW_TAG_formal_parameter
                    DW_AT_name    ("a")
                    DW_AT_type    (0x000000ba "t *")

  0x000000e2:     DW_TAG_formal_parameter
                    DW_AT_name    ("d")
                    DW_AT_type    (0x000000ba "t *")

  0x000000e8:     NULL

In the above, the DW_TAG_subprogram 'foo' has the original signature, but
since parameter 'b' does not have a DW_AT_location, it is clear that
this parameter is not used. The actual function signature is represented
in DW_TAG_inlined_subroutine.

For the above case, DW_TAG_inlined_subroutine may not look
necessary. The examples below show cases where it is.
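
To make the comparison concrete, a tool that has already extracted the parameter names from both entries could compute which source-level parameters were dropped. This is a minimal sketch under that assumption; droppedParams is a hypothetical helper, and matching purely by name is a simplification that would not cover renamed parameters.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Names that appear in the source-level parameter list (DW_TAG_subprogram)
// but not in the changed signature (DW_TAG_inlined_subroutine).
// Both inputs are hypothetical, already-extracted name lists.
std::vector<std::string>
droppedParams(const std::vector<std::string> &Original,
              const std::vector<std::string> &Changed) {
  std::vector<std::string> Dropped;
  for (const std::string &Name : Original)
    if (std::find(Changed.begin(), Changed.end(), Name) == Changed.end())
      Dropped.push_back(Name);
  return Dropped;
}
```

For Example 1, the original list is a, b, d and the changed list is a, d, so the only dropped parameter is b.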

Example 2
---------

Source:
  $ cat test.c
  struct t { long a; long b;};
  __attribute__((noinline)) static long foo(struct t arg) {
    return arg.b * 5;
  }
  long bar(struct t arg) {
    return foo(arg);
  }

Compile and dump dwarf with:
  $ clang -O2 -c -g test.c
  $ llvm-dwarfdump test.o
  ...
  0x0000004e:   DW_TAG_subprogram
                  DW_AT_low_pc    (0x0000000000000010)
                  DW_AT_high_pc   (0x0000000000000015)
                  DW_AT_frame_base        (DW_OP_reg7 RSP)
                  DW_AT_call_all_calls    (true)
                  DW_AT_name      ("foo")
                  DW_AT_decl_file ("/home/yhs/tests/sig-change/struct/test.c")
                  DW_AT_decl_line (2)
                  DW_AT_prototyped        (true)
                  DW_AT_calling_convention        (DW_CC_nocall)
                  DW_AT_type      (0x0000006d "long")

  0x0000005e:     DW_TAG_formal_parameter
                    DW_AT_location        (DW_OP_piece 0x8, DW_OP_reg5 RDI, DW_OP_piece 0x8)
                    DW_AT_name    ("arg")
                    DW_AT_decl_file       ("/home/yhs/tests/sig-change/struct/test.c")
                    DW_AT_decl_line       (2)
                    DW_AT_type    (0x00000099 "t")

  0x0000006c:     NULL
  ...
  0x00000088:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("foo")
                  DW_AT_type      (0x0000006d "long")
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x0000004e "foo")

  0x00000092:     DW_TAG_formal_parameter
                    DW_AT_name    ("arg")
                    DW_AT_type    (0x0000006d "long")

  0x00000098:     NULL

In the above case for function foo(), the original argument is 'struct t',
but the actual argument is of type 'long'. DW_TAG_inlined_subroutine
represents the signature type directly, instead of requiring the reader
to derive it from DW_AT_location.

One problem remains: it is not clear which part of the original parameter
the formal parameter 'arg' corresponds to. If necessary, the compiler
could change 'arg' to e.g. 'arg_offset_8' to indicate that it is at an
8-byte offset from the start of the original struct.
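
The naming idea above could look roughly like the following. This is only a sketch of the suggestion, not something this patch implements: nameWithOffset is a hypothetical helper, and it assumes the offset of the passed piece is known in bits, as it is in a debug-info fragment expression.

```cpp
#include <cstdint>
#include <string>

// Build a name such as "arg_offset_8" from the original parameter name and
// the bit offset of the piece that is actually passed. A zero offset keeps
// the original name.
std::string nameWithOffset(const std::string &Base, std::uint64_t BitOffset) {
  if (BitOffset == 0)
    return Base;
  return Base + "_offset_" + std::to_string(BitOffset / 8);
}
```

In Example 2 the field 'b' starts 64 bits into 'struct t', so nameWithOffset("arg", 64) yields "arg_offset_8".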

Example 3
---------

Source:
  $ cat test2.c
  struct t { long a; long b; long c;};
  __attribute__((noinline)) long foo(struct t arg) {
    return arg.a * arg.c;
  }
  long bar(struct t arg) {
    return foo(arg);
  }

Compile and dump dwarf with:
  $ clang -O2 -c -g test2.c
  $ llvm-dwarfdump test2.o
  ...
  0x0000003e:   DW_TAG_subprogram
                  DW_AT_low_pc    (0x0000000000000010)
                  DW_AT_high_pc   (0x0000000000000015)
                  DW_AT_frame_base        (DW_OP_reg7 RSP)
                  DW_AT_call_all_calls    (true)
                  DW_AT_name      ("bar")
                  DW_AT_decl_file ("/home/yhs/tests/sig-change/struct/test2.c")
                  DW_AT_decl_line (5)
                  DW_AT_prototyped        (true)
                  DW_AT_type      (0x0000005f "long")
                  DW_AT_external  (true)

  0x0000004d:     DW_TAG_formal_parameter
                    DW_AT_location        (DW_OP_fbreg +8)
                    DW_AT_name    ("arg")
                    DW_AT_decl_file       ("/home/yhs/tests/sig-change/struct/test2.c")
                    DW_AT_decl_line       (5)
                    DW_AT_type    (0x00000079 "t")

  0x00000058:     DW_TAG_call_site
                    DW_AT_call_origin     (0x00000023 "foo")
                    DW_AT_call_tail_call  (true)
                    DW_AT_call_pc (0x0000000000000010)

  0x0000005e:     NULL
                ...
  0x00000063:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("foo")
                  DW_AT_type      (0x0000005f "long")
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x00000023 "foo")

  0x0000006d:     DW_TAG_formal_parameter
                    DW_AT_name    ("arg")
                    DW_AT_type    (0x00000074 "t *")

  0x00000073:     NULL

In the above example, it is not clear from DW_TAG_subprogram what
type the parameter should be. But DW_TAG_inlined_subroutine
clearly shows what the type is. Again, the name could be changed
to e.g. 'arg_ptr' if desired.

Example 4
---------

Source:
  $ cat test.c
  __attribute__((noinline)) static int callee(const int *p) { return *p + 42; }
  int caller(void) {
    int x = 100;
    return callee(&x);
  }

Compile and dump dwarf with:
  $ clang -O3 -c -g test.c
  $ llvm-dwarfdump test.o
  ...
  0x0000004a:   DW_TAG_subprogram
                  DW_AT_low_pc    (0x0000000000000010)
                  DW_AT_high_pc   (0x0000000000000014)
                  DW_AT_frame_base        (DW_OP_reg7 RSP)
                  DW_AT_call_all_calls    (true)
                  DW_AT_name      ("callee")
                  DW_AT_decl_file ("/home/yhs/tests/sig-change/prom/test.c")
                  DW_AT_decl_line (1)
                  DW_AT_prototyped        (true)
                  DW_AT_calling_convention        (DW_CC_nocall)
                  DW_AT_type      (0x00000063 "int")

  0x0000005a:     DW_TAG_formal_parameter
                    DW_AT_name    ("p")
                    DW_AT_decl_file       ("/home/yhs/tests/sig-change/prom/test.c")
                    DW_AT_decl_line       (1)
                    DW_AT_type    (0x00000078 "const int *")

  0x00000062:     NULL
                ...
  0x00000067:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("callee")
                  DW_AT_type      (0x00000063 "int")
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x0000004a "callee")

  0x00000071:     DW_TAG_formal_parameter
                    DW_AT_name    ("__0")
                    DW_AT_type    (0x00000063 "int")

  0x00000077:     NULL

In the above, the function
  static int callee(const int *p) { return *p + 42; }
is transformed to
  static int callee(int p) { return p + 42; }
but the new signature is not reflected in DW_TAG_subprogram.
The DW_TAG_inlined_subroutine captures the new signature precisely.
Note that the parameter name is "__0", where "0" means
the first argument. This name comes from the following IR:

  define internal ... i32 @callee(i32 %0) unnamed_addr #1 !dbg !23 {
      #dbg_value(ptr poison, !29, !DIExpression(), !30)
    %2 = add nsw i32 %0, 42, !dbg !31
    ret i32 %2, !dbg !32
  }
  ...
  !29 = !DILocalVariable(name: "p", arg: 1, scope: !23, file: !1, line: 1, type: !26)

The IR argument is unnamed (%0), and the #dbg_value for "p" refers to
'ptr poison', which means the debug value should not be used any more.
This is also the reason that the above DW_TAG_subprogram does not have
location information. DW_TAG_inlined_subroutine can still provide the
correct signature.
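
To make the mismatch concrete, here is a hand-written C sketch of the two
forms of callee discussed above. The names callee_source, callee_promoted
and check_equivalent are hypothetical and exist only for illustration; in
the real object file there is a single function named callee. Both forms
compute the same result, but the first argument is an address in one and
the value itself in the other, which is why a tracer relying on the
source-level signature would misread the argument.

```c
#include <assert.h>

/* Source-level form: the argument is a pointer to the value. */
static int callee_source(const int *p) { return *p + 42; }

/* Hand-written stand-in for the optimized form: the pointer argument
 * has been replaced by the value it pointed to (hypothetical name). */
static int callee_promoted(int p_val) { return p_val + 42; }

/* Both forms agree on the result as long as x + 42 does not overflow,
 * even though the raw first argument differs (&x versus x). */
void check_equivalent(int x)
{
    assert(callee_source(&x) == callee_promoted(x));
}
```

Calling check_equivalent(100) mirrors caller() from test.c.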

If we compile with -fno-discard-value-names:
  clang -O3 -c -g test.c -fno-discard-value-names
the function argument name is preserved in the IR
  ... i32 @callee(i32 %p.0.val) ...
and in that case
the DW_TAG_inlined_subroutine looks like below:

  0x00000067:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("callee")
                  DW_AT_type      (0x00000063 "int")
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x0000004a "callee")

  0x00000071:     DW_TAG_formal_parameter
                    DW_AT_name    ("p__0__val")
                    DW_AT_type    (0x00000063 "int")

  0x00000077:     NULL

Note that each '.' in the original argument name is replaced with "__"
so that the argument name is a valid C identifier.
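
The naming rule described above can be sketched in C as follows. This is
not the code from the patch; make_param_name and its parameters are
hypothetical, and the sketch assumes that an empty IR name falls back to
"__<arg_index>" with a zero-based index, as in the "__0" example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: derive a parameter name from an LLVM IR argument
 * name. An empty IR name becomes "__<arg_index>"; otherwise every '.'
 * is replaced with "__". Returns 0 on success, -1 if dst is too small. */
int make_param_name(const char *ir_name, unsigned arg_index,
                    char *dst, size_t cap)
{
    assert(dst != NULL && cap > 0);

    if (ir_name == NULL || strlen(ir_name) == 0) {
        int n = snprintf(dst, cap, "__%u", arg_index);
        return (n >= 0 && (size_t)n < cap) ? 0 : -1;
    }

    size_t out = 0;
    for (const char *s = ir_name; *s != '\0'; s++) {
        size_t need = (*s == '.') ? 2 : 1;
        if (out + need >= cap)      /* keep room for the terminator */
            return -1;
        if (*s == '.') {
            dst[out++] = '_';
            dst[out++] = '_';
        } else {
            dst[out++] = *s;
        }
    }
    dst[out] = '\0';
    return 0;
}
```

With this sketch, "p.0.val" maps to "p__0__val" and an unnamed first
argument maps to "__0".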

Based on a run on the linux kernel, names like "__<arg_index>" make up
roughly 2% of all signature changed functions, so we are probably
okay for now.

Non-LTO vs. LTO
---------------

For thin-lto mode, we often see kernel symbols like
  p9_req_cache.llvm.13472271643223911678
Even if this symbol has the same source level signature as p9_req_cache,
a special DW_TAG_inlined_subroutine will be generated with
name 'p9_req_cache.llvm.13472271643223911678'.
With this, a tool (e.g., pahole) may generate a BTF entry
for this name, which could be used for fentry/fexit tracing.

But if a symbol "<foo>.llvm.<hash>" has a different signature
from the source level "<foo>", then a special DW_TAG_inlined_subroutine
will be generated like below:
  0x10f0793f:   DW_TAG_inlined_subroutine
                  DW_AT_name      ("flow_offload_fill_route")
                  DW_AT_linkage_name      ("flow_offload_fill_route.llvm.14555965973926298225")
                  DW_AT_artificial        (true)
                  DW_AT_specification     (0x10ee9e54 "flow_offload_fill_route")

  0x10f07949:     DW_TAG_formal_parameter
                    DW_AT_name    ("flow")
                    DW_AT_type    (0x10ee837a "flow_offload *")

  0x10f07951:     DW_TAG_formal_parameter
                    DW_AT_name    ("route")
                    DW_AT_type    (0x10eea4ef "nf_flow_route *")

  0x10f07959:     DW_TAG_formal_parameter
                    DW_AT_name    ("dir")
                    DW_AT_type    (0x10ecef15 "int")

  0x10f07961:     NULL

In the above, function "flow_offload_fill_route" has return type
"int" at source level, but optimization eventually changed the return
type to "void" (hence no DW_AT_type above). Tools like pahole may
choose to generate two entries, one for DW_AT_name and one for
DW_AT_linkage_name, in vmlinux BTF.

Note that one source symbol may have multiple linkage
names due to (possibly more than one) cloning in llvm. In such
cases, multiple DW_TAG_inlined_subroutine instances are possible.
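
As an illustration of how a consumer might relate a thin-lto symbol to
its source-level name, the C sketch below finds the "<foo>" part of
"<foo>.llvm.<hash>". The helper source_name_len is hypothetical and is
not taken from pahole or from this patch; it only assumes the
".llvm." separator shown in the examples above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if sym has the form "<foo>.llvm.<hash>", return
 * the length of "<foo>"; otherwise return the full length of sym. */
size_t source_name_len(const char *sym)
{
    assert(sym != NULL);

    const char *suffix = strstr(sym, ".llvm.");
    return suffix ? (size_t)(suffix - sym) : strlen(sym);
}
```

A tool could compare the first source_name_len(sym) bytes of the linkage
name against DW_AT_name to group clones of the same source function.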

Some restrictions
=================

There are some restrictions in the current implementation:
  - Only C language is supported
  - BPF target is excluded, as one of the main goals of this pull request
    is to generate proper vmlinux BTF for arch's like x86_64/arm64 etc.
  - Function must not be an intrinsic or declaration only, must not have
    a return value larger than the arch register size, and must not
    have variable arguments.
  - For arguments, only int/ptr types are supported.
  - Some union type arguments (e.g., 8B < union_size <= 16B) may
    have DIType issues, so some functions may be skipped.
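
The restrictions above can be read as a simple filter. The C sketch below
is only a model of that list: struct func_desc, enum arg_kind and
passes_restrictions are hypothetical names, not data structures from the
pass, and the union-type case is left out because the list does not pin
it down precisely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All types and field names here are hypothetical, for illustration. */
enum arg_kind { ARG_INT, ARG_PTR, ARG_OTHER };

struct func_desc {
    bool is_c_language;          /* compile unit language is C */
    bool is_bpf_target;          /* target is BPF */
    bool is_intrinsic;
    bool is_declaration;         /* declaration only, no body */
    bool is_variadic;
    size_t ret_size;             /* return value size in bytes */
    size_t reg_size;             /* arch register size in bytes */
    size_t num_args;
    const enum arg_kind *args;   /* kind of each argument */
};

/* Returns true if the function passes every restriction in the list. */
bool passes_restrictions(const struct func_desc *f)
{
    assert(f != NULL);
    assert(f->num_args == 0 || f->args != NULL);

    if (!f->is_c_language || f->is_bpf_target)
        return false;
    if (f->is_intrinsic || f->is_declaration || f->is_variadic)
        return false;
    if (f->ret_size > f->reg_size)
        return false;
    for (size_t i = 0; i < f->num_args; i++) {
        if (f->args[i] != ARG_INT && f->args[i] != ARG_PTR)
            return false;
    }
    return true;
}
```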

Some statistics with linux kernel
=================================

I have tested this patch set by building the latest bpf-next linux kernel.
For no-lto case:
  65341 original number of functions
  1054  signature changed functions with this patch
For thin-lto case:
  65595 original number of functions
  3150  signature changed functions with this patch

Next step
=========

With this llvm change, we will be able to do some work in pahole and libbpf.
For pahole, currently we will see the warning:
  die__process_unit: DW_TAG_inlined_subroutine (0x1d) @ <0xf2db986> not handled in a c11 CU!
Basically these DW_TAG_inlined_subroutine entries sit directly in the CU
rather than inside a DW_TAG_subprogram, which pahole does not handle yet.

  [1] llvm#127855
  [2] llvm#157349
  [3] https://discourse.llvm.org/t/rfc-identify-func-signature-change-in-llvm-compiled-kernel-image/82609
@yonghong-song
Contributor Author

Just uploaded a new version which

  • Disables emission of the true signature by default. The option -mllvm -enable-changed-func-dbinfo enables true signature generation.
  • Adds more functions to the true signature category, including (1) functions having a > 8 byte parameter, (2) any non-source function, e.g. foo.llvm.<hash>, even if foo has the same signature as foo.llvm.<hash>, and (3) any signature-changed functions. These changes make more functions traceable by bpf programs, esp. fentry/fexit bpf programs.
