@shun159
Contributor

@shun159 shun159 commented Aug 10, 2025

This PR adds preliminary support for StructOpsMap and StructOps.

  • Populates kernVData with program FDs during map finalization.
  • Extends map.go and prog.go to support BTF attach information and module BTF.
  • Adds unit tests in collection_test.go and struct_ops_test.go to verify basic load with a hand-crafted MapSpec.

see: #1502

@shun159 shun159 requested a review from a team as a code owner August 10, 2025 13:22
@shun159 shun159 force-pushed the feature/struct-ops-2 branch from 7b8a348 to f7a32d3 Compare August 10, 2025 13:27
Collaborator

@lmb lmb left a comment

This PR is missing the part where program fds are inserted into the struct ops map. Without that it's hard to tell what is going on.

@shun159
Contributor Author

shun159 commented Aug 11, 2025

This PR is missing the part where program fds are inserted into the struct ops map. Without that it's hard to tell what is going on.

@lmb yeah, agreed.
I was trying to keep the PR small, but I see how it's unclear without the kern_vdata population.
Let's proceed by including those changes so that it is clear.

@shun159
Contributor Author

shun159 commented Aug 15, 2025

The vimto tests on 6.1 and 6.6 have passed on my setup, so I think the CI failure is transient.

Collaborator

@lmb lmb left a comment

I had a good look at the PR and there are some high level problems we need to solve:

  1. Too much metadata which is kept on the side and not in the right places. This makes the logic hard to follow and doesn't integrate well with the rest of the code. The solution is to duplicate some metadata by storing type name and member in ProgramSpec.AttachTo. We also need to store the mapping from MapSpec.Value.Member[].Name to program somehow. For now I think we can get away with setting ProgramSpec.Name to MapSpec.Value.Member[].Name.
  2. There is a lot of duplication in helper functions that do unconventional stuff with btf types. Most of these should go. Comparisons between types to find the equivalent should always be using the concrete type and name, not offset or index.
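A minimal sketch of the second point, assuming a simplified stand-in for btf.Member: the hypothetical `member` type and `matchByName` helper below are illustrative only and not part of the PR. Members are paired by name, so a difference in ordering or offset between the user struct and the kernel struct does not affect the match.

```go
package main

import "fmt"

// member is a hypothetical, simplified stand-in for btf.Member.
type member struct {
	Name   string
	Offset uint32 // byte offset within the struct
}

// matchByName pairs each user member with the kernel member of the same
// name. User members without a kernel equivalent are returned in missing.
func matchByName(user, kern []member) (pairs map[string][2]member, missing []string) {
	byName := make(map[string]member, len(kern))
	for _, m := range kern {
		byName[m.Name] = m
	}

	pairs = make(map[string][2]member)
	for _, m := range user {
		k, ok := byName[m.Name]
		if !ok {
			missing = append(missing, m.Name)
			continue
		}
		// Index 0 is the user member, index 1 the kernel member.
		pairs[m.Name] = [2]member{m, k}
	}
	return pairs, missing
}

func main() {
	user := []member{{"init", 0}, {"flags", 8}, {"legacy", 16}}
	kern := []member{{"flags", 0}, {"init", 16}}

	pairs, missing := matchByName(user, kern)
	fmt.Println("kernel offset of init:", pairs["init"][1].Offset)
	fmt.Println("missing from kernel struct:", missing)
}
```

In the real code the comparison would also look at the concrete BTF type of each member, which this sketch leaves out.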

@shun159 shun159 force-pushed the feature/struct-ops-2 branch 2 times, most recently from 4126573 to 0e82c52 Compare September 14, 2025 14:18
@lmb
Collaborator

lmb commented Sep 16, 2025

@shun159 please ping me explicitly when you need another review.

@shun159
Contributor Author

shun159 commented Sep 16, 2025

@shun159 please ping me explicitly when you need another review.

Hi @lmb, I believe I've addressed most of your earlier feedback.
However, after switching to AttachTo, it looks like elf_reader_test is now failing.
My assumption is that if the AttachTo string doesn't match the format introduced by this PR, we could simply ignore it, in which case the test should pass. I would appreciate your thoughts on how we should proceed.

https://github.com/cilium/ebpf/actions/runs/17712680837/job/50333512472?pr=1845

Collaborator

@lmb lmb left a comment

Looks pretty good! There are a couple of comments which you didn't address. Please go through the conversations and resolve them if you've actioned them.

To fix the CI breakage: for now it's ok to set AttachTo = "" if we're dealing with a struct ops program in TestLibBPFCompat.

@lmb
Collaborator

lmb commented Sep 16, 2025

Please rebase on top of #1865. The end result will be that you shouldn't have to call btf.LoadKernelSpec explicitly anymore.

@shun159 shun159 force-pushed the feature/struct-ops-2 branch from b19ded9 to f5f64b2 Compare September 17, 2025 11:09
@shun159
Contributor Author

shun159 commented Sep 17, 2025

Please rebase on top of #1865. The end result will be that you shouldn't have to call btf.LoadKernelSpec explicitly anymore.

done

@lmb lmb changed the title from "struct_ops: add structOpsMeta to carry BTF hints for StructOpsMap creation" to "add StructOpsMap support" Sep 18, 2025
@shun159
Contributor Author

shun159 commented Sep 18, 2025

@lmb Since the comment disappeared, I’ll respond by quoting below.

> +		return nil, fmt.Errorf("member %s not found in %s", innerName, to.Name)
+	}
+
+	kernIndexByName := make(map[string]btf.Member, len(inner.Members))
+	for _, m := range inner.Members {
+		kernIndexByName[m.Name] = m
+	}
+
+	for _, m := range from.Members {
+		if m.BitfieldSize > 0 {
+			return nil, fmt.Errorf("bitfield %s not supported", m.Name)
+		}
+
+		kernMember, ok := kernIndexByName[m.Name]
+		if !ok {
+			continue
Is this what libbpf does as well? Seems a bit prone to breakage.
> +
+		if kernMember.BitfieldSize > 0 {
+			return nil, fmt.Errorf("bitfield %s not supported in kern struct", kernMember.Name)
+		}
+
+		sz, err := btf.Sizeof(m.Type)
+		if err != nil {
+			return nil, fmt.Errorf("failed to resolve size of %s: %w", m.Name, err)
+		}
+
+		kernSz, err := btf.Sizeof(kernMember.Type)
+		if err != nil {
+			return nil, fmt.Errorf("failed to resolve size of %s: %w", kernMember.Name, err)
+		}
+
+		if sz != kernSz {
This check is pretty basic. Is that all that libbpf does as well?

Might make sense to call btf.CheckTypeCompatibility if libbpf does something similar.

From my understanding, libbpf seems to handle it as follows:

  1. if a field in the “user struct” is not found in the kernel struct and its value is non-zero, it results in an error.
  2. explicitly checks for matching type kinds.

We may need to keep the parsed result of the ELF somewhere.
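A rough sketch of the first rule, under stated assumptions: the `field` type and `checkUnmatched` helper are hypothetical and use plain byte offsets instead of BTF types. It is not libbpf's or this PR's implementation, and the kind check from the second rule is not covered.

```go
package main

import (
	"bytes"
	"fmt"
)

// field is a hypothetical description of one member of the user struct.
type field struct {
	Name   string
	Offset int // byte offset into the user struct data
	Size   int // size in bytes
}

// checkUnmatched returns an error if a user field that does not exist in
// the kernel struct carries a non-zero value in data.
func checkUnmatched(user []field, kernNames map[string]bool, data []byte) error {
	for _, f := range user {
		if kernNames[f.Name] {
			continue // field exists in the kernel struct, nothing to check here
		}
		if f.Offset < 0 || f.Size < 0 || f.Offset+f.Size > len(data) {
			return fmt.Errorf("field %s is out of bounds", f.Name)
		}
		raw := data[f.Offset : f.Offset+f.Size]
		if !bytes.Equal(raw, make([]byte, f.Size)) {
			return fmt.Errorf("field %s is set but missing from the kernel struct", f.Name)
		}
	}
	return nil
}

func main() {
	user := []field{{"flags", 0, 4}, {"extra", 4, 4}}
	kern := map[string]bool{"flags": true}

	// "extra" is unknown to the kernel but zero, so this is accepted.
	fmt.Println(checkUnmatched(user, kern, []byte{1, 0, 0, 0, 0, 0, 0, 0}))
	// "extra" is unknown to the kernel and non-zero, so this is rejected.
	fmt.Println(checkUnmatched(user, kern, []byte{1, 0, 0, 0, 9, 0, 0, 0}))
}
```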

@shun159
Contributor Author

shun159 commented Sep 18, 2025

At the moment, we only have a test that writes progFD into kern_vdata. I'd also like to add a test that checks whether scalar values can be written.

@shun159 shun159 force-pushed the feature/struct-ops-2 branch from 0072f5c to 8349ed9 Compare September 21, 2025 05:51
@shun159
Contributor Author

shun159 commented Sep 21, 2025

@lmb Since the comment disappeared, I’ll respond by quoting below.

> +		return nil, fmt.Errorf("member %s not found in %s", innerName, to.Name)
+	}
+
+	kernIndexByName := make(map[string]btf.Member, len(inner.Members))
+	for _, m := range inner.Members {
+		kernIndexByName[m.Name] = m
+	}
+
+	for _, m := range from.Members {
+		if m.BitfieldSize > 0 {
+			return nil, fmt.Errorf("bitfield %s not supported", m.Name)
+		}
+
+		kernMember, ok := kernIndexByName[m.Name]
+		if !ok {
+			continue
Is this what libbpf does as well? Seems a bit prone to breakage.
> +
+		if kernMember.BitfieldSize > 0 {
+			return nil, fmt.Errorf("bitfield %s not supported in kern struct", kernMember.Name)
+		}
+
+		sz, err := btf.Sizeof(m.Type)
+		if err != nil {
+			return nil, fmt.Errorf("failed to resolve size of %s: %w", m.Name, err)
+		}
+
+		kernSz, err := btf.Sizeof(kernMember.Type)
+		if err != nil {
+			return nil, fmt.Errorf("failed to resolve size of %s: %w", kernMember.Name, err)
+		}
+
+		if sz != kernSz {
This check is pretty basic. Is that all that libbpf does as well?

Might make sense to call btf.CheckTypeCompatibility if libbpf does something similar.

From my understanding, libbpf seems to handle it as follows:

  1. if a field in the “user struct” is not found in the kernel struct and its value is non-zero, it results in an error.
  2. explicitly checks for matching type kinds.

We may need to keep the parsed result of the ELF somewhere.

@lmb
From the VAR section, we can obtain information about each field of the “user struct” as shown below. By passing such a section to CollectionSpec or similar, it seems possible to achieve the same functionality. I’d appreciate your thoughts on this.

(We could store this in the MapKV of each struct_ops map.)

[10] VAR 'null_sched_init.____fmt' type_id=9, linkage=static
[11] STRUCT 'sched_ext_ops' size=424 vlen=39
        'select_cpu' type_id=12 bits_offset=0
        'enqueue' type_id=21 bits_offset=64
        'dequeue' type_id=21 bits_offset=128
        'dispatch' type_id=23 bits_offset=192
        'tick' type_id=25 bits_offset=256
        'runnable' type_id=21 bits_offset=320
        'running' type_id=25 bits_offset=384
        'stopping' type_id=27 bits_offset=448
        'quiescent' type_id=21 bits_offset=512
        'yield' type_id=29 bits_offset=576
        'core_sched_before' type_id=29 bits_offset=640
        'set_weight' type_id=31 bits_offset=704
        'set_cpumask' type_id=36 bits_offset=768
        'update_idle' type_id=41 bits_offset=832
        'cpu_acquire' type_id=43 bits_offset=896
        'cpu_release' type_id=47 bits_offset=960
        'init_task' type_id=51 bits_offset=1024
        'exit_task' type_id=55 bits_offset=1088
        'enable' type_id=25 bits_offset=1152
        'disable' type_id=25 bits_offset=1216
        'dump' type_id=59 bits_offset=1280
        'dump_cpu' type_id=63 bits_offset=1344
        'dump_task' type_id=65 bits_offset=1408
        'cgroup_init' type_id=67 bits_offset=1472
        'cgroup_exit' type_id=73 bits_offset=1536
        'cgroup_prep_move' type_id=75 bits_offset=1600
        'cgroup_move' type_id=77 bits_offset=1664
        'cgroup_cancel_move' type_id=77 bits_offset=1728
        'cgroup_set_weight' type_id=79 bits_offset=1792
        'cpu_online' type_id=81 bits_offset=1856
        'cpu_offline' type_id=81 bits_offset=1920
        'init' type_id=83 bits_offset=1984
        'exit' type_id=85 bits_offset=2048
        'dispatch_max_batch' type_id=33 bits_offset=2112
        'flags' type_id=18 bits_offset=2176
        'timeout_ms' type_id=33 bits_offset=2240
        'exit_dump_len' type_id=33 bits_offset=2272
        'hotplug_seq' type_id=18 bits_offset=2304
        'name' type_id=89 bits_offset=2368

@lmb
Collaborator

lmb commented Sep 22, 2025

My idea is to place STRUCT 'sched_ext_ops' into MapSpec.Value. Would that not work?

@shun159
Contributor Author

shun159 commented Sep 22, 2025

My idea is to place STRUCT 'sched_ext_ops' into MapSpec.Value. Would that not work?

@lmb Ah! I see, does something like the following match what you have in mind?
(This is an scx example; apologies if any details aren’t exact)

ms := &MapSpec{
    Name: "scx_null",
    Type: StructOpsMap,
    Value: &btf.Struct{
        Name: "sched_ext_ops",
        // ... parsed from the ELF reader
        Members: []btf.Member{
            {Name: "init", Type: ..., Offset: ...},
            // ... remaining members
        },
    },
}

I think this would work. However, in practice the map data will be the bytes of the value type, so in this change I used the value type name for Value. In the example above we’d be putting the user struct there, which I think could be a bit confusing.

@shun159
Contributor Author

shun159 commented Sep 23, 2025

From bpf_map__init_kern_struct_ops, it appears that libbpf uses the BTF generated from the ELF for type validation and for offsets when copying, etc. For example, for type-size validation, it seems to read and check against the BTF corresponding to what loadSpecFromELF(file) would produce as btf.Spec.

Therefore, I think we need a way to pass the BTF generated from the ELF into NewCollectionWithOptions(). It’s probably not sufficient to pass struct members.
I’d appreciate your thoughts.

@shun159
Contributor Author

shun159 commented Sep 26, 2025

My idea is to place STRUCT 'sched_ext_ops' into MapSpec.Value. Would that not work?

@lmb done. it works.

@shun159
Contributor Author

shun159 commented Sep 29, 2025

@lmb Please review it when you have time.

shun159 added a commit to shun159/ebpf that referenced this pull request Sep 29, 2025
This commit adds struct_ops support to the ELF reader: it classifies non-executable PROGBITS sections,
parses their BTF Datasec to build MapSpecs, associates relocs with func-pointer members to
set ps.AttachTo, and adds TestStructOps.

Related: cilium#1845

Signed-off-by: shun159 <[email protected]>
@shun159
Contributor Author

shun159 commented Oct 1, 2025

From bpf_map__init_kern_struct_ops, it appears that libbpf uses the BTF generated from the ELF for type validation and for offsets when copying, etc. For example, for type-size validation, it seems to read and check against the BTF corresponding to what loadSpecFromELF(file) would produce as btf.Spec.

Therefore, I think we need a way to pass the BTF generated from the ELF into NewCollectionWithOptions(). It’s probably not sufficient to pass struct members. I’d appreciate your thoughts.

Correction: this is not needed. Copying userStruct into ms.Value will be sufficient.

@lmb
Collaborator

lmb commented Oct 10, 2025

CI failures are probably because the PR hasn't been rebased, not exactly sure.

@lmb lmb force-pushed the feature/struct-ops-2 branch from 49faf10 to a87298a Compare October 15, 2025 10:39
@lmb
Collaborator

lmb commented Oct 15, 2025

Squashed and rebased.

lmb
lmb previously approved these changes Oct 15, 2025
@lmb
Collaborator

lmb commented Oct 15, 2025

I need to rebase this once #1879 is in so that I can test that it doesn't break windows.

Member

@dylandreimerink dylandreimerink left a comment

Sorry to be a bit late to the party. I did find a few issues that should be addressed.

Most of those are comments on the code. Additionally I think PR #1869 is an integral part of this work. It is only with both PRs that you can go from ELF to loading a program.

I had to create a local branch and merge both PR branches into it before I could try loading some of the kernel selftest programs, which is how I found some of the issues pointed out.

I mostly tested with https://elixir.bootlin.com/linux/v6.17.1/source/tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c

Comment on lines +389 to +394
if spec.Type == StructOps {
	attachTo, targetMember, _ = strings.Cut(attachTo, ":")
	if targetMember == "" {
		return nil, fmt.Errorf("struct_ops: AttachTo must be '<ops>:<member>' (got %s)", spec.AttachTo)
	}
}
Member

This is not consistent with the standard that libbpf sets out. In this implementation a user is required to make a section like so:

SEC("struct_ops/tcp_congestion_ops:undo_cwnd")

But the libbpf standard is that the name after the struct_ops/ does not matter. Instead, the structure and member are based on how a function is used in the map value:

SEC("struct_ops/write_sk_pacing_undo_cwnd")
__u32 BPF_PROG(write_sk_pacing_undo_cwnd, struct sock *sk)
{
	return tcp_sk(sk)->snd_cwnd;
}

SEC(".struct_ops")
struct tcp_congestion_ops write_sk_pacing = {
	.undo_cwnd = (void *)write_sk_pacing_undo_cwnd,
	.name = "bpf_w_sk_pacing",
};

So we should use the map value BTF to figure out this info so that we are ELF compatible with libbpf.

Collaborator

  • Discussed this with @shun159 before: the ELF reader will fudge up AttachTo to conform to the format we expect based on the struct definition. We'll throw away the section information.
  • The map value BTF is not enough to capture this, the information is only in relocation entries.

The design constraint here is that we need to know the struct ops type name and member name when loading the program (sigh), so this info has to be in ProgramSpec.
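As an illustration of the "<ops>:<member>" convention discussed here, a minimal standalone sketch follows. The helper name `splitStructOpsTarget` is hypothetical and the code is independent of the library; it only shows how such a string can be split and validated with `strings.Cut`.

```go
package main

import (
	"fmt"
	"strings"
)

// splitStructOpsTarget splits an AttachTo string of the form
// "<ops>:<member>" into the struct ops type name and the member name.
func splitStructOpsTarget(attachTo string) (string, string, error) {
	typeName, memberName, ok := strings.Cut(attachTo, ":")
	if !ok || typeName == "" || memberName == "" {
		return "", "", fmt.Errorf("struct_ops: AttachTo must be '<ops>:<member>' (got %q)", attachTo)
	}
	return typeName, memberName, nil
}

func main() {
	// The format the ELF reader is expected to synthesize from relocations.
	fmt.Println(splitStructOpsTarget("tcp_congestion_ops:undo_cwnd"))
	// A libbpf-style section suffix carries no type or member information.
	fmt.Println(splitStructOpsTarget("write_sk_pacing_undo_cwnd"))
}
```

Rejecting strings without a separator keeps a libbpf-style section suffix from being silently treated as a type name.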

Member

Ack. I guess those changes are just not in #1869 yet, which is what I was testing with. It currently just passes the section straight through. If we do want to keep this as two separate PRs we should ensure that gets fixed there.

Collaborator

@shun159 please update the ELF reader PR based on this.

Comment on lines +792 to +799
// Populate the map explicitly and keep a reference on cl.programs.
// This is necessary because we may inline fds into kernVData which
// may become invalid if the GC frees them.
if err := m.Put(uint32(0), kernVData); err != nil {
	return err
}
mapSpec.Contents = nil
runtime.KeepAlive(cl.programs)
Member

This creates really unintuitive/unexpected behavior. The act of writing a value to a struct_ops_map that has not been created with the F_LINK flag attaches it. So the moment you run ebpf.NewCollection(spec), you have attached.

Even worse, when attached the refcount of the map is incremented. When you call coll.Close() the map value is not deleted, and thus the struct ops programs currently stay attached.

That is not expected by users, since for all other program types you need to explicitly attach using the link package.

I do not think this is necessarily a bad approach, but only if we always load the struct ops map with F_LINK. Loading with F_LINK makes it so writing to the map does not do anything, and attaching is managed via an actual BPF link.

Caveats are that F_LINK was added to the kernel later on, so we would limit the minimum kernel version requirements. And also normally this is user controllable by specifying SEC("struct_ops") vs SEC("struct_ops.link").

Collaborator

Great catch! I think for now we should force BPF_F_LINK by only allowing the ELF reader to consume .link sections. Tests should also only do F_LINK. I don't see how to make the non-link semantics work, they are just too broken / special. Maybe people need to "manually" create such maps if they need further backwards compat.

Collaborator

Unit tests already use BPF_F_LINK, so no change needed here.

Member

@dylandreimerink dylandreimerink Oct 16, 2025

I think for now we should force BPF_F_LINK by only allowing the ELF reader to consume .link sections. Tests should also only do F_LINK. I don't see how to make the non-link semantics work, they are just too broken / special. Maybe people need to "manually" create such maps if they need further backwards compat.

Yes, agreed. Let's throw an error if we do not see BPF_F_LINK when loading the map. Worst case, people will need to add the flag to the spec manually in Go.

If we do go that route, we are missing the logic to create the link. We can add that in a followup PR. So this PR adds the logic to load a spec. #1869 has the logic to go from ELF to spec. And some future PR will add the link logic to attach the loaded program/map. Sound good?

Collaborator

Yup that is the plan.

Add preliminary support for struct_ops maps. The MapSpec of a
StructOps map has to follow a particular format:

- MapSpec.Value: contains a Struct which matches the in-kernel
  struct. The name of the type is used to find the in-kernel
  equivalent.
- MapSpec.Contents[0]: contains the default values for
  non-function-pointer fields.

Programs inserted into a StructOpsMap require special treatment
as well. During load we need to specify which struct and field
member to attach to. For this purpose we overload
ProgramSpec.AttachTo: it must contain a string in the form
"type_name:field_name".

This commit does not yet enable full struct_ops support since
we are still missing changes to the ELF reader and package link.

See: cilium#1502
Signed-off-by: shun159 <[email protected]>
Co-developed-by: Lorenz Bauer <[email protected]>
@lmb lmb force-pushed the feature/struct-ops-2 branch from a87298a to 4250dab Compare October 16, 2025 14:33
@lmb lmb requested a review from dylandreimerink October 16, 2025 14:51
@lmb lmb dismissed dylandreimerink’s stale review October 16, 2025 15:40

Discussed in PR review

@lmb lmb merged commit 8f23ed6 into cilium:main Oct 16, 2025
30 of 33 checks passed
@shun159
Contributor Author

shun159 commented Oct 16, 2025

thx @lmb and @dylandreimerink for your kind support!

shun159 added a commit to shun159/ebpf that referenced this pull request Oct 16, 2025
@shun159 shun159 deleted the feature/struct-ops-2 branch October 16, 2025 15:54
shun159 added a commit to shun159/ebpf that referenced this pull request Oct 27, 2025
lmb pushed a commit to shun159/ebpf that referenced this pull request Oct 31, 2025
lmb pushed a commit that referenced this pull request Oct 31, 2025