map: reduce allocations for batch operations #1513
Signed-off-by: Florian Lehner <[email protected]>
That is an impressive speed up!
}

// UnmarshalWithLimit unmarshals the buffer up to limit into the provided value.
func (b Buffer) UnmarshalWithLimit(data any, limit int) error {
Could we instead have an API like:
// Shrink reduces the size of the buffer.
//
// Panics if size is less than 0 or if it is larger than the actual size.
func (b Buffer) Shrink(size int)
Then we can just do Buffer.Shrink(count * size) unconditionally and don't need to check for count == 0 elsewhere.
Not exactly sure what to do for zero copy buffers? Nothing?
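As a rough illustration of the proposed API, here is a minimal sketch. The Buffer type below is a hypothetical stand-in with a single byte slice, not the library's actual type, and it uses a pointer receiver (an assumption) so that the new length is visible to the caller. It does not attempt to answer the zero-copy question.

```go
package main

import "fmt"

// Buffer is a hypothetical stand-in for the buffer type under discussion.
// It only wraps a byte slice and ignores the zero-copy case.
type Buffer struct {
	data []byte
}

// Shrink reduces the size of the buffer.
//
// Panics if size is less than 0 or if it is larger than the actual size.
func (b *Buffer) Shrink(size int) {
	if size < 0 || size > len(b.data) {
		panic(fmt.Sprintf("shrink: size %d out of range [0, %d]", size, len(b.data)))
	}
	// Reslicing keeps the backing array; only the visible length changes.
	b.data = b.data[:size]
}

func main() {
	b := &Buffer{data: make([]byte, 64)}
	count, elemSize := 3, 8
	b.Shrink(count * elemSize)
	fmt.Println(len(b.data)) // prints 24
}
```

With this shape, Shrink(0 * size) is valid and leaves an empty buffer, so the caller needs no separate count == 0 branch.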
Buffer.Shrink() sounds like a permanent change to the buffer and I'm not sure about the benefits. It could free some memory a few moments earlier (depending on when the GC kicks in).
The early check for count == 0 here doesn't look expensive to me. With Buffer.Shrink(0 * size) we would call sysenc.Unmarshal() with a buf of size 0. Here I would prefer the earlier return, which avoids special-case handling in sysenc.Unmarshal() for len(buf) == 0.
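The early-return variant argued for above can be sketched as follows. Everything here is hypothetical: unmarshalBatch, the uint32 element type, and the little-endian decoding are assumptions made for illustration, not the library's code. The decodeCalls counter only exists to show that the decoder is skipped for an empty batch.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeCalls counts how often the decoding step actually runs.
var decodeCalls int

// unmarshalBatch is a hypothetical helper that decodes count 4-byte
// little-endian values from buf into out.
func unmarshalBatch(buf []byte, out []uint32, count int) error {
	// Early return: nothing was returned, so skip decoding entirely.
	if count == 0 {
		return nil
	}
	if count < 0 || count > len(out) || count*4 > len(buf) {
		return fmt.Errorf("count %d out of range", count)
	}
	decodeCalls++
	for i := 0; i < count; i++ {
		out[i] = binary.LittleEndian.Uint32(buf[i*4:])
	}
	return nil
}

func main() {
	buf := []byte{1, 0, 0, 0, 2, 0, 0, 0}
	out := make([]uint32, 2)

	_ = unmarshalBatch(buf, out, 0)
	fmt.Println(decodeCalls) // prints 0

	_ = unmarshalBatch(buf, out, 2)
	fmt.Println(out, decodeCalls) // prints [1 2] 1
}
```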
I reverted the proposed changes and implemented the (Buffer) Shrink(..) option as suggested and compared it against current main (old.txt):
$ benchstat old.txt new.txt
name old time/op new time/op delta
Iterate/Hash/MapIterator-16 1.15ms ± 4% 1.13ms ± 5% ~ (p=0.190 n=9+9)
Iterate/Hash/MapIteratorDelete-16 1.73ms ± 2% 1.73ms ± 3% ~ (p=0.796 n=10+10)
Iterate/Hash/BatchLookup-16 2.62µs ±10% 2.64µs ±12% ~ (p=0.853 n=10+10)
Iterate/Hash/BatchLookupAndDelete-16 85.1µs ± 2% 85.9µs ± 2% ~ (p=0.247 n=10+10)
Iterate/Hash/BatchDelete-16 77.3µs ± 3% 75.4µs ± 2% -2.49% (p=0.004 n=10+10)
Iterate/PerCPUHash/MapIterator-16 3.21ms ±12% 3.22ms ± 7% ~ (p=0.912 n=10+10)
Iterate/PerCPUHash/MapIteratorDelete-16 6.18ms ±10% 6.23ms ±13% ~ (p=0.780 n=9+10)
Iterate/PerCPUHash/BatchLookup-16 2.33ms ± 8% 2.17ms ±16% ~ (p=0.243 n=9+10)
Iterate/PerCPUHash/BatchLookupAndDelete-16 3.18ms ±11% 3.06ms ± 5% ~ (p=0.218 n=10+10)
Iterate/PerCPUHash/BatchDelete-16 368µs ± 3% 365µs ± 4% ~ (p=0.529 n=10+10)
name old alloc/op new alloc/op delta
Iterate/Hash/MapIterator-16 24.1kB ± 0% 24.1kB ± 0% ~ (p=0.306 n=10+10)
Iterate/Hash/MapIteratorDelete-16 107B ± 0% 107B ± 0% ~ (all equal)
Iterate/Hash/BatchLookup-16 136B ± 0% 136B ± 0% ~ (all equal)
Iterate/Hash/BatchLookupAndDelete-16 144B ± 0% 144B ± 0% ~ (all equal)
Iterate/Hash/BatchDelete-16 24.0B ± 0% 24.0B ± 0% ~ (all equal)
Iterate/PerCPUHash/MapIterator-16 152kB ± 0% 152kB ± 0% -0.00% (p=0.003 n=9+10)
Iterate/PerCPUHash/MapIteratorDelete-16 281kB ± 0% 281kB ± 0% ~ (p=0.926 n=10+10)
Iterate/PerCPUHash/BatchLookup-16 155kB ± 0% 155kB ± 0% ~ (p=0.468 n=10+10)
Iterate/PerCPUHash/BatchLookupAndDelete-16 156kB ± 0% 156kB ± 0% ~ (p=0.134 n=10+10)
Iterate/PerCPUHash/BatchDelete-16 24.0B ± 0% 24.0B ± 0% ~ (all equal)
name old allocs/op new allocs/op delta
Iterate/Hash/MapIterator-16 1.00k ± 0% 1.00k ± 0% ~ (all equal)
Iterate/Hash/MapIteratorDelete-16 4.00 ± 0% 4.00 ± 0% ~ (all equal)
Iterate/Hash/BatchLookup-16 5.00 ± 0% 5.00 ± 0% ~ (all equal)
Iterate/Hash/BatchLookupAndDelete-16 5.00 ± 0% 5.00 ± 0% ~ (all equal)
Iterate/Hash/BatchDelete-16 1.00 ± 0% 1.00 ± 0% ~ (all equal)
Iterate/PerCPUHash/MapIterator-16 2.00k ± 0% 2.00k ± 0% ~ (all equal)
Iterate/PerCPUHash/MapIteratorDelete-16 3.00k ± 0% 3.00k ± 0% ~ (all equal)
Iterate/PerCPUHash/BatchLookup-16 1.01k ± 0% 1.01k ± 0% ~ (all equal)
Iterate/PerCPUHash/BatchLookupAndDelete-16 1.01k ± 0% 1.01k ± 0% ~ (all equal)
Iterate/PerCPUHash/BatchDelete-16 1.00 ± 0% 1.00 ± 0% ~ (all equal)
From this I learned that the Unmarshal() function in (Buffer) Unmarshal() causes allocations even if there are 0 elements to unmarshal. Therefore, I suggest dropping the "map: allow partial unmarshaling" commit and just going with the early return via "map: skip unmarshal if attr.Count == 0". WDYT @lmb?
I guess that is some weird interaction between resizing the buffer and ebpf/internal/sysenc/marshal.go, line 98 in ccdd12c:
if dataBuf := unsafeBackingMemory(data); len(dataBuf) == len(buf) {
Is the special case for count == 0 actually important in practice? I'd imagine we only hit it at the end of a batch iteration?
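One way to picture the guess above: if a fast path is only taken when the destination's backing memory has exactly the same length as buf, then shrinking buf makes the lengths differ and the call falls through to some other path. The sketch below is a simplified model of that idea only. backing and unmarshal are hypothetical helpers for []uint32, not the library's unsafeBackingMemory or sysenc.Unmarshal, and it assumes Go 1.21+ for binary.NativeEndian.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unsafe"
)

// backing returns the raw bytes behind s (hypothetical helper).
func backing(s []uint32) []byte {
	if len(s) == 0 {
		return nil
	}
	return unsafe.Slice((*byte)(unsafe.Pointer(&s[0])), len(s)*4)
}

// unmarshal copies buf into dst and reports whether the fast path was used.
func unmarshal(dst []uint32, buf []byte) bool {
	// Fast path: lengths match exactly, so a plain copy is enough.
	if mem := backing(dst); len(mem) == len(buf) {
		copy(mem, buf)
		return true
	}
	// Fallback: decode element by element, as far as both sides allow.
	n := len(buf) / 4
	if n > len(dst) {
		n = len(dst)
	}
	for i := 0; i < n; i++ {
		dst[i] = binary.NativeEndian.Uint32(buf[i*4:])
	}
	return false
}

func main() {
	dst := make([]uint32, 4)
	buf := make([]byte, 16)
	fmt.Println(unmarshal(dst, buf))     // prints true
	fmt.Println(unmarshal(dst, buf[:8])) // prints false
}
```

Whether the fallback in the real code is what allocates is not established in this thread; the sketch only shows how a resized buffer can miss a length-equality check.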
Yeah, count == 0 looks like a special case that might be hit at the end of batch operations, depending on the batch size and the number of elements in the map.
For the moment, I think it is best to close this.
Reduce allocations for batch operations by
This should help to resolve #1080.