perf: Suppress telemetry using ContextFlags(usize) instead of bool #2861

bantonsson · 2025-03-25T14:43:41Z

Changes

This PR aligns the Context struct and changes a bool into a flag field. It tries to mitigate the performance impact of #2821 on context attach/detach operations.
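To illustrate the idea, here is a minimal sketch of a word-sized flag field replacing a `bool`. It is not the PR's actual code: the `Context` here is a simplified stand-in, and the bit name `SUPPRESS_TELEMETRY`, its position, and the method names are assumptions chosen to mirror the benchmark names in this thread.

```rust
/// Hypothetical word-sized flag field (an assumption for illustration,
/// not the definition used in the PR).
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct ContextFlags(usize);

impl ContextFlags {
    /// Assumed bit position for the telemetry suppression flag.
    const SUPPRESS_TELEMETRY: usize = 1 << 0;

    /// Returns true if the suppression bit is set.
    pub const fn is_telemetry_suppressed(self) -> bool {
        (self.0 & Self::SUPPRESS_TELEMETRY) != 0
    }

    /// Returns a copy of the flags with the suppression bit set.
    pub const fn with_telemetry_suppressed(self) -> Self {
        ContextFlags(self.0 | Self::SUPPRESS_TELEMETRY)
    }
}

/// Simplified stand-in for the real `Context`; all other fields are omitted.
#[derive(Clone, Debug, Default)]
pub struct Context {
    flags: ContextFlags,
}

impl Context {
    /// Checks the suppression bit of this context.
    pub fn is_telemetry_suppressed(&self) -> bool {
        self.flags.is_telemetry_suppressed()
    }

    /// Returns a new context with suppression enabled; `self` is left unchanged.
    pub fn with_telemetry_suppressed(&self) -> Self {
        Context {
            flags: self.flags.with_telemetry_suppressed(),
        }
    }
}
```

A `usize` field has the same size and alignment as a machine word, so the flag no longer introduces a one-byte field followed by padding. Further flags could share the same word by claiming other bits.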

Merge requirement checklist

CONTRIBUTING guidelines followed
Unit tests added/updated (if applicable)
Appropriate CHANGELOG.md files updated for non-trivial, user-facing changes
Changes in public API reviewed (if applicable)

codecov · 2025-03-25T14:47:40Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.6%. Comparing base (353bbb0) to head (55cbad0).

Additional details and impacted files

@@          Coverage Diff          @@
##            main   #2861   +/-   ##
=====================================
  Coverage   80.6%   80.6%           
=====================================
  Files        126     126           
  Lines      22195   22201    +6     
=====================================
+ Hits       17898   17904    +6     
  Misses      4297    4297

scottgerring · 2025-03-26T12:06:59Z

👷 build bot output from this run:

name	baseDuration	changesDuration	difference (%)
context_attach/nested_cx/empty_cx	31.7±0.14ns	19.1±0.27ns	-40
context_attach/nested_cx/single_value_cx	33.7±0.22ns	19.4±0.44ns	-42
context_attach/nested_cx/span_cx	33.0±0.29ns	19.4±0.30ns	-41
context_attach/out_of_order_cx_drop/empty_cx	38.7±0.18ns	19.8±0.19ns	-49
context_attach/out_of_order_cx_drop/single_value_cx	40.1±0.13ns	20.8±0.48ns	-48
context_attach/out_of_order_cx_drop/span_cx	39.7±0.22ns	20.8±0.32ns	-48
context_attach/single_cx/empty_cx	18.1±0.22ns	12.5±0.31ns	-31
context_attach/single_cx/single_value_cx	17.6±0.08ns	12.5±0.13ns	-29
context_attach/single_cx/span_cx	17.3±0.13ns	12.5±0.25ns	-28
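
The difference column appears to be the relative change of the mean duration in percent. That reading is an assumption based on the numbers above, not something the build bot output states. A small sketch of the arithmetic, with a hypothetical helper name:

```rust
/// Relative change from `base_ns` to `changes_ns` in percent.
/// Negative values mean the changed code is faster.
/// (Hypothetical helper; the build bot's exact formula is not shown in the thread.)
pub fn percent_change(base_ns: f64, changes_ns: f64) -> f64 {
    (changes_ns - base_ns) / base_ns * 100.0
}
```

For the first row, `percent_change(31.7, 19.1)` is about -39.7, which rounds to the reported -40.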

bantonsson · 2025-03-27T10:25:19Z

~~These are the benchmark numbers from this run:~~

These numbers are not relevant since the code has changed completely.

bantonsson · 2025-03-28T14:59:46Z

This is still a draft until #2870 has been merged and all benchmarks are run properly.

bantonsson · 2025-04-04T07:39:40Z

New performance numbers 2025-05-06 from this run:

name	baseDuration	changesDuration	difference (%)
`context/has_active_span/in-cx/alt`	`8.4±0.04ns`	`8.4±0.07ns`	`0.0`
`context/has_active_span/in-cx/spec`	`5.0±0.12ns`	`5.0±0.12ns`	`0.0`
`context/has_active_span/no-cx/alt`	`8.4±0.02ns`	`8.4±0.03ns`	`0.0`
`context/has_active_span/no-cx/spec`	`5.0±0.23ns`	`4.7±0.14ns`	`-6.5`
`context/has_active_span/no-sdk/alt`	`8.4±0.05ns`	`8.4±0.03ns`	`0.0`
`context/has_active_span/no-sdk/spec`	`5.0±0.31ns`	`4.7±0.34ns`	`-6.5`
`context/is_recording/in-cx/alt`	`4.7±0.14ns`	`4.7±0.14ns`	`0.0`
`context/is_recording/in-cx/spec`	`7.5±0.28ns`	`7.5±0.30ns`	`0.0`
`context/is_recording/no-cx/alt`	`4.7±0.14ns`	`4.7±0.19ns`	`0.0`
`context/is_recording/no-cx/spec`	`7.2±0.26ns`	`7.2±0.14ns`	`0.0`
`context/is_recording/no-sdk/alt`	`4.7±0.14ns`	`4.7±0.10ns`	`0.0`
`context/is_recording/no-sdk/spec`	`7.2±0.30ns`	`7.2±0.25ns`	`0.0`
`context/is_sampled/in-cx/alt`	`8.7±0.04ns`	`8.7±0.04ns`	`0.0`
`context/is_sampled/in-cx/spec`	`5.4±0.13ns`	`5.3±0.16ns`	`-0.99`
`context/is_sampled/no-cx/alt`	`8.7±0.04ns`	`8.7±0.28ns`	`0.0`
`context/is_sampled/no-cx/spec`	`5.0±0.29ns`	`5.0±0.37ns`	`0.0`
`context/is_sampled/no-sdk/alt`	`8.7±0.06ns`	`8.7±0.02ns`	`0.0`
`context/is_sampled/no-sdk/spec`	`5.0±0.18ns`	`5.0±0.24ns`	`0.0`
`context_attach/nested_cx/empty_cx`	`47.4±1.14ns`	`39.3±2.27ns`	`-17`
`context_attach/nested_cx/single_value_cx`	`48.4±1.10ns`	`42.8±1.22ns`	`-12`
`context_attach/nested_cx/span_cx`	`48.3±0.68ns`	`42.6±1.47ns`	`-12`
`context_attach/out_of_order_cx_drop/empty_cx`	`41.6±0.49ns`	`40.7±1.16ns`	`-2.0`
`context_attach/out_of_order_cx_drop/single_value_cx`	`42.5±3.01ns`	`42.3±2.05ns`	`0.0`
`context_attach/out_of_order_cx_drop/span_cx`	`42.7±1.23ns`	`42.1±1.82ns`	`-0.99`
`context_attach/single_cx/empty_cx`	`24.1±0.43ns`	`19.5±0.67ns`	`-19`
`context_attach/single_cx/single_value_cx`	`23.9±2.92ns`	`23.4±0.60ns`	`-2.0`
`context_attach/single_cx/span_cx`	`23.1±0.58ns`	`23.1±0.63ns`	`0.0`
`telemetry_suppression/enter_telemetry_suppressed_scope`	`27.2±0.67ns`	`25.2±0.27ns`	`-7.4`
`telemetry_suppression/is_current_telemetry_suppressed_false`	`1.4±0.02ns`	`1.3±0.05ns`	`-5.7`
`telemetry_suppression/is_current_telemetry_suppressed_true`	`1.4±0.02ns`	`1.3±0.02ns`	`-6.5`
`telemetry_suppression/normal_attach`	`30.5±0.63ns`	`28.2±1.04ns`	`-7.4`

cijothomas · 2025-04-04T15:25:33Z

@bantonsson can you run the benchmarks on your machine and see whether you observe the same? I am seeing a regression on my laptop. The attach benchmarks improve anyway, so we should still proceed with this PR, but I am curious how much we can trust the bench results from the CI machines!

telemetry_suppression/enter_telemetry_suppressed_scope
time: [10.170 ns 10.198 ns 10.224 ns]
change: [+8.3229% +8.8793% +9.4188%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
2 (2.00%) low severe
3 (3.00%) low mild
1 (1.00%) high severe
telemetry_suppression/normal_attach
time: [11.386 ns 11.440 ns 11.495 ns]
change: [+9.0850% +9.5688% +10.094%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
7 (7.00%) low mild
2 (2.00%) high mild
telemetry_suppression/is_current_telemetry_suppressed_false
time: [729.10 ps 731.32 ps 733.54 ps]
change: [-2.4232% -1.9738% -1.5230%] (p = 0.00 < 0.05)
Performance has improved.
Found 20 outliers among 100 measurements (20.00%)
4 (4.00%) low severe
8 (8.00%) low mild
4 (4.00%) high mild
4 (4.00%) high severe
telemetry_suppression/is_current_telemetry_suppressed_true
time: [730.30 ps 732.13 ps 734.16 ps]
change: [-1.7477% -1.2922% -0.7619%] (p = 0.00 < 0.05)
Change within noise threshold.

bantonsson · 2025-04-07T07:44:21Z

@cijothomas These are the numbers from my M1 Max Laptop, where I see a slight regression in checks and large improvements in entering.

Benchmarking telemetry_suppression/enter_telemetry_suppressed_scope: Collecting 100 samples in estimated 2.0001 s (135M i
telemetry_suppression/enter_telemetry_suppressed_scope
                        time:   [14.479 ns 14.511 ns 14.549 ns]
                        change: [-21.316% -20.968% -20.611%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  2 (2.00%) low mild
  2 (2.00%) high mild
  7 (7.00%) high severe
telemetry_suppression/normal_attach
                        time:   [15.431 ns 15.483 ns 15.550 ns]
                        change: [-18.631% -18.300% -17.951%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  6 (6.00%) low mild
  1 (1.00%) high mild
  5 (5.00%) high severe
Benchmarking telemetry_suppression/is_current_telemetry_suppressed_false: Collecting 100 samples in estimated 2.0000 s (1
telemetry_suppression/is_current_telemetry_suppressed_false
                        time:   [1.1357 ns 1.1436 ns 1.1545 ns]
                        change: [+2.1229% +2.7044% +3.3306%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe
Benchmarking telemetry_suppression/is_current_telemetry_suppressed_true: Collecting 100 samples in estimated 2.0000 s (1.
telemetry_suppression/is_current_telemetry_suppressed_true
                        time:   [1.1311 ns 1.1367 ns 1.1438 ns]
                        change: [+1.1998% +1.6296% +2.0577%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 23 outliers among 100 measurements (23.00%)
  2 (2.00%) low severe
  3 (3.00%) low mild
  4 (4.00%) high mild
  14 (14.00%) high severe

And these are the numbers from my AMD Ryzen 5 3600 2.2GHz box, where I see a slight improvement in checks and large improvements in entering.

Benchmarking telemetry_suppression/enter_telemetry_suppressed_scope: Collecting 100 samples in estimated 2.0001 s (85M iteratio
telemetry_suppression/enter_telemetry_suppressed_scope
                        time:   [23.078 ns 23.081 ns 23.083 ns]
                        change: [-16.623% -16.588% -16.558%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe
telemetry_suppression/normal_attach
                        time:   [24.008 ns 24.024 ns 24.039 ns]
                        change: [-17.823% -17.631% -17.412%] (p = 0.00 < 0.05)
                        Performance has improved.
Benchmarking telemetry_suppression/is_current_telemetry_suppressed_false: Collecting 100 samples in estimated 2.0000 s (2.8B it
telemetry_suppression/is_current_telemetry_suppressed_false
                        time:   [715.65 ps 715.73 ps 715.82 ps]
                        change: [-1.7774% -1.7447% -1.7127%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  3 (3.00%) high severe
Benchmarking telemetry_suppression/is_current_telemetry_suppressed_true: Collecting 100 samples in estimated 2.0000 s (2.8B ite
telemetry_suppression/is_current_telemetry_suppressed_true
                        time:   [715.54 ps 715.63 ps 715.73 ps]
                        change: [-3.6408% -3.0627% -2.5393%] (p = 0.00 < 0.05)
                        Performance has improved.

bantonsson · 2025-05-06T16:39:50Z

The benchmark numbers comment has been updated with the latest run.

bantonsson · 2025-05-14T12:01:04Z

It would be great if there could be some extra manual testing of this @cijothomas @lalitb so we can come to a conclusion.

cijothomas · 2025-05-14T14:35:42Z

telemetry_suppression/enter_telemetry_suppressed_scope
                        time:   [10.502 ns 10.524 ns 10.546 ns]
                        change: [+12.347% +12.742% +13.119%] (p = 0.00 < 0.05)
                        Performance has regressed.
telemetry_suppression/normal_attach
                        time:   [11.106 ns 11.133 ns 11.166 ns]
                        change: [+8.4063% +8.8465% +9.3006%] (p = 0.00 < 0.05)
                        Performance has regressed.
telemetry_suppression/is_current_telemetry_suppressed_false
                        time:   [743.58 ps 744.81 ps 746.44 ps]
                        change: [-0.6749% -0.3983% -0.1260%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 21 outliers among 100 measurements (21.00%)
  8 (8.00%) low severe
  3 (3.00%) low mild
  3 (3.00%) high mild
  7 (7.00%) high severe
telemetry_suppression/is_current_telemetry_suppressed_true
                        time:   [743.35 ps 744.08 ps 745.10 ps]
                        change: [-0.4459% -0.1688% +0.0707%] (p = 0.22 > 0.05)
                        No change in performance detected.
Found 20 outliers among 100 measurements (20.00%)
  3 (3.00%) low severe
  6 (6.00%) low mild
  5 (5.00%) high mild
  6 (6.00%) high severe

I'm seeing a consistent ~10% regression for enter and no change for checks.

bantonsson · 2025-05-14T15:30:59Z

That is interesting @cijothomas. What hardware are you running on?

cijothomas · 2025-05-14T22:28:24Z

> That is interesting @cijothomas. What hardware are you running on?

This was on an M4 Pro. I can try on a Windows box and report back.

cijothomas · 2025-05-22T15:42:56Z

@lalitb @utpilla could you also help run the benches for this to see if there is consistent improvement/regression?

lalitb · 2025-05-22T17:15:56Z

PR branch:

$ cargo bench --bench context_suppression
    Finished `bench` profile [optimized + debuginfo] target(s) in 0.13s
     Running benches/context_suppression.rs (/tmp/tbd_a/opentelemetry-rust/target/release/deps/context_suppression-3de0ed5c3fd35dc6)
Gnuplot not found, using plotters backend
telemetry_suppression/enter_telemetry_suppressed_scope
                        time:   [26.002 ns 26.039 ns 26.080 ns]
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe
telemetry_suppression/normal_attach
                        time:   [26.186 ns 26.201 ns 26.218 ns]
Found 11 outliers among 100 measurements (11.00%)
  1 (1.00%) low severe
  2 (2.00%) low mild
  3 (3.00%) high mild
  5 (5.00%) high severe
telemetry_suppression/is_current_telemetry_suppressed_false
                        time:   [1.2984 ns 1.3000 ns 1.3016 ns]
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  3 (3.00%) high severe
telemetry_suppression/is_current_telemetry_suppressed_true
                        time:   [1.2981 ns 1.3005 ns 1.3030 ns]
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild
  2 (2.00%) high severe

Machine Conf:

$ lscpu | grep -i core
Model name:                           AMD EPYC 7763 64-Core Processor
Thread(s) per core:                   2
Core(s) per socket:                   8

$ cat /proc/meminfo | grep -E '^MemTotal'
MemTotal:       65794236 kB

$ uname -a
Linux CPC-labha-5U0JP 6.6.36.3-microsoft-standard-WSL2 #1 SMP PREEMPT_DYNAMIC Sat Jun 29 07:01:04 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

bantonsson · 2025-08-25T10:44:12Z

I can't seem to get consistently good benchmark numbers across the different architectures, so I'll close this work for now.

bantonsson · 2025-08-26T13:39:42Z

Just reopening for another small test.

The code seems to be highly sensitive to alignment, so use a bitfield instead of a boolean.
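
One way to examine the alignment claim is to compare the layout of a struct that carries a `bool` with one that carries a word-sized flag field. The two structs below are hypothetical stand-ins; the real `Context` has different fields, so this only sketches how to inspect the layout.

```rust
use std::mem::{align_of, size_of};
use std::sync::Arc;

/// Hypothetical stand-in: a pointer-sized field plus a one-byte flag.
#[allow(dead_code)]
struct WithBool {
    entries: Option<Arc<()>>,
    suppress_telemetry: bool,
}

/// Hypothetical stand-in: the same pointer-sized field plus a word-sized flag field.
#[allow(dead_code)]
struct WithFlags {
    entries: Option<Arc<()>>,
    flags: usize,
}

/// Returns (type name, size in bytes, alignment in bytes) for both variants.
pub fn layout_report() -> Vec<(&'static str, usize, usize)> {
    vec![
        ("WithBool", size_of::<WithBool>(), align_of::<WithBool>()),
        ("WithFlags", size_of::<WithFlags>(), align_of::<WithFlags>()),
    ]
}
```

Printing the entries of `layout_report()` on each target shows whether the `bool` variant ends in padding bytes. Whether that padding explains the benchmark differences in this thread is not established here.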

bantonsson force-pushed the ban/context-suppression branch 3 times, most recently from d79a782 to dc038f9 Compare March 25, 2025 15:36

scottgerring added the performance label Mar 26, 2025

scottgerring closed this Mar 26, 2025

scottgerring reopened this Mar 26, 2025

bantonsson force-pushed the ban/context-suppression branch from dc038f9 to 06eb4a1 Compare March 27, 2025 08:10

bantonsson marked this pull request as ready for review March 27, 2025 10:28

bantonsson requested a review from a team as a code owner March 27, 2025 10:28

bantonsson force-pushed the ban/context-suppression branch 4 times, most recently from e3aca49 to ad3ad40 Compare March 28, 2025 09:14

bantonsson marked this pull request as draft March 28, 2025 14:58

bantonsson force-pushed the ban/context-suppression branch 6 times, most recently from baad2fe to 01874de Compare April 3, 2025 15:41

bantonsson changed the title ~~perf: Optimize cloning of Context since it is immutable~~ perf: Suppress telemetry using ContextFlags(usize) instead of bool Apr 4, 2025

bantonsson force-pushed the ban/context-suppression branch from 01874de to 3416528 Compare April 4, 2025 07:40

bantonsson marked this pull request as ready for review April 4, 2025 07:42

bantonsson force-pushed the ban/context-suppression branch from fe37e71 to 2f628ee Compare April 7, 2025 07:54

bantonsson mentioned this pull request Apr 9, 2025

REQUEST: New membership for @bantonsson open-telemetry/community#2649

Closed

6 tasks

lalitb self-assigned this Apr 29, 2025

bantonsson force-pushed the ban/context-suppression branch from 2f628ee to ca8789c Compare May 6, 2025 14:32

bantonsson force-pushed the ban/context-suppression branch from ca8789c to 38ff659 Compare May 14, 2025 11:59

bantonsson force-pushed the ban/context-suppression branch 2 times, most recently from 774c7ab to 0f9717a Compare August 25, 2025 09:37

bantonsson closed this Aug 25, 2025

bantonsson reopened this Aug 26, 2025

perf: Suppress telemetry using ContextFlags(usize) instead of bool

55cbad0

The code seems to be highly sensitive to alignment, so use a bitfield instead of a boolean.

bantonsson force-pushed the ban/context-suppression branch from 0f9717a to 55cbad0 Compare August 26, 2025 13:40
