-
Notifications
You must be signed in to change notification settings - Fork 392
[PROF-11311] Remove datadog_profiling_loader #4356
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
**What does this PR do?** This PR removes the `datadog_profiling_loader` native extension, as it's no longer needed. The profiling native loader was added in #2003 . Quoting from that PR's description: > When Ruby loads a native extension (using `require`), it uses > `dlopen(..., RTLD_LAZY | RTLD_GLOBAL)` > (https://github.com/ruby/ruby/blob/67950a4c0a884bdb78d9beb4405ebf7459229b21/dln.c#L362). > > This means that every symbol exposed directly or indirectly by that > native extension becomes visible to every other extension in the > Ruby process. This can cause issues, see > rubyjs/mini_racer#179 . > > Ruby's extension loading mechanism is not configurable -- there's > no way to tell it to use different flags when calling `dlopen`. > To get around this, this commit introduces introduces another > extension (profiling loader) which has only a single responsibility: > mimic Ruby's extension loading mechanism, but when calling > `dlopen` use a different set of flags. > > This idea was shamelessly stolen from @lloeki's work in > rubyjs/mini_racer#179, big thanks! ...and importantly ends with: > Note that, that we know of, the profiling native extension only > exposes one potentially problematic symbol: > `rust_eh_personality` (coming from libddprof/libdatadog). > > Future versions of Rust have been patched not to expose this > (see rust-lang/rust#95604 (comment)) > so we may want to revisit the need for this loader in the future, > and perhaps delete it if we no longer require its services :) And we have reached the situation predicted in that description: 1. Nowadays libdatadog no longer exposes `rust_eh_personality` 2. We have a test that validates that only expected symbols are exported by the libdatadog library (see DataDog/libdatadog#573 ). Any new symbols that show up would break shipping new libdatadog versions to rubygems.org until we review them. 3. The `libdatadog_api` extension, which we've been shipping for customers since release 2.3.0 back in July 2024 has always been loaded directly without a loader without issues. Thus, I think it's the right time to get rid of the loader. **Motivation:** Having the loader around is not zero cost; we've run into/caused a few issues ( #2250 and #3582 come to mind). It also adds overhead to development: every time we need to rebuild the extensions, it's an extra extension that needs to be prepared, rebuilt, copied, etc. I've been meaning to get rid of the loader for some time now, and this came up again this week so I decided it was time to do it. **Additional Notes:** In the future, we can always ressurrect this approach if we figure out we need it again. We've also discussed internally about proposing upstream to the Ruby VM to maybe add an API to do what the loader was doing, so that we wouldn't need the weird workaround. We haven't yet decided if we're going to do that (help welcome!). **How to test the change?** The loader was responsible for loading the rest of profiling. Thus, any test that uses profiling was also validating the loader and now will validate that we're doing fine without it. TL;DR green CI is good.
Datadog ReportBranch report: ✅ 0 Failed, 22075 Passed, 1477 Skipped, 6m 23.79s Total Time |
BenchmarksBenchmark execution time: 2025-02-07 11:14:50 Comparing candidate commit db67572 in PR branch Found 1 performance improvements and 0 performance regressions! Performance is the same for 30 metrics, 2 unstable metrics. scenario:profiler - sample timeline=false
|
This was an interesting one... Once we removed the profiling loader extension, our tests using Ractors started failing with: > Ractor::UnsafeError: ractor unsafe method called from not main ractor When I started investigating with gdb, I discovered that because we were initializing our extension without going through Ruby, we were skipping this part: https://github.com/ruby/ruby/blob/7178593558080ca529abb61ef27038236ab2687d/load.c#L1301-L1302 : ```c ext_config_push(th, &prev_ext_config); handle = rb_vm_call_cfunc(rb_vm_top_self(), load_ext, path, VM_BLOCK_HANDLER_NONE, path); ``` that is, the `ext_config_push` is what's used by Ruby to validate if a gem is safe to use from Ractors or not. (If you're curious, that value then affects function definition, which controls wether Ruby will check or not for being called from a Ractor). Because we were skipping this entire mechanism, we implicitly were getting the same result as `rb_ext_ractor_safe(true)`. Once we removed the loader, this started failing. And I missed it before opening my PR since for reasons documented in the profiler ractor tests in detail, we don't run ractor-related tests by default. So this issue is one more reason why having the loader may create its own set of issues and why I'm happy to get rid of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
Can you tell it's Friday?
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #4356 +/- ##
==========================================
- Coverage 97.73% 97.72% -0.01%
==========================================
Files 1368 1367 -1
Lines 83014 82963 -51
Branches 4221 4213 -8
==========================================
- Hits 81134 81077 -57
- Misses 1880 1886 +6 ☔ View full report in Codecov by Sentry. |
What does this PR do?
This PR removes the
datadog_profiling_loader
native extension, as it's no longer needed.The profiling native loader was added in
#2003 .
Quoting from that PR's description:
...and importantly ends with:
And we have reached the situation predicted in that description:
Nowadays libdatadog no longer exposes
rust_eh_personality
We have a test that validates that only expected symbols are exported by the libdatadog library (see [NO-TICKET] Validate that all public symbols are prefixed in Ruby releases libdatadog#573 ).
Any new symbols that show up would break shipping new libdatadog versions to rubygems.org until we review them.
The
libdatadog_api
extension, which we've been shipping for customers since release 2.3.0 back in July 2024 has always been loaded directly without a loader without issues.Thus, I think it's the right time to get rid of the loader.
Motivation:
Having the loader around is not zero cost; we've run into/caused a few issues ( #2250 and #3582 come to mind).
It also adds overhead to development: every time we need to rebuild the extensions, it's an extra extension that needs to be prepared, rebuilt, copied, etc.
I've been meaning to get rid of the loader for some time now, and this came up again this week so I decided it was time to do it.
Change log entry
None.
Additional Notes:
Note that none if this is visible to customers -- the loader was just an additional layer we used when loading the profiler.
In the future, we can always resurrect this approach if we figure out we need it again.
We've also discussed internally about proposing upstream to the Ruby VM to maybe add an API to do what the loader was doing, so that we wouldn't need the weird workaround.
We haven't yet decided if we're going to do that (help welcome!).
How to test the change?
The loader was responsible for loading the rest of profiling.
Thus, any test that uses profiling was also validating the loader and now will validate that we're doing fine without it.
TL;DR green CI is good.