-
Notifications
You must be signed in to change notification settings - Fork 391
DEBUG-3573 send AppClientConfigurationChange telemetry event when subsequent telemetry instances start #4678
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Thank you for updating Change log entry section 👏 Visited at: 2025-05-27 21:11:12 UTC |
Datadog ReportBranch report: ✅ 0 Failed, 21706 Passed, 1299 Skipped, 4m 28.29s Total Time |
8757579
to
a616bbd
Compare
a616bbd
to
0af0a62
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #4678 +/- ##
==========================================
- Coverage 97.64% 97.64% -0.01%
==========================================
Files 1470 1474 +4
Lines 87685 87829 +144
Branches 4544 4552 +8
==========================================
+ Hits 85624 85760 +136
- Misses 2061 2069 +8 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
721be22
to
8196384
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change is a very good one, but I think we should consider moving it into the components tree instead of introducing new abstraction somewhere else.
class ComponentsState | ||
def initialize(components) | ||
@telemetry_enabled = components.telemetry.enabled | ||
@remote_started = components.remote&.started? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to convert it into the boolean, otherwise we may return nil
irb(main):001> nil&.started?
=> nil
@remote_started = components.remote&.started? | |
@remote_started = !!components.remote&.started? | |
`` |
module Core | ||
module Configuration | ||
# Stores the state of component tree when replacing the tree. | ||
class ComponentsState |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a question about design decision here.
I see the ease of use of ComponentsState as a separate abstraction which is able to analyze some details about the components tree, but why it's not a part of the components tree itself?
A good example of the current design flaw is this part
telemetry.emit_closing! unless replacement&.telemetry&.enabled
when inside the components itself we have to use directly some internals of the other components tree, instead of asking the components tree itself (violation of encapsulation), like this:
telemetry.emit_closing! unless replacement.telemetry_enabled?
IMHO we should incorporate the state into components tree and ask questions, instead of introducing the missing API somewhere else.
What does this PR do?
When tracer is reconfigured (via
Datadog.configure
), and telemetry is enabled in both previous and new configurations, telemetry will now emitAppClientConfigurationChange
from the second instance when previously no event was emitted.Motivation:
Previously, only
AppStarted
event was emitted for programmatic and environment configuration, and only the first time the tracer was configured. This caused configuration state to not be sent to the backend whenDatadog.configure
was used to enable a product (for example, dynamic instrumentation).Change log entry
Yes: improve dynamic instrumentation UI reporting of application and host status
Additional Notes:
AppClientConfigurationChange
currently sends the complete configuration (same as whatAppStarted
would have sent). Ideally it should only send changed parameters.This PR does not change behavior of tracer when telemetry is turned on or off. For example, turning telemetry off would NOT presently produce
AppClosing
event (assuming tracer is reconfigured rather than shut down).Besides
AppClientConfigurationChange
, there is alsoAppIntegrationsChange
. This event is not changed by this PR, it is still not being sent on second and subsequent telemetry initializations. It probably should be sent in theory but there is no immediate demand for it. (And it should also send changed state, not all state.)This PR also makes two fixes to telemetry:
Previously, submitting events to telemetry would discard them if the initial event (used to be AppStarted, now it's either AppStarted or
AppClientConfigurationChange
) hasn't been sent yet. Now the submitted events are added to the queue and will be sent out after the initial event. The queue is (and was) already bounded. Since the initial event can fail and will be retried, there is a window of what looks to be 50 seconds where events could have been getting discarded).The initial event is sent when the worker is performing sending of some other event. Previously, If initial event fails to send, the other event would be dropped. Now the dequeued event is put back into the queue. (Currently the dequeued event is added to the end of the queue which is not the correct place for it, to avoid shifting all other events down one.)
How to test the change?
Integration tests have been added that verify the
AppClientConfigurationChange
is being sent when dynamic instrumentation is enabled programmatically along with several other smaller scope tests.The change is also tested manually.