Skip to content

feat(trace-exporter): handle exceptions when creating Trace Exporter in TracerManager #6925

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 96 commits into
base: ganeshnj/feature/datapipeline-bindings
Choose a base branch
from

Conversation

ganeshnj
Copy link
Contributor

@ganeshnj ganeshnj commented May 5, 2025

Summary of changes

Guard against library API or load failures.

Reason for change

We don't want to just fail but fallback on managed trace exporter.

Implementation details

Test coverage

Other details

ganeshnj added 30 commits March 12, 2025 15:36
# Conflicts:
#	tracer/build/_build/Build.Steps.cs
# Conflicts:
#	tracer/test/Datadog.Trace.IntegrationTests/ContainerTaggingTests.cs
* Updated the `libdatadog` package reference in both `Console.csproj` and `Datadog.Trace.csproj`.
* Added the new package `libdatadog.60141209.0.0.nupkg`.
ganeshnj added 14 commits May 6, 2025 14:27
## Summary of changes

I noticed that from the benchmark runs, there no telemetry events are
being produced. We want to pass the same telemetry configs as we have in
the Tracer to native side to be able to produce telemetry events.

## Reason for change

There is no telemetry currently being produced for trace exporter.

## Implementation details

* Introduced `TelemetryClientConfiguration` struct for telemetry
settings.
* Updated `TraceExporterConfiguration` to include telemetry
configuration.
* Added P/Invoke method `EnableTelemetry` in `NativeInterop` for
enabling telemetry.

## Test coverage

None

## Other details
<!-- Fixes #{issue} -->

<!-- ⚠️ Note: where possible, please obtain 2 approvals prior to
merging. Unless CODEOWNERS specifies otherwise, for external teams it is
typically best to have one review from a team member, and one review
from apm-dotnet. Trivial changes do not require 2 reviews. -->
@ganeshnj ganeshnj force-pushed the ganeshnj/feat/guard-against-lib-failure branch from a496c90 to 603d4f4 Compare May 12, 2025 13:26
@ganeshnj ganeshnj force-pushed the ganeshnj/feat/guard-against-lib-failure branch from 603d4f4 to d4e5171 Compare May 22, 2025 11:41
Copy link
Member

@andrewlock andrewlock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@@ -415,7 +415,7 @@ _ when x.ToBoolean() is { } boolean => boolean,

DataPipelineEnabled = config
.WithKeys(ConfigurationKeys.TraceDataPipelineEnabled)
.AsBool(defaultValue: true);
.AsBool(defaultValue: false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to leave this as true just for now, so that we keep testing the datapipeline-binding branch with this enabled everywhere by default? I'm happy if you want to disable this now, but maybe it makes sense to move that to a separate tiny PR that we just merge at the last moment?

if (!error.IsInvalid)
{
var ex = error.ToException();
_log.Error(ex, "An error occurred while sending data to the agent. Error Code: {ErrorCode}, message: {Message}", ex.ErrorCode, ex.Message);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realise it's not a change from this PR, but I'm wondering if we should make a slight tweak here so that the ErrorCode is properly embedded in the string instead of using structured logging 🤔 That way we will see the error code in telemetry (otherwise we will only see that an error did occur, not what it was). Given that we know the error code is just a number, it's "safe" to do

e.g. something like this:

Suggested change
_log.Error(ex, "An error occurred while sending data to the agent. Error Code: {ErrorCode}, message: {Message}", ex.ErrorCode, ex.Message);
_log.Error(ex, "An error occurred while sending data to the agent. Error Code: " + ex.ErrorCode.ToString(CultureInfo.InvariantCulture) + ", message: {Message}", ex.Message);

We'd have to add a #pragma to disable the linting rule that prevents you doing that, but might be worth it. WDYT?

_log.Error(ex, "An error occurred while sending data to the agent. Error Code: {ErrorCode}, message: {Message}", ex.ErrorCode, ex.Message);
throw ex;
using var error = NativeInterop.Exporter.Send(this, tracesSlice, (UIntPtr)numberOfTraces, ref responsePtr);
if (!error.IsInvalid)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still find this so confusing 😅 error.IsInvalid means there wasn't an error?!

@ganeshnj ganeshnj force-pushed the ganeshnj/feature/datapipeline-bindings branch from 15af3ce to 34cbacb Compare May 27, 2025 09:23
@ganeshnj ganeshnj requested review from a team as code owners May 27, 2025 09:23
@ganeshnj ganeshnj force-pushed the ganeshnj/feature/datapipeline-bindings branch from 34cbacb to 89bc0ec Compare May 27, 2025 11:03
/// <param name="str">The string to copy into memory.</param>
internal CharSlice(string? str)
{
if (str == null)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (str == null)
if (string.IsNullOrEmpty(str))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants