Fix OtelLogRecordProcessor schedule #11315

Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 2 commits into master from mcculls/fix-otlp-log-schedule on May 8, 2026

Conversation

@mcculls
Contributor

@mcculls mcculls commented May 7, 2026

What Does This Do

  • if we have enough logs for a batch, keep sending batched logs
  • otherwise wait for the interval to elapse, then send what we have
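
The two rules above can be sketched as a single wait loop. This is a minimal illustration under stated assumptions, not the actual OtelLogRecordProcessor code: the class `BatchScheduleSketch`, the method `nextBatch`, and the use of a `BlockingQueue<String>` to stand in for log records are all hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the schedule; not the actual OtelLogRecordProcessor code.
final class BatchScheduleSketch {

  /**
   * Returns as soon as maxBatchSize logs are available; otherwise returns
   * whatever arrived before intervalNanos elapsed (possibly an empty list).
   */
  static List<String> nextBatch(BlockingQueue<String> queue, int maxBatchSize, long intervalNanos) {
    List<String> batch = new ArrayList<>(maxBatchSize);
    long deadline = System.nanoTime() + intervalNanos;
    try {
      while (true) {
        // Take anything already queued without blocking.
        queue.drainTo(batch, maxBatchSize - batch.size());
        if (batch.size() >= maxBatchSize) {
          break; // full batch: send now, do not wait for the interval
        }
        long waitNanos = deadline - System.nanoTime();
        if (waitNanos <= 0) {
          break; // interval elapsed: send what we have
        }
        String log = queue.poll(waitNanos, TimeUnit.NANOSECONDS);
        if (log == null) {
          break; // timed out while waiting
        }
        batch.add(log);
      }
    } catch (InterruptedException ignore) {
      // Fall through and return the partial batch collected so far.
    }
    return batch;
  }

  public static void main(String[] args) {
    BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    for (int i = 0; i < 5; i++) {
      queue.add("log-" + i);
    }
    // Enough logs for a batch: returns without waiting for the 5 s interval.
    System.out.println(nextBatch(queue, 3, TimeUnit.SECONDS.toNanos(5)));
    // Only two logs left: returns them once the 50 ms interval elapses.
    System.out.println(nextBatch(queue, 3, TimeUnit.MILLISECONDS.toNanos(50)));
  }
}
```

A caller that loops over `nextBatch` keeps exporting back to back while the queue stays ahead of the batch size, and falls back to one export per interval when traffic is light.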

Motivation

Better alignment with OpenTelemetry SDK behaviour: https://github.com/open-telemetry/opentelemetry-java/blob/main/sdk/logs/src/main/java/io/opentelemetry/sdk/logs/export/BatchLogRecordProcessor.java#L206

Contributor Checklist

Jira ticket: [PROJ-IDENT]

Note: Once your PR is ready to merge, add it to the merge queue by commenting /merge. /merge -c cancels the queue request. /merge -f --reason "reason" skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.

@mcculls mcculls added the type: bug (Bug report and fix) and inst: opentelemetry (OpenTelemetry instrumentation) labels May 7, 2026
@mcculls mcculls marked this pull request as ready for review May 7, 2026 23:11
@mcculls mcculls requested review from a team as code owners May 7, 2026 23:11
@mcculls mcculls requested review from dougqh and mtoffl01 and removed request for a team May 7, 2026 23:11
      logsReady.poll(waitNanos, TimeUnit.NANOSECONDS);
    }
  } catch (InterruptedException ignore) {
    break;
Contributor

As general practice, we should reset the interrupt status after catching InterruptedException.

Contributor Author

@mcculls mcculls May 8, 2026

In general yes - but only if we want to preserve and report the interrupted status to callers, which is not the case here.

In this situation we want to bail out of the method and return what we've batched so far (so it isn't dropped). The caller may then decide to invoke waitForLogs again, and we don't want the interrupted status to be kept, because poll would then immediately fail again.

We could potentially let the InterruptedException bubble up, but that would lead to a loss of logs if we're interrupting because of shutdown, where we want to interrupt any blocking while waiting for enough logs for the batch but still have one last attempt to flush what we have.
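
The point about a kept interrupted status making the next poll fail immediately can be checked in isolation. The sketch below is illustrative only: `InterruptStatusSketch` and `pollInterrupted` are hypothetical names, and a `LinkedBlockingQueue` stands in for whatever `logsReady` actually is in the PR.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical demo of how the interrupt flag affects a later timed poll.
final class InterruptStatusSketch {

  /**
   * Polls the queue once. Returns true if the poll was cut short by an
   * interrupt, false if it returned normally (item received or timeout).
   */
  static boolean pollInterrupted(BlockingQueue<String> queue, long timeoutMillis, boolean restoreFlag) {
    try {
      queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
      return false;
    } catch (InterruptedException e) {
      if (restoreFlag) {
        // Conventional handling: hand the interrupt back to the caller.
        Thread.currentThread().interrupt();
      }
      return true;
    }
  }

  public static void main(String[] args) {
    BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    // Flag swallowed: the second poll waits out its timeout normally.
    Thread.currentThread().interrupt();
    System.out.println("swallowed, 1st poll interrupted: " + pollInterrupted(queue, 10, false));
    System.out.println("swallowed, 2nd poll interrupted: " + pollInterrupted(queue, 10, false));

    // Flag restored: every later poll fails straight away.
    Thread.currentThread().interrupt();
    System.out.println("restored, 1st poll interrupted: " + pollInterrupted(queue, 10, true));
    System.out.println("restored, 2nd poll interrupted: " + pollInterrupted(queue, 10, true));

    Thread.interrupted(); // clear the flag before exiting
  }
}
```

With the flag restored, a caller that re-invokes the wait method would spin without ever blocking, which is the behaviour the comment above argues against for this particular loop.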

Contributor Author

I'll add a comment to make this particular situation clear for future devs.

    thread.interrupt();
    try {
      thread.join(1_000);
    } catch (InterruptedException ignore) {
Contributor

I'm okay with forgoing resetting the interrupt here, since we're mid-shutdown.
However, a comment explaining that would be good.
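
For context, the shutdown path discussed in this thread can be sketched end to end: interrupt the worker so it stops blocking, let it make one last flush, and bound the wait with a timed join. Everything here is an assumption made for illustration (the class `ShutdownFlushSketch`, its fields, and the in-memory `exported` list standing in for a real exporter); only the `interrupt()` / `join(1_000)` shape comes from the quoted diff.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical worker that flushes pending logs when interrupted for shutdown.
final class ShutdownFlushSketch {
  private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
  private final List<List<String>> exported = new CopyOnWriteArrayList<>();
  private final int batchSize;
  private final long intervalNanos;
  private volatile boolean running = true;
  private final Thread thread;

  ShutdownFlushSketch(int batchSize, long intervalNanos) {
    this.batchSize = batchSize;
    this.intervalNanos = intervalNanos;
    this.thread = new Thread(this::run, "log-export-sketch");
    this.thread.setDaemon(true);
    this.thread.start();
  }

  void add(String log) {
    queue.offer(log);
  }

  List<List<String>> exported() {
    return exported;
  }

  private void run() {
    while (running) {
      List<String> batch = waitForLogs();
      if (!batch.isEmpty()) {
        exported.add(batch);
      }
    }
    // One last attempt to flush whatever is still queued.
    List<String> rest = new ArrayList<>();
    queue.drainTo(rest);
    if (!rest.isEmpty()) {
      exported.add(rest);
    }
  }

  private List<String> waitForLogs() {
    List<String> batch = new ArrayList<>();
    long deadline = System.nanoTime() + intervalNanos;
    try {
      while (batch.size() < batchSize) {
        long waitNanos = deadline - System.nanoTime();
        if (waitNanos <= 0) {
          break;
        }
        String log = queue.poll(waitNanos, TimeUnit.NANOSECONDS);
        if (log == null) {
          break;
        }
        batch.add(log);
      }
    } catch (InterruptedException ignore) {
      // Interrupt status deliberately not restored: return the partial batch
      // so it is not dropped, and let the caller decide whether to wait again.
    }
    return batch;
  }

  void shutdown() {
    running = false;
    thread.interrupt(); // wake the worker if it is blocked in poll
    try {
      thread.join(1_000); // give the final flush up to one second
    } catch (InterruptedException ignore) {
      // Mid-shutdown: nothing further to do with the interrupt here.
    }
  }
}
```

Setting `running = false` before calling `interrupt()` matters in this sketch: whichever point the worker has reached, its next check of `running` sees the shutdown and proceeds to the final drain.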

Contributor

@dougqh dougqh left a comment

Looks good
Some minor comments about interrupt handling

@mcculls mcculls enabled auto-merge May 8, 2026 15:35
@mcculls mcculls added this pull request to the merge queue May 8, 2026
@dd-octo-sts
Contributor

dd-octo-sts Bot commented May 8, 2026

/merge

@gh-worker-devflow-routing-ef8351

gh-worker-devflow-routing-ef8351 Bot commented May 8, 2026

View all feedbacks in Devflow UI.

2026-05-08 16:01:51 UTC ℹ️ Start processing command /merge


2026-05-08 16:01:56 UTC ℹ️ MergeQueue: pull request added to the queue

The expected merge time in master is approximately 1h (p90).


2026-05-08 17:11:28 UTC ℹ️ MergeQueue: This merge request was merged

@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 8, 2026
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot merged commit be0f519 into master May 8, 2026
571 checks passed
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot deleted the mcculls/fix-otlp-log-schedule branch May 8, 2026 17:11
@github-actions github-actions Bot added this to the 1.63.0 milestone May 8, 2026

Labels

inst: opentelemetry (OpenTelemetry instrumentation)
type: bug (Bug report and fix)

2 participants