Move ObservedEvent into crates/telemetry, consolidated with self_tracing::LogRecord #1818

jmacd · 2026-01-17T03:10:46Z

The ObservedEvent has associated flume channels and a connection with the existing metrics and admin component which make it an appealing way to transport log events in the engine.

Move PipelineKey, DeployedPipelineKey, CoreId types into crates/config.

Therefore, moving ObservedEvent into crates/telemetry lets us (optionally) use the same channel already used for lifecycle events for tokio log records. The existing event structure is extended with an EventMessage enum which supports None, String, or LogRecord messages. This way we can use a log record as the event message for all existing event types. The event.rs file moves; only ObservedEventRingBuffer from that file remains in crates/state.
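
A minimal sketch of the shape described above, for readers who have not opened the diff. The variant names `None` and `Log` appear later in this conversation; the `String` variant name, the `LogRecord` stand-in, its `encoded` field, the `name` field, and the `render` helper are assumptions made for illustration and are not taken from the PR.

```rust
use std::time::SystemTime;

/// Hypothetical stand-in for `self_tracing::LogRecord`; the real type's
/// fields are not shown in this PR description.
#[derive(Debug, Clone, PartialEq)]
pub struct LogRecord {
    /// Pre-encoded body and key/values (assumed layout).
    pub encoded: Vec<u8>,
}

/// The three message shapes named in the description.
#[derive(Debug, Clone, PartialEq)]
pub enum EventMessage {
    None,
    String(String),
    Log(LogRecord),
}

/// Simplified event: the timestamp lives here, not in `LogRecord`.
#[derive(Debug, Clone)]
pub struct ObservedEvent {
    pub time: SystemTime,
    pub name: &'static str, // hypothetical field
    pub message: EventMessage,
}

impl ObservedEvent {
    pub fn new(name: &'static str, message: EventMessage) -> Self {
        Self { time: SystemTime::now(), name, message }
    }
}

/// Formatting is deferred to the consumer, which matches on the message kind.
pub fn render(event: &ObservedEvent) -> String {
    match &event.message {
        EventMessage::None => event.name.to_string(),
        EventMessage::String(s) => format!("{}: {}", event.name, s),
        EventMessage::Log(r) => {
            format!("{}: <{} encoded bytes>", event.name, r.encoded.len())
        }
    }
}

fn main() {
    let e = ObservedEvent::new("pipeline.ready", EventMessage::String("core 0".into()));
    println!("{:?} {}", e.time, render(&e));
}
```

Because every event type carries the same `EventMessage`, a producer can switch from a formatted string to an encoded record without changing the channel or the event type.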

The LogRecord has been storing a timestamp. Now, we leave that to the ObservedEvent struct. A SystemTime is passed alongside the LogRecord everywhere it is used; callers generally compute this and pass it in. Minor cleanup in self_tracing/formatter.rs: do not pass SavedCallsite, since it can be calculated from record metadata as needed.

In internal_events, the raw_error! macro has been replaced with a helper to generate LogRecord values first, by level. This lets us pass info_event!("string", key=value) to any of the event constructors and construct an OTLP bytes message instead of a String message.
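
As a rough illustration of the "helper by level" idea, the sketch below builds a record from a name and `key = value` pairs. It is not the PR's implementation: the real macros go through Tokio tracing callsites and produce OTLP bytes, while this version keeps the fields in a plain `Vec`. The `Level` enum, the `LogRecord` fields, and the `log_record!` helper are all hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    Error,
    Info,
}

/// Hypothetical, simplified record: no OTLP encoding, just collected fields.
#[derive(Debug, Clone, PartialEq)]
pub struct LogRecord {
    pub level: Level,
    pub name: &'static str,
    pub fields: Vec<(&'static str, String)>,
}

/// Shared helper (assumed name): builds a record at the given level.
macro_rules! log_record {
    ($level:expr, $name:expr $(, $key:ident = $value:expr)* $(,)?) => {
        LogRecord {
            level: $level,
            name: $name,
            fields: vec![$((stringify!($key), $value.to_string())),*],
        }
    };
}

/// Level-specific wrappers, mirroring the `info_event!` / `error_event!` names.
macro_rules! info_event {
    ($($args:tt)*) => { log_record!(Level::Info, $($args)*) };
}

macro_rules! error_event {
    ($($args:tt)*) => { log_record!(Level::Error, $($args)*) };
}

pub fn example_info() -> LogRecord {
    info_event!("pipeline.started", core = 3, reason = "deploy")
}

pub fn example_error() -> LogRecord {
    error_event!("pipeline.failed")
}

fn main() {
    println!("{:?}", example_info());
    println!("{:?}", example_error());
}
```

The point of the wrapper layer is that the level is fixed by the macro name, so an event constructor can accept the resulting record without knowing how it was built.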


codecov · 2026-01-17T03:13:01Z

Codecov Report

❌ Patch coverage is 52.88462% with 98 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.72%. Comparing base (ede3e17) to head (7f9b088).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #1818   +/-   ##
=======================================
  Coverage   84.72%   84.72%           
=======================================
  Files         498      499    +1     
  Lines      146419   146467   +48     
=======================================
+ Hits       124048   124090   +42     
- Misses      21837    21843    +6     
  Partials      534      534

Components	Coverage Δ
otap-dataflow	`86.16% <52.88%> (+<0.01%)`	⬆️
query_abstraction	`80.61% <ø> (ø)`
query_engine	`90.52% <ø> (ø)`
syslog_cef_receivers	`∅ <ø> (∅)`
otel-arrow-go	`53.50% <ø> (ø)`
quiver	`90.66% <ø> (ø)`


lquerel

I have a few questions that I don't think are necessarily blocking. I believe I will need the other PRs before having a complete, end to end view of the approach.

The main point that seems important to me to guarantee in the long term is that we must have a way to compile the engine in a mode where only vital events are compiled in, and all others are completely eliminated. I don't have the impression that this is in place yet, so I'm waiting to see how this will be implemented in the following PRs.

rust/otap-dataflow/crates/telemetry/src/event.rs

lquerel · 2026-01-19T17:32:47Z

rust/otap-dataflow/crates/telemetry/src/internal_events.rs

 macro_rules! raw_error {
    ($name:expr $(, $($fields:tt)*)?) => {{
-        use $crate::self_tracing::{ConsoleWriter, RawLoggingLayer};
+        use $crate::self_tracing::ConsoleWriter;
+        let now = std::time::SystemTime::now();
+        let record = $crate::error_event!($name $(, $($fields)*)?);
+        ConsoleWriter::no_color().print_log_record(now, &record);
+    }};
+}


If I understand correctly, we should use raw_error! with great care, since there is no way to eliminate its cost, either at compile time or at runtime. Is that correct?
Also, for println!, we rely on Clippy to catch it in CI (and prevent it from ending up in regular code). So what should we do for raw_error!?

Yes. There will be very few of these, for cases where all else fails. Is it difficult to configure clippy with a similar rule for raw_error? This statement is the same as println/eprintln but with the structured syntax of Tokio tracing and use of our code path.
I can see an argument that having raw_error become an eprintln!() statement makes sense; my preference is to use the same instrumentation interface everywhere, so my goal is to replace log::error! (formatting) with tracing::error! or otel_error! (structured) and eprintln! (formatting) with raw_error! (structured).

lquerel · 2026-01-19T17:38:22Z

rust/otap-dataflow/crates/telemetry/src/internal_events.rs

+        let callsite = record.callsite();
+        assert_eq!(*callsite.level(), Level::INFO);
+        assert_eq!(callsite.name(), "test.event");


Is the line number returned by the callsite correct?
I'm asking because I'm not sure I understand how the call site behaves with the nesting of the macros defined earlier.

I added a test. These macros are designed for the kind of nesting here, IIUC.
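
For context on the nesting question, here is a small std-only sketch. It assumes the callsite captures its location with `line!()`-style built-ins; the macro names are hypothetical. The standard library documents that `line!()` reports the line of the first macro invocation leading up to it, so a wrapper macro still yields the line of the outermost call.

```rust
/// Innermost macro: captures the line the way a callsite macro might.
macro_rules! capture_line {
    () => {
        line!()
    };
}

/// Outer wrapper, standing in for a level-specific macro such as `info_event!`.
macro_rules! wrapped_capture_line {
    () => {
        capture_line!()
    };
}

/// Returns true when the nested macro reports the line of the outermost call.
pub fn nested_line_matches_call_site() -> bool {
    let expected = line!() + 1; // the next line is the call under test
    let got = wrapped_capture_line!();
    got == expected
}

fn main() {
    assert!(nested_line_matches_call_site());
    println!("nested macro reported the outer call-site line");
}
```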


jmacd · 2026-01-19T19:24:37Z

long term is that we must have a way to compile the engine in a mode where only vital events are compiled in, and ...

I can see your concern. I showed how to use special macros to encode LogRecord (i.e., structured body and key/values) with static callsite, but the current lifecycle events are dispatched unconditionally. I think you would prefer to see structured errors, but I also see formatting happening in the existing events.

My goals here are:

- The admin thread can be assigned the role of collecting LogRecords from subscribers. This is for the ConsoleAsync mode of internal logging (the fallback when not using ITS, i.e., not using an internal pipeline). This means we can send logs on the same channel as the current lifecycle events.
- To suggest that anywhere you see an Option with a format!(...) body, you might instead use an info_event!(...) macro to encode OTLP bytes and use that instead of formatting, deferring formatting to the consumer, if it happens at all.

Answering your concern at the top, this leaves a slight misalignment. We're using Tokio macros to construct LogRecords, which are the encoded parts, and the Metadata, i.e., callsite data. Your question is about whether we can compile-time disable these events.

Yes, we can, but not as written. Tokio macros have support for compile-time disablement based on Level, which is part of every callsite's metadata. The fact that we have error_event! and info_event! is to suggest that we could choose to compile with only error events, but we'd have to modify ObservedEvent to treat those as the EventMessage::None value vs the EventMessage::Log event.
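
One way this could look, as a sketch only: a compile-time maximum level decides whether a macro yields `EventMessage::Log` or `EventMessage::None`. The `STATIC_MAX_LEVEL` constant here is a hypothetical stand-in for whatever cargo feature or tracing static filter would actually select the level, and `Log` wraps a `String` to keep the example self-contained.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    Error = 1,
    Info = 3,
}

/// Hypothetical build-time setting: keep only error events.
pub const STATIC_MAX_LEVEL: Level = Level::Error;

#[derive(Debug, Clone, PartialEq)]
pub enum EventMessage {
    None,
    Log(String), // simplified payload; the PR carries a LogRecord here
}

/// Yields `Log` only if the level is enabled at compile time. Both operands
/// of the comparison are constants, so the optimizer can typically drop the
/// disabled branch.
macro_rules! leveled_message {
    ($level:expr, $body:expr) => {
        if ($level as u8) <= (STATIC_MAX_LEVEL as u8) {
            EventMessage::Log($body.to_string())
        } else {
            EventMessage::None
        }
    };
}

pub fn info_message(body: &str) -> EventMessage {
    leveled_message!(Level::Info, body)
}

pub fn error_message(body: &str) -> EventMessage {
    leveled_message!(Level::Error, body)
}

fn main() {
    println!("{:?}", info_message("pipeline started"));
    println!("{:?}", error_message("pipeline failed"));
}
```

With the maximum set to `Error`, the lifecycle event is still dispatched, but an info-level message collapses to `EventMessage::None` and carries no encoded payload.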


jmacd added 6 commits January 16, 2026 11:01

Move ObservedEvent and related types into crates/telemetry

c8dacb5

wip

82d8512

Merge branch 'main' of github.com:open-telemetry/otel-arrow into jmac…

b20e77d

…d/move_event

remove ts from log record

b5f2ed0

fmt

a96cc6f

Merge branch 'main' of github.com:open-telemetry/otel-arrow into jmac…

1740420

…d/move_event

jmacd requested a review from a team as a code owner January 17, 2026 03:10

github-project-automation bot added this to OTel-Arrow Jan 17, 2026

github-actions bot added the rust Pull requests that update Rust code label Jan 17, 2026

example

1abe9fd

lquerel approved these changes Jan 19, 2026


jmacd and others added 3 commits January 19, 2026 11:02

Update rust/otap-dataflow/crates/telemetry/src/event.rs

e9449b0

test line num

a644492

Merge branch 'jmacd/move_event' of github.com:jmacd/otel-arrow into j…

29cce61

…macd/move_event

Merge branch 'main' of github.com:open-telemetry/otel-arrow into jmac…

c4c4cf4

…d/move_event

jmacd enabled auto-merge January 19, 2026 19:35

test clippy

7f9b088

jmacd added this pull request to the merge queue Jan 19, 2026

Merged via the queue into open-telemetry:main with commit 1b503f3 Jan 19, 2026
43 of 44 checks passed

jmacd deleted the jmacd/move_event branch January 19, 2026 20:48

github-project-automation bot moved this to Done in OTel-Arrow Jan 19, 2026
