-
Notifications
You must be signed in to change notification settings - Fork 67
Move ObservedEvent into crates/telemetry, consolidated with self_tracing::LogRecord #1818
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1818 +/- ##
=======================================
Coverage 84.72% 84.72%
=======================================
Files 498 499 +1
Lines 146419 146467 +48
=======================================
+ Hits 124048 124090 +42
- Misses 21837 21843 +6
Partials 534 534
🚀 New features to boost your workflow:
|
lquerel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few questions that I don't think are necessarily blocking. I believe I will need the other PRs before having a complete, end to end view of the approach.
The main point that seems important to me to guarantee in the long term is that we must have a way to compile the engine in a mode where only vital events are compiled in, and all others are completely eliminated. I don't have the impression that this is in place yet, so I'm waiting to see how this will be implemented in the following PRs.
| macro_rules! raw_error { | ||
| ($name:expr $(, $($fields:tt)*)?) => {{ | ||
| use $crate::self_tracing::{ConsoleWriter, RawLoggingLayer}; | ||
| use $crate::self_tracing::ConsoleWriter; | ||
| let now = std::time::SystemTime::now(); | ||
| let record = $crate::error_event!($name $(, $($fields)*)?); | ||
| ConsoleWriter::no_color().print_log_record(now, &record); | ||
| }}; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, we should use raw_error! with great care, since there is no way to eliminate its cost, neither at compile time nor at runtime. Is that correct?
Also, for println!, we rely on Clippy to catch it in CI (and prevent them to end up in regular code). So what should we do for raw_error!?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. There will be very few of these, for cases where all/else fails. Is it difficult to configure clippy with a similar rule for raw_error? This statement is the same as println/eprintln but with the structured syntax of Tokio tracing and use of our code path.
I can see an argument that having raw_error become an eprintln!() statement makes sense; my preference is to use the same instrumentation interface use everywhere, so my goal is to replace log::error! (formatting) with tracing::error! or otel_error! (structured) and eprintln! (formatting) with raw_error! (structured).
| let callsite = record.callsite(); | ||
| assert_eq!(*callsite.level(), Level::INFO); | ||
| assert_eq!(callsite.name(), "test.event"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the line number returned by the callsite correct?
I'm asking because I'm not sure I understand how the call site behaves with the nesting of the macros defined earlier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a test. These macros are designed for the kind of nesting here, IIUC.
I can see your concern. I showed how to use special macros to encode LogRecord (i.e., structured body and key/values) with static callsite, but the current lifecycle events are dispatched unconditionally. I think you would prefer to see structured errors, but I also see formatting happening in the existing events. My goals here are:
Answering your concern at the top, this leaves a slight misalignment. We're using Tokio macros to construct LogRecords which are the encoded parts and the Metadata i.e., callsite data. Your question is about whether can can compile-time disable these events. Yes, we can but not as written. Tokio macros have support for compile-time disablement base on Level, part of every callsite metadata. The fact that we have error_event! and info_event! is to suggest that we could choose to compile with only error events, but we'd have to modify ObservedEvent to treat those as the EventMessage::None value vs the EventMessage::Log event. |
1b503f3
The ObservedEvent has associated flume channels and a connection with the existing metrics and admin component which make it an appealing way to transport log events in the engine.
Move PipelineKey, DeployedPipelineKey, CoreId types into crates/config.
Therefore, moving ObservedEvent into crates/telemetry lets us (optionally) use the same channel already use for lifecycle events for tokio log records. The existing event structure is extended with a
EventMessageenum which supports None, String, or LogRecord messages. This way we can use a log record as the event message for all existing event types. Theevent.rsfile moves, only ObservedEventRingBuffer from that file remains in crates/state.The LogRecord has been storing a timestamp. Now, we leave that to the ObservedEvent struct. LogRecord passes through SystemTime everywhere it has been used. Callers generally compute this and pass it in. Minor cleanup in self_tracing/formatter.rs, do not pass SavedCallsite it can be calculated from record metadata as needed.
In internal_events, the raw_error! macro has been replaced with a helper to generate LogRecord values first, by level. This lets us pass
info_event!("string", key=value)to any of the event constructors and construct an OTLP bytes message instead of a String message.