-
Notifications
You must be signed in to change notification settings - Fork 98
feat(otlp): Infer span description for db
spans
#4541
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
2716ef3
cf723e1
6d163f4
c47a7db
27ee011
68bca0d
6a3ad6d
9257e6a
e733731
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -644,6 +644,12 @@ fn normalize( | |
); | ||
span.sentry_tags = Annotated::new(tags); | ||
|
||
// This inference depends on `sentry_tags.category`, which is set during tag | ||
// extraction above. Please be careful if trying to reorder this operation. | ||
if span.description.value().is_empty() { | ||
span.description = infer_span_description(span).into(); | ||
} | ||
|
||
normalize_performance_score(span, performance_score); | ||
if let Some(model_costs_config) = ai_model_costs { | ||
extract_ai_measurements(span, model_costs_config); | ||
|
@@ -803,6 +809,17 @@ fn validate(span: &mut Annotated<Span>) -> Result<(), ValidationError> { | |
Ok(()) | ||
} | ||
|
||
// Infer a span's `description` based on its attributes. Note that tag | ||
// extraction must have already run before this is called, as it relies on the | ||
// values of `span.sentry_tags`. | ||
fn infer_span_description(span: &Span) -> Option<String> { | ||
let category = span.sentry_tags.value()?.category.value()?; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should this not use the original place where this tag is extracted from? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This tag is typically inferred from either the span op or a combination of its attributes, rather than being passed in directly. This inference happens in This code to set the span An alternative might be to make There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It seems weird to refer back to a tag to make the decision, but if that is currently the best way to do it, I am fine with it, especially since we have strongly typed tags now. Jan and I discussed some changes how we can improve the tag extraction and span enrichment we do. In regards to needing something similar/mirrored in Sentry for a while and starting to align for span streaming. So we hopefully can improve this case with that change as well. |
||
match category.as_str() { | ||
"db" => span.data.value()?.db_query_text.value()?.to_owned().into(), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We already have a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Does it make sense to move this kind of logic, which is conditional on other span fields and also doesn't move data, into the same spot? The other challenge is that we need to know the category, which isn't calculated yet at the time There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I think so. Would be great if you can add this as a comment (if you end up staying with what we currently have):
It should be obvious (because it references the tags), but a lot of this extraction logic depends on implicit orders, so maybe doesn't hurt to call it out and explain why it runs late. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I commented the block where this happens in normalization, as well as in the function doc. I couldn't find a way to make the dependency more explicit in code under the current structure (passing category into You mentioned discussing updates to tag extraction + enrichment with Jan - my understanding is that sentry_tags as a concept is likely to disappear as well, as EAP already collapses them into standard attributes. Once Relay does the same, I think this whole method should be able to be ordered more naturally so the dependencies are clear. 🤞 |
||
_ => None, | ||
} | ||
} | ||
|
||
#[cfg(test)] | ||
mod tests { | ||
use std::collections::BTreeMap; | ||
|
@@ -1431,4 +1448,28 @@ mod tests { | |
&EventId("480ffcc911174ade9106b40ffbd822f5".parse().unwrap()) | ||
); | ||
} | ||
|
||
#[test] | ||
fn infers_db_span_description() { | ||
let mut span = Annotated::from_json( | ||
r#"{ | ||
"start_timestamp": 0, | ||
"timestamp": 1, | ||
"trace_id": "922dda2462ea4ac2b6a4b339bee90863", | ||
"span_id": "922dda2462ea4ac2", | ||
"data": { | ||
"db.query.text": "SELECT * FROM users WHERE id = 1", | ||
"sentry.category": "db" | ||
} | ||
}"#, | ||
) | ||
.unwrap(); | ||
|
||
normalize(&mut span, normalize_config()).unwrap(); | ||
|
||
assert_eq!( | ||
get_value!(span.description!).as_str(), | ||
"SELECT * FROM users WHERE id = 1" | ||
); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the SQL scrubber need to run over this field, also if the field gets copied into the description, is it being scrubbed?
We should have an integration test for this.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The span
description
that we receive isn't scrubbed or otherwise modified. So writing the unmodifieddb.query.text
value intodescription
is expected.(The exception is PII scrubbing, although
description
isn't a default field so that relies on user configuration. That happens later in span processing so we should be covered there.)A scrubbed copy of
description
is made inextract_tags
and stored insentry_tags
'description
field. This scrubbed version is used by Sentry's Insights modules. That scrubbed description extraction does not run for the inferred descriptions in this PR, as the scrubbing happens inextract_tags
.That is fine, though: supporting Insights is a later milestone of the span first/OTLP project. It isn't expected that Insights works with OTLP. If it was simple enough to generate the scrubbed descriptions I'd consider it, but the scrubbing also currently depends on
op
- that has all got to be replaced too before it's usable with OTLP. We'll be changing how that works when we work on that milestone.I agree about an integration test though, we'll add one.