-
Notifications
You must be signed in to change notification settings - Fork 265
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow field-presence to be used in OTLP #336
Conversation
cc @jack-berg This is also blocking min/max on histogram using optional. Same fix for builds for non-go libraries |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This relies on an experimental capability of protoc. My understanding is that capability can be removed/changed anytime and thus does not belong to a stable protocol specification.
The official definition of the optional
tag is the following:
When a message is parsed, if it does not contain an optional element, the corresponding field in the parsed object is set to the default value for that field.
My reading of this is that the missing optional field cannot be distinguished from a present field that has the default value. So, if I understand correctly this proposal cannot solve #336 regardless of what Gogo proto does.
@@ -442,7 +442,7 @@ message HistogramDataPoint { | |||
// Negative events *can* be recorded, but sum should not be filled out when | |||
// doing so. This is specifically to enforce compatibility w/ OpenMetrics, | |||
// see: https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#histogram | |||
double sum = 5; | |||
optional double sum = 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity (and not having read any prior context), is there a semantic meaning to optional fields in OTLP? i.e., why would we allow SDKs to not populate this, given it can provide value? Or if it doesn't provide value why do we not just deprecate it for removal?
If it's about some fields being possibly useful if the backend understands it, then it's a concept I haven't seen before in OTel, which generally leans toward sending as much data as possible. If it is about that sort of efficiency though, I think we first need some sort of negotiation protocol defined, where the required fields (and semantic attributes!) are negotiated beforehand before sending - that would then make this sort of optionality make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, this one is about Prometheus compatibility. Prometheus histogram sum only allows monotonic sums, and OpenTelemetry can compute histograms with non-monotonic sums. This bit of OTLP only exists for prometheus compatibility, specifically knowing that sum-in-histogram can be a sum-in-prometheus. There are alternative ways to solve this that were discarded. If you read the description of the proto, this change merely brings this in line with what was decided for the v0.9 version of OTLP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I figured it's something like that - but since SDKs currently don't know if their data will end up in Prometheus, I think either 1) they always fill it or 2) the ability to negotiate what they are required to fill is needed, isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jsuereth Would be good to get input on this point about OTLP negotiation. I am curious on whether optional fields really make sense for OTLP given there would probably need to be some out-of-band way of telling what needs to be sent, which may negate the need for optional fields in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We chatted a bit offline about this.
The Sum field was specified as optional, we just didn't encode that in the proto correctly. The current specification states that sum will only be filled out when it is monotonic (and positive) as required for prometheus compatibility. We may alllow non-monotonic sums, but marking it optional does NOT fix that issue, that's a separate concern.
The note above this line is basically describing how negative events can be recorded (i.e. if you wire a Gauge or UpDownCounter up to this form of histogram, you cannot report sum
in OTLP right now).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OTLP negotiation
Just to add some colour to this.
By negotiation I assume you mean the receiver giving some instructions to the sender. This would be a significant shift in what OTLP today is and how it is used. It is essentially a uni-directional protocol from logical data flow perspective (plus acks and flow control in the opposite direction). We can add the opposite direction of data flow to OTLP, however the problem is that the entire Collector is built with a single direction of data flow in mind. It won't be an easy change to allow negotiations to happen between SDKs and backends if there is an intermediary Collector present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I had misinterpreted the field as "only fill when backend is Prometheus", not "only fill for monotonic metrics, because a backend may be Prometheus" which seems to be the case. I don't think there's any need for negotiation for the latter and it keeps things simpler.
A possible alternate solution is to add a bit to flags to indicate whether the sum is 0 or unassigned. (Does it need to be different from FLAG_NO_RECORDED_VALUE?). We will need to think through the backward compatibility aspect of this too since metrics proto is considered stable. |
@tigrannajaryan As of this v3.15, field presence is part of protocol buffers (no longer experimental), see these notes.
We'd actually like to do better, but that PR was rejected for various reasons, see: #320 As a quick note, two important points:
|
Does this need to be distinguishable in the Collector (e.g. converting from OTLP to Prometheus)? We won't be able to implement it since Gogoproto does not support it.
Short-term I think we can't since we rely on Gogoproto in the Collector. Long term, probably we can, assuming we can move away from Gogoproto. |
@tigrannajaryan You missed the collector SiG where @bogdandrutu and I discussed this. I'm not willing to make this change until we have a soution for gogoproto. I'm investigating updating goproto itself (this change actually isn't too hard to add, it's just impossible to submit + merge a PR to gogoproto, so we'd need to fork/vendor) |
@jsuereth SGTM. I removed my block. Let's see if there is a solution with gogoproto. |
See #279. |
@jmacd Might be worth discussing offline. Any solution for Go (likely) requires vendoring GoGo proto for a patch allowing optional or creating our own proto generation plugin with go-specific optimsiations from GoGo proto. I'm investigating how big that would be right now. |
Just curious, was there any consideration given to using a one-field It does have the advantage of working with any proto3 implementation out there, not just modern ones so I think it deserves some consideration if it hasn't been given yet. Taking advantage of modern features where helpful is of course a fine reason to use |
If I remember correctly In Go |
Quick update on this:
|
Update on this, I believe we have a working solution for adding optional fields support in the collector. Please take a look at open-telemetry/opentelemetry-collector#4750 and add comments there if there are any concerns. Note: that PR was using the proto from #279, but the same work can be used to support this PR. |
@codeboten we really have a good solution now correct? |
@bogdandrutu yes the current fix in open-telemetry/opentelemetry-collector#4929 allows the collector to continue using the proto generation moving forward. This is good to move forward 👍 |
Close in favor of #366. |
DO NOT MERGE UNTIL GOGOPROTO SOLUTION IS FOUND
Update the build to allow field presence to be used in OTLP.
NOTE: GoGoProto does not support optional field presence. Investigating how long adding that would take, but it would necessitate forking GoGo proto.