Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OM 2.0: Native Histogram Support in Text format. #279

Open
bwplotka opened this issue Nov 28, 2024 · 8 comments
Open

OM 2.0: Native Histogram Support in Text format. #279

bwplotka opened this issue Nov 28, 2024 · 8 comments

Comments

@bwplotka
Copy link
Member

Following the accepted proposal prometheus/proposals#32 we should be ready to apply the changes to the specification.

However, we don't plan to release 1.1, so I would propose to do it in 2.0 (we might want to change that detail in the proposal @csmarchbanks)

cc @vesari @beorn7

@bwplotka
Copy link
Member Author

bwplotka commented Dec 9, 2024

See the prometheus/docs#2539 for the full native histogram spec.

The OpenMetrics and protobuf state is explained here

@bwplotka
Copy link
Member Author

bwplotka commented Dec 11, 2024

From the WG meeting:

  • Would be nice to see a POC of spec (cc @vesari) and parser (who?) and benchmarks (cc @csmarchbanks)

@beorn7
Copy link
Member

beorn7 commented Dec 11, 2024

Parser reference implementation in client_python (IIUC): prometheus/client_python#1040

@vesari
Copy link

vesari commented Dec 11, 2024

@ywwg also worked on that part of the code afterwards and consolidated it in a UTF-8 related PR

@vesari
Copy link

vesari commented Dec 11, 2024

I've resumed working on the exposition part as we speak, btw

@bwplotka
Copy link
Member Author

bwplotka commented Dec 11, 2024

BTW: We chatted in a bunch of places (prometheus/docs#2539 (comment), #283 (comment)) about nhcb in exposition format and potentially controversial idea of OM 2.0 text having no classic histograms and only nhcb (anyone can generate classic histograms on their own).

As a result two areas I would love to learn about the accepted proposal before merging:

  1. Is there a room to extend it to nhcb? Can we do it now? How to do it later?
  2. If we prioritize readability more, especially for nhcb to somewhat be on par with classic histogram readability (while introducing the structure)... did we ever consider non-sparse text representation? While less efficient, it's much easier to read:
hist_with_labels{foo="bar",baz="qux"} {count:24,sum:100, buckets: [0.001, 0.1, 100, 300], observations: [20, 3, 2, 1]}

Than:

hist_with_labels{foo="bar",baz="qux"} {count:24,sum:100,schema:-53,zero_threshold:0.001,zero_count:4,positive_spans:[0:2,1:2],positive_deltas:[2,1,-2,3],custom_buckets:[2,1,-2,3]}

I guess for (2) we could only do this for nhcb (and make no schema equal -53) 🤔 I guess this deserves another proposal (:

@beorn7
Copy link
Member

beorn7 commented Dec 11, 2024

I see NHCB mostly as a concept within Prometheus. Technically, an explicit representation of NHCB in the exposition format would transport the exact same things as the current representation of classic histograms. (NHCB was designed to precisely emulate the data model of a classic histogram.)

We already have #283 to create a one-line "complex value" representation for summaries and classic histograms. If we do that, it feels almost like a needless complication to create an explicit NHCB representation in the exposition format. I see it as an advantage that the instrumentation just exposes a "fixed bucket histogram", and scrapers can decide in which way they want to ingest it (e.g. classic histogram, NHCB, OTel fixed bucket histogram, …).

However, this issue is about the text format for native histograms in general, which should proceed as planned, unless there are now new insights that prompt for changes of plan. I think @bwplotka's train of thought got started because I described the possibility how the native histogram exposition just needs custom values added to also transport NHCB, which is true. This is easy in protobuf, but less easy in the text format. Such extensions can still be done in the future, but as explained above, I don't see an immediate need, and maybe there is no need ever (if we improve the classic histogram representation in OM 2.0).

@csmarchbanks
Copy link
Member

Is there a room to extend it to nhcb? Can we do it now? How to do it later?

This was discussed in the original proposal, as it stands the advice is to just use classic buckets. Easy enough to add the custom_buckets field as you show if we desire.

If we prioritize readability more, especially for nhcb to somewhat be on par with classic histogram readability (while introducing the structure)... did we ever consider non-sparse text representation? While less efficient, it's much easier to read:

What you describe is proposal 5 in the original design doc for choosing a text representation option. A format like that requires significant complexity for Prometheus when parsing the format to change back into the internal representation. It would also significantly diverge the text and protobuf exposition formats.

I liked the idea in a comment of that doc to provide a promtool command that could generate an output or visualization for an internal representation of a native histogram. That keeps parsing complexity lower for the common scrape use case but still allows deeper debugging if required.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Todo
Development

No branches or pull requests

4 participants