Design doc on extending the query lifecycle tables #32504

ggevay · 2025-05-15T12:37:31Z

This just mostly copies over the things from https://github.com/MaterializeInc/database-issues/issues/9140#issuecomment-2876868717
plus fills in some sections of the design doc template.

Rendered: https://github.com/ggevay/materialize/blob/query-lifecycle-design/doc/developer/design/20250515_query_lifecycle_extensions.md

Motivation

Design doc for https://github.com/MaterializeInc/database-issues/issues/9140#issuecomment-2876868717, so that we can have visibility into end-to-end query latency at, e.g., https://github.com/MaterializeInc/accounts/issues/3

Tips for reviewer

Checklist

This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

ParkMyCar

Thanks for working through this @ggevay!

As far as I understand, as proposed we would have 4 different tables:

mz_statement_lifecycle_history - tracks the lifecycle of a statement from receiving the bytes to parsing to eventually binding and executing if the extended query protocol is used. You can have multiple events here for a single statement_id.
mz_prepared_statement_history - Tracks all of the prepared statements, the primary key of this relation would be a "statement_id". (essentially this is a rollup of mz_statement_lifecycle_history?)
mz_statement_execution_lifecycle_history - tracks the lifecycle of a statement as it's being executed. You can have multiple events here for a single execution_id.
mz_statement_execution_history - Tracks every instance of a statement execution, the primary key of this relation would be an "execution_id" and it would join against a "statement_id" from mz_prepared_statement_history. (essentially this is a rollup of mz_statement_execution_lifecycle_history?)

This this is the case, it makes sense to me! Although I propose a slightly different naming scheme:

mz_statement_preparation_lifecycle_history
mz_statement_preparation_rollup_history
mz_statement_execution_lifecycle_history
mz_statement_execution_rollup_history

I know the _rollup_ tables aren't exactly rollups because they also include extra metadata, but conceptually that seems to be the purpose for the tables. What do you think?

doc/developer/design/20250515_query_lifecycle_extensions.md

ParkMyCar · 2025-05-15T14:03:47Z

doc/developer/design/20250515_query_lifecycle_extensions.md

+- `FrontendMessage::Bind` arrives in `protocol.rs`, to which
+    - `mz_statement_lifecycle_history` would get the following events:
+        - `BindReceived`
+        - `BindComplete` (whose timestamp would probably be very close to `BindReceived`, but later we might do more interesting things at bind time, in which case binding will take more time)
+- `FrontendMessage::Execute` arrives in `protocol.rs`, to which
+    - `mz_statement_lifecycle_history` would get the following events:
+        - `ExecuteReceived`
+        - `ExecuteFinished`


I'm a little thrown off by recording these stages so explicitly under the assumption they may change in the future. But given this just mirrors the Postgres wire protocol which will never change, I think the more info we have the better.

Yes, this is intended for the long term.

doc/developer/design/20250515_query_lifecycle_extensions.md

ggevay · 2025-05-15T15:03:42Z

Thanks for the review!

Although I propose a slightly different naming scheme:

I like your new names! They are more structured. This will mean slightly more renaming of existing tables than my original proposal, but I think it's worth it to make it easier to understand these 4 tables.

The only thing is rollup sounds a bit out of place for me. "Rollup" would suggest to me that it's the same info as in the other table but just grouped in some way, when in reality it has extra metadata as you pointed out. What do you think about saying actually metadata instead of rolloup in the table names?

ParkMyCar · 2025-05-15T18:00:45Z

I like your new names! They are more structured. This will mean slightly more renaming of existing tables than my original proposal, but I think it's worth it to make it easier to understand these 4 tables.

Great! Yeah definitely is a bit more work, but thank you for taking it on. As-is I get lost with how all of the existing tables are connected 😅

The only thing is rollup sounds a bit out of place for me. "Rollup" would suggest to me that it's the same info as in the other table but just grouped in some way, when in reality it has extra metadata as you pointed out. What do you think about saying actually metadata instead of rolloup in the table names?

"metadata" works for me, what actually what do you think about _log as the entire suffix? So we'd have:

mz_statement_preparation_lifecycle_history
mz_statement_preparation_log
mz_statement_execution_lifecycle_history
mz_statement_execution_log

This is definitely getting into bikeshed territory and mostly what I'm concerned about is the new structure. So whatever suffixes you decide to use work for me :)

teskje · 2025-05-16T08:52:21Z

doc/developer/design/20250515_query_lifecycle_extensions.md

+
+- There is `mz_prepared_statement_history`, which records statements submitted through any of 1., 2., 3., or 4.
+- There is `mz_statement_execution_history`. For 1. and 2., rows here are in 1-to-1 correspondence with rows in `mz_prepared_statement_history`. For 3. and 4., multiple rows here correspond to 1 row in `mz_prepared_statement_history` when the same prepared statement is executed multiple times.
+- There is `mz_statement_lifecycle_history`. Confusingly, this has a `statement_id` column, which joins with `mz_statement_execution_history.id`. So, the name of both the table and the column suggest that rows here correspond with rows in `mz_prepared_statement_history`, but actually they correspond with `mz_statement_execution_history`.


the name of both the table and the column suggest that rows here correspond with rows in mz_prepared_statement_history

Wait, is "prepared statement" the same as "statement"? I thought a prepared statement is a template for a statement. But if that's true then I don't think statement_id suggests that it should be joined with mz_prepared_statement_history.

teskje · 2025-05-16T08:53:33Z

doc/developer/design/20250515_query_lifecycle_extensions.md

+So, what I would propose is:
+- For statement _execution_ lifecycle events: We rename the current `mz_statement_lifecycle_history` to `mz_statement_execution_lifecycle_history` (which should have been its original name). Additionally, we should rename its `statement_id` field to `execution_id` (which, again, should have been its original name, because it already joins with `mz_statement_execution_history.id` and _not_ with `mz_prepared_statement_history.id`). This is in `mz_internal`, so it should be ok from a user perspective. However, the Console is using these, so I guess there will need to be an interim period where the Console would have to conditionally use the old or the new names based on the Materialize version. @SangJunBak [says](https://github.com/MaterializeInc/database-issues/issues/9140#issuecomment-2881827627) that this would be ok.
+    - We add a new event to `mz_statement_execution_lifecycle_history`: `ExecutionMessageReceived` (at the `conn.recv()` in `advance_ready` in `protocol.rs`, roughly corresponding to last byte received): Either when we get an `Execute` in 3. or 4., or when one or multiple queries are submitted through 1. or 2. In case of 2., the same lifecycle event would be copied to multiple statements.
+- For statement lifecycle events: We add a new `mz_statement_lifecycle_history` table, which has the same schema as the current `mz_statement_lifecycle_history`, but its `statement_id` joins with `mz_prepared_statement_history.id`. It will record the following events:


This relates to my confusion about the difference between "statement" and "prepared statement" above: Should the table be named mz_prepared_statement_lifecycle_history and the foreign key be named prepared_statement_id instead?

teskje · 2025-05-16T09:03:27Z

doc/developer/design/20250515_query_lifecycle_extensions.md

+
+Note that if a user is interested only in queries submitted through 1. or 2., and they are not interested in when the parsing and other preparation ended, it will be enough to look at `mz_statement_execution_lifecycle_history` (and optionally join it with `mz_statement_execution_history` and `mz_prepared_statement_history` to get more info about the execution and/or statement). They will still be able to get the end-to-end time, because `ExecutionMessageReceived` will be part of `mz_statement_execution_lifecycle_history`.
+
+If a user is interested also in prepared statements, i.e., 3. or 4, then they would need to look at also `mz_statement_lifecycle_history. In this case, a 4-way join between `mz_statement_lifecycle_history`, `mz_statement_execution_lifecycle_history`, `mz_statement_execution_history`, and `mz_prepared_statement_history` would be needed.


Have you considered not having the additional mz_statement_lifecycle_history table and instead denormalizing the prepared statement events into mz_statement_execution_lifecycle_history? It seems like this would simplify the implementation, reduce the catalog surface, and make getting at the data easier for the Console. At least I think the Console wants to show timings for statement executions only and not for prepared statements?

A benefit of normalizing would be that safe some space by deduplicating events when the same prepared statement is executed a large amount of times. Do we have any way of knowing how frequently that happens?

teskje · 2025-05-16T09:12:36Z

doc/developer/design/20250515_query_lifecycle_extensions.md

+
+## Alternatives
+
+One could imagine trying to keep things simple by just quickly implementing the events only for statements that are not prepared statements. however, it turns out that [a big user](https://github.com/MaterializeInc/accounts/issues/3) is actually using prepared statements, because PowerBI always uses the "Extended Query" flow of pgwire. And since having visibility into that user's PowerBI query latency is one of the main motivations of this work, we should do the right thing and support prepared statements.


Another alternative is to consider replacing the entire statement logging mechanism. The current one has proven to not scale well and is causing incidents from time to time, so we'll have to rethink it eventually. This will be a major piece of work, possibly involving the integration of some external component, so I can understand if you consider this out of scope. But there is a real risk that the query lifecycle will run into the same scalability issues the statement log experiences once we start leaning more on it.

ggevay requested a review from ParkMyCar May 15, 2025 12:37

ggevay added the A-ADAPTER Topics related to the ADAPTER layer label May 15, 2025

ggevay force-pushed the query-lifecycle-design branch 2 times, most recently from 1370d6c to d3a52ab Compare May 15, 2025 13:07

ParkMyCar reviewed May 15, 2025

View reviewed changes

ggevay added 2 commits May 15, 2025 18:40

Design doc on exending the query lifecycle tables

5c44c3d

Address comments on lifecycle design doc

66816cf

ggevay force-pushed the query-lifecycle-design branch from c1c231b to 66816cf Compare May 15, 2025 16:40

ggevay mentioned this pull request May 15, 2025

Add OptimizationBegan query lifecycle event #32508

Draft

5 tasks

ParkMyCar approved these changes May 15, 2025

View reviewed changes

teskje reviewed May 16, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Design doc on extending the query lifecycle tables #32504

Design doc on extending the query lifecycle tables #32504

ggevay commented May 15, 2025 •

edited

Loading

Uh oh!

ParkMyCar left a comment

Uh oh!

Uh oh!

ParkMyCar May 15, 2025

Uh oh!

ggevay May 15, 2025

Uh oh!

Uh oh!

Uh oh!

ggevay commented May 15, 2025 •

edited

Loading

Uh oh!

ParkMyCar commented May 15, 2025

Uh oh!

teskje May 16, 2025

Uh oh!

teskje May 16, 2025

Uh oh!

teskje May 16, 2025

Uh oh!

teskje May 16, 2025

Uh oh!

Uh oh!


		Note that if a user is interested only in queries submitted through 1. or 2., and they are not interested in when the parsing and other preparation ended, it will be enough to look at `mz_statement_execution_lifecycle_history` (and optionally join it with `mz_statement_execution_history` and `mz_prepared_statement_history` to get more info about the execution and/or statement). They will still be able to get the end-to-end time, because `ExecutionMessageReceived` will be part of `mz_statement_execution_lifecycle_history`.

		If a user is interested also in prepared statements, i.e., 3. or 4, then they would need to look at also `mz_statement_lifecycle_history. In this case, a 4-way join between `mz_statement_lifecycle_history`, `mz_statement_execution_lifecycle_history`, `mz_statement_execution_history`, and `mz_prepared_statement_history` would be needed.


		## Alternatives

		One could imagine trying to keep things simple by just quickly implementing the events only for statements that are not prepared statements. however, it turns out that [a big user](https://github.com/MaterializeInc/accounts/issues/3) is actually using prepared statements, because PowerBI always uses the "Extended Query" flow of pgwire. And since having visibility into that user's PowerBI query latency is one of the main motivations of this work, we should do the right thing and support prepared statements.

Design doc on extending the query lifecycle tables #32504

Are you sure you want to change the base?

Design doc on extending the query lifecycle tables #32504

Conversation

ggevay commented May 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Tips for reviewer

Checklist

Uh oh!

ParkMyCar left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ParkMyCar May 15, 2025

Choose a reason for hiding this comment

Uh oh!

ggevay May 15, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

ggevay commented May 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ParkMyCar commented May 15, 2025

Uh oh!

teskje May 16, 2025

Choose a reason for hiding this comment

Uh oh!

teskje May 16, 2025

Choose a reason for hiding this comment

Uh oh!

teskje May 16, 2025

Choose a reason for hiding this comment

Uh oh!

teskje May 16, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ggevay commented May 15, 2025 •

edited

Loading

ggevay commented May 15, 2025 •

edited

Loading