|
8 | 8 |
|
9 | 9 | `EXPLAIN ANALYZE` uses [`mz_introspection`](https://materialize.com/docs/sql/system-catalog/mz_introspection/) sources to report on the performance of indexes and materialized views.
|
10 | 10 |
|
| 11 | +When optimizing a query, it helps to be able to attribute 'cost' to its parts, |
| 12 | +starting with how much time is spent computing in each part overall. For example, Materialize |
| 13 | +reports the time spent in each _dataflow operator_ in |
| 14 | +[`mz_introspection.mz_compute_operator_durations_histogram`](/sql/system-catalog/mz_introspection/#mz_compute_operator_durations_histogram). |
| 15 | +Using `EXPLAIN ANALYZE`, we can attribute the time spent in each operator to the higher-level, more |
| 16 | +intelligible LIR operators. |
| 17 | + |
11 | 18 | {{< warning >}}
|
12 | 19 | `EXPLAIN` is not part of Materialize's stable interface and is not subject to
|
13 | 20 | our backwards compatibility guarantee. The syntax and output of `EXPLAIN` may
|
@@ -52,7 +59,174 @@ When running `EXPLAIN ANALYZE` for `CPU` or `MEMORY`, you can specify `WITH SKEW
|
52 | 59 |
|
53 | 60 | Under the hood, `EXPLAIN ANALYZE` runs SQL queries that correlate [`mz_introspection` performance information](https://materialize.com/docs/sql/system-catalog/mz_introspection/) with the LIR operators in [`mz_introspection.mz_lir_mapping`](../../sql/system-catalog/mz_introspection/#mz_lir_mapping).
|
54 | 61 |
|
55 |
| -<!-- TODO add example or two --> |
| 62 | +The attribution examples in this |
| 63 | +section reference the `wins_by_item` index (and the underlying `winning_bids` |
| 64 | +view) from the [quickstart |
| 65 | +guide](/get-started/quickstart/#step-2-create-the-source): |
| 66 | + |
| 67 | +```sql |
| 68 | +CREATE SOURCE auction_house |
| 69 | +FROM LOAD GENERATOR AUCTION |
| 70 | +(TICK INTERVAL '1s', AS OF 100000) |
| 71 | +FOR ALL TABLES; |
| 72 | + |
| 73 | +CREATE VIEW winning_bids AS |
| 74 | + SELECT DISTINCT ON (a.id) b.*, a.item, a.seller |
| 75 | + FROM auctions AS a |
| 76 | + JOIN bids AS b |
| 77 | + ON a.id = b.auction_id |
| 78 | + WHERE b.bid_time < a.end_time |
| 79 | + AND mz_now() >= a.end_time |
| 80 | + ORDER BY a.id, b.amount DESC, b.bid_time, b.buyer; |
| 81 | + |
| 82 | +CREATE INDEX wins_by_item ON winning_bids (item); |
| 83 | +``` |
| 84 | + |
| 85 | +### `EXPLAIN ANALYZE MEMORY` |
| 86 | + |
| 87 | +When we run `EXPLAIN ANALYZE MEMORY FOR INDEX wins_by_item`, we get back a report on each operator in the dataflow for `wins_by_item`, annotated with memory usage and the number of records: |
| 88 | + |
| 89 | +| operator | total_memory | total_records | |
| 90 | +| :------------------------------- | -----------: | ------------: | |
| 91 | +| Arrange 7 | 1116 kB | 39976 | |
| 92 | +| Get::PassArrangements u274 | null | null | |
| 93 | +| TopK::Basic 5 | 100 MB | 1899733 | |
| 94 | +| Join::Differential 2 » 4 | null | null | |
| 95 | +| Arrange 3 | 5672 kB | 220375 | |
| 96 | +| Get::PassArrangements u271 | null | null | |
| 97 | +| Arrange 1 | 3501 kB | 188785 | |
| 98 | +| Get::Collection u270 | null | null | |
| 99 | + |
| 100 | +The results show the [`TopK`](/transform-data/idiomatic-materialize-sql/top-k/) is overwhelmingly responsible for memory usage. |
| 101 | + |
| 102 | + |
| 103 | +### `EXPLAIN ANALYZE CPU` |
| 104 | + |
| 105 | +When we run `EXPLAIN ANALYZE CPU FOR INDEX wins_by_item`, we get back a report on each operator in the dataflow for `wins_by_item`, annotated with total time spent in each operator (not inclusive of its child operators): |
| 106 | + |
| 107 | +| operator | total_elapsed | |
| 108 | +| :------------------------------- | --------------: | |
| 109 | +| Arrange 7 | 00:00:03.14266 | |
| 110 | +| Get::PassArrangements u274 | null | |
| 111 | +| TopK::Basic 5 | 00:00:44.079592 | |
| 112 | +| Join::Differential 2 » 4 | 00:00:06.372705 | |
| 113 | +| Arrange 3 | 00:00:21.49465 | |
| 114 | +| Get::PassArrangements u271 | 00:00:00.066236 | |
| 115 | +| Arrange 1 | 00:00:11.212331 | |
| 116 | +| Get::Collection u270 | 00:00:00.190449 | |
| 117 | + |
| 118 | +We can see CPU and memory usage simultaneously by running `EXPLAIN ANALYZE CPU, MEMORY FOR INDEX wins_by_item`: |
| 119 | + |
| 120 | +| operator | total_elapsed | total_memory | total_records | |
| 121 | +| :------------------------------- | --------------: | -----------: | ------------: | |
| 122 | +| Arrange 7 | 00:00:03.151386 | 1234 kB | 42359 | |
| 123 | +| Get::PassArrangements u274 | null | null | | |
| 124 | +| TopK::Basic 5 | 00:00:44.347959 | 105 MB | 2013547 | |
| 125 | +| Join::Differential 2 » 4 | 00:00:06.389385 | null | null | |
| 126 | +| Arrange 3 | 00:00:21.558754 | 5431 kB | 233533 | |
| 127 | +| Get::PassArrangements u271 | 00:00:00.06644 | null | null | |
| 128 | +| Arrange 1 | 00:00:11.246103 | 3546 kB | 191168 | |
| 129 | +| Get::Collection u270 | 00:00:00.190935 | null | null | |
| 130 | + |
| 131 | +The order of `CPU` and `MEMORY` in the statement determines the order of the output columns. |
| 132 | + |
| 133 | +### `EXPLAIN ANALYZE ... WITH SKEW` |
| 134 | + |
| 135 | +[Worker skew](/transform-data/dataflow-troubleshooting/#is-work-distributed-equally-across-workers) occurs when your data do not end up getting evenly |
| 136 | +partitioned between workers. |
| 137 | +Worker skew can only happen when your cluster has more than one worker. |
| 138 | +You can query |
| 139 | +[`mz_catalog.mz_cluster_replica_sizes`](/sql/system-catalog/mz_catalog/#mz_cluster_replica_sizes) |
| 140 | +to determine how many workers a given cluster size has; in our example, there are 2 workers. |
| 141 | +Extreme cases of skew can cause serious performance issues; `EXPLAIN ANALYZE` can help you identify this scenario. |
| 142 | + |
| 143 | +You can explain `MEMORY`, `CPU`, or both with `WITH SKEW`, which adds per worker and average worker performance numbers for each operator, along with each worker's ratio compared to the average. Here is an example of `EXPLAIN ANALYZE MEMORY WITH SKEW FOR INDEX wins_by_item`: |
| 144 | + |
| 145 | +| operator | worker_id | memory_ratio | worker_memory | avg_memory | total_memory | records_ratio | worker_records | avg_records | total_records | |
| 146 | +| :------------------------------- | --------: | -----------: | ------------: | ---------: | -----------: | ------------: | -------------: | ----------: | ------------: | |
| 147 | +| Arrange 7 | 0 | 1.24 | 768 kB | 619 kB | 1238 kB | 1.2 | 25485 | 21246 | 42492 | |
| 148 | +| Arrange 7 | 1 | 0.76 | 470 kB | 619 kB | 1238 kB | 0.8 | 17007 | 21246 | 42492 | |
| 149 | +| Get::PassArrangements u274 | null | null | null | null | null | null | null | null | null | |
| 150 | +| TopK::Basic 5 | 0 | 1 | 53 MB | 53 MB | 105 MB | 1 | 1011624 | 1010259.5 | 2020519 | |
| 151 | +| TopK::Basic 5 | 1 | 1 | 52 MB | 53 MB | 105 MB | 1 | 1008895 | 1010259.5 | 2020519 | |
| 152 | +| Join::Differential 2 » 4 | null | null | null | null | null | null | null | null | null | |
| 153 | +| Arrange 3 | 0 | 1 | 2726 kB | 2724 kB | 5448 kB | 1 | 117297 | 117167.5 | 234335 | |
| 154 | +| Arrange 3 | 1 | 1 | 2722 kB | 2724 kB | 5448 kB | 1 | 117038 | 117167.5 | 234335 | |
| 155 | +| Get::PassArrangements u271 | null | null | null | null | null | null | null | null | null | |
| 156 | +| Arrange 1 | 0 | 1 | 1779 kB | 1778 kB | 3556 kB | 1 | 95955 | 95750 | 191500 | |
| 157 | +| Arrange 1 | 1 | 1 | 1778 kB | 1778 kB | 3556 kB | 1 | 95545 | 95750 | 191500 | |
| 158 | +| Get::Collection u270 | null | null | null | null | null | null | null | null | null | |
| 159 | + |
| 160 | +The `ratio` column tells you whether a worker is particularly over- or |
| 161 | +under-loaded: |
| 162 | + |
| 163 | +- a `ratio` below 1 indicates a worker doing a below average amount of work. |
| 164 | + |
| 165 | +- a `ratio` above 1 indicates a worker doing an above average amount of work. |
| 166 | + |
| 167 | +While there will always be some amount of variation, very high ratios indicate a |
| 168 | +skewed workload. |
| 169 | +Here the memory ratios are close to 1, indicating there is very little worker skew. |
| 170 | + |
| 171 | +### `EXPLAIN ANALYZE HINTS` |
| 172 | + |
| 173 | +The [`mz_introspection.mz_expected_group_size_advice`](/sql/system-catalog/mz_introspection/#mz_expected_group_size_advice) introspection source offers hints on sizing `TopK` operators. `EXPLAIN ANALYZE HINTS` can annotate your query with these hints, which can be helpful for identifying which `TopK` should get which hint. Each `TopK` operator will have an [associated `DISTINCT ON INPUT GROUP SIZE` |
| 174 | +query hint](/transform-data/idiomatic-materialize-sql/top-k/#query-hints-1): |
| 175 | +Here is an example of `EXPLAIN ANALYZE HINTS FOR INDEX wins_by_item` (which has only one `TopK`): |
| 176 | + |
| 177 | +| operator | levels | to_cut | hint | savings | |
| 178 | +| :------------------------------- | -----: | -----: | ----: | ------: | |
| 179 | +| Arrange 7 | null | null | null | null | |
| 180 | +| Get::PassArrangements u274 | null | null | null | null | |
| 181 | +| TopK::Basic 5 | 8 | 6 | 255.0 | 75 MB | |
| 182 | +| Join::Differential 2 » 4 | null | null | null | null | |
| 183 | +| Arrange 3 | null | null | null | null | |
| 184 | +| Get::PassArrangements u271 | null | null | null | null | |
| 185 | +| Arrange 1 | null | null | null | null | |
| 186 | +| Get::Collection u270 | null | null | null | null | |
| 187 | + |
| 188 | +Here, the hinted `DISTINCT ON INPUT GROUP SIZE` is `255.0`. We can re-create our view and index using the hint as follows: |
| 189 | + |
| 190 | +```sql |
| 191 | +DROP VIEW winning_bids CASCADE; |
| 192 | + |
| 193 | +CREATE VIEW winning_bids AS |
| 194 | + SELECT DISTINCT ON (a.id) b.*, a.item, a.seller |
| 195 | + FROM auctions AS a |
| 196 | + JOIN bids AS b |
| 197 | + ON a.id = b.auction_id |
| 198 | + WHERE b.bid_time < a.end_time |
| 199 | + AND mz_now() >= a.end_time |
| 200 | + OPTIONS (DISTINCT ON INPUT GROUP SIZE = 255) -- use hint! |
| 201 | + ORDER BY a.id, |
| 202 | + b.amount DESC, |
| 203 | + b.bid_time, |
| 204 | + b.buyer; |
| 205 | + |
| 206 | +CREATE INDEX wins_by_item ON winning_bids (item); |
| 207 | +``` |
| 208 | + |
| 209 | +Re-running the `TopK`-hints query will show only `null` hints; there are no |
| 210 | +hints because our `TopK` is now appropriately sized. But if we re-run our [query |
| 211 | +for attributing memory usage](#attributing-memory-usage), we can see that our |
| 212 | +`TopK` operator uses a third of the 100MB of memory it was using before: |
| 213 | + |
| 214 | +| operator | total_memory | total_records | |
| 215 | +| -------------------------------- | ------------ | ------------- | |
| 216 | +| Arrange 7 | 1093 kB | 42720 | |
| 217 | +| Get::PassArrangements u286 | null | null | |
| 218 | +| TopK::Basic 5 | 30 MB | 625638 | |
| 219 | +| Join::Differential 2 » 4 | null | null | |
| 220 | +| Arrange 3 | 5447 kB | 235570 | |
| 221 | +| Get::PassArrangements u271 | null | null | |
| 222 | +| Arrange 1 | 3485 kB | 191730 | |
| 223 | +| Get::Collection u270 | null | null | |
| 224 | + |
| 225 | +### `EXPLAIN ANALYZE ... AS SQL` |
| 226 | + |
| 227 | +Under the hood, `EXPLAIN ANALYZE` issues SQL queries against [`mz_introspection`](https://materialize.com/docs/sql/system-catalog/mz_introspection/) sources. |
| 228 | +You can append `AS SQL` to any `EXPLAIN ANALYZE` statement to see the SQL that would be run (without running it). |
| 229 | +You can then customize this SQL to report finer grained or other information. |
56 | 230 |
|
57 | 231 | ## Privileges
|
58 | 232 |
|
|
0 commit comments