-
Notifications
You must be signed in to change notification settings - Fork 463
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pageserver: improve flush upload queue parallelism #10096
Comments
In theory if we put multiple upload tasks into the queue, they will be fired in parallel. neon/pageserver/src/tenant/remote_timeline_client.rs Lines 1813 to 1831 in ef233e9
I think compaction already utilized this code path? |
Yes. The problem is that when we flush a delta layer, we schedule both an We want to flush a batch of layers and schedule neon/pageserver/src/tenant/timeline.rs Lines 3932 to 3936 in c42c28b
neon/pageserver/src/tenant/timeline.rs Lines 3848 to 3850 in c42c28b
neon/pageserver/src/tenant/timeline.rs Lines 3809 to 3824 in c42c28b
|
There's a prototype PR for deferred index uploads during flush in #10144. However, this approach has a fair number of issues -- we have to wait for "some time" in case further layers are flushed, and this both breaks caller expectations that index uploads are scheduled immediately, and it can still cause index barriers if the previous index isn't uploaded before the next layer comes in. Instead, let's try to reorder layer uploads ahead of index uploads in the upload queue: neon/pageserver/src/tenant/remote_timeline_client.rs Lines 1828 to 1836 in 3d30a7a
A few things to keep in mind:
|
This reverts commit f3ecd5d. It is [suspected](https://neondb.slack.com/archives/C033RQ5SPDH/p1735907405716759) to have caused significant read amplification in the [ingest benchmark](https://neonprod.grafana.net/d/de3mupf4g68e8e/perf-test3a-ingest-benchmark?orgId=1&from=now-30d&to=now&timezone=utc&var-new_project_endpoint_id=ep-solitary-sun-w22bmut6&var-large_tenant_endpoint_id=ep-holy-bread-w203krzs) (specifically during index creation). We will revisit an intermediate improvement here to unblock [upload parallelism](#10096) before properly addressing [compaction backpressure](#8390).
This reverts commit f3ecd5d. It is [suspected](https://neondb.slack.com/archives/C033RQ5SPDH/p1735907405716759) to have caused significant read amplification in the [ingest benchmark](https://neonprod.grafana.net/d/de3mupf4g68e8e/perf-test3a-ingest-benchmark?orgId=1&from=now-30d&to=now&timezone=utc&var-new_project_endpoint_id=ep-solitary-sun-w22bmut6&var-large_tenant_endpoint_id=ep-holy-bread-w203krzs) (specifically during index creation). We will revisit an intermediate improvement here to unblock [upload parallelism](#10096) before properly addressing [compaction backpressure](#8390).
## Problem It's not legal to modify layers that are referenced by the current layer index. Assert this in the upload queue, as preparation for upload queue reordering. Touches #10096. ## Summary of changes Add a debug assertion that the upload queue does not modify layers referenced by the current index. I could be convinced that this should be a plain assertion, but will be conservative for now.
Currently, when flushing delta layers, we upload the layer and then update the index. But index updates act as an upload queue parallelism barrier. This means that we're effectively uploading layers one at a time.
Instead, we should flush a batch of layer files (something like 100-1000 MB) for each index update, allowing us to upload these in parallel.
Also note that the flush loop backpressure from #8550 will also prevent parallelism. We'll need to remove this first, see #10095.
The text was updated successfully, but these errors were encountered: