Snapshot FAQ update #32557

Open
wants to merge 1 commit into
base: main
17 changes: 15 additions & 2 deletions doc/user/content/ingest-data/troubleshooting.md
@@ -61,8 +61,15 @@ cannot serve queries. That is, queries issued to the snapshotting source (and
its subsources) will return after the snapshotting completes (unless the user
breaks out of the query).

-Snapshotting can take between a few minutes to several hours, depending on the
-size of your dataset and the [size of your ingestion cluster](/sql/create-cluster/#size).
+Snapshotting can take anywhere from a few minutes to several hours, depending on the size of your dataset,
+the upstream database, the number of tables (more tables can be parallelized in Postgres), and the [size of your ingestion cluster](/sql/create-cluster/#size).
+
+We've observed the following approximate snapshot rates from PostgreSQL:
+| Cluster Size | Snapshot Rate |
+|--------------|---------------|
+| 25 cc | ~20 MB/s |
+| 100 cc | ~50 MB/s |
+| 800 cc | ~200 MB/s |

To determine whether your source has completed ingesting the initial snapshot,
you can query the [`mz_source_statistics`](/sql/system-catalog/mz_internal/#mz_source_statistics)
@@ -82,6 +89,12 @@ components of the snapshot.
Even if your source has not yet committed its initial snapshot, you can still
monitor its progress. See [How do I monitor source ingestion progress?](#how-do-i-monitor-source-ingestion-progress).

+## How do I speed up the snapshotting process?
+
+Scale up the cluster used for the snapshot, then scale it back down once the snapshot completes. See [Use a larger cluster for upsert source snapshotting](https://materialize.com/docs/self-managed/v25.1/ingest-data/#use-a-larger-cluster-for-upsert-source-snapshotting).
+
+
+
## How do I monitor source ingestion progress?

Repeatedly query the
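
The approximate PostgreSQL snapshot rates added in the first hunk can be turned into a rough duration estimate. The sketch below is illustrative only: the function name `estimate_snapshot_seconds`, the `"25cc"`-style dictionary keys, and the 500 GB example dataset are assumptions, and it treats the quoted rates as constant with 1 GB = 1000 MB. Actual rates will vary with the upstream database and the number of tables.

```python
# Approximate snapshot rates (MB/s) per cluster size, copied from the table in the diff.
SNAPSHOT_RATE_MB_PER_S = {"25cc": 20, "100cc": 50, "800cc": 200}


def estimate_snapshot_seconds(dataset_gb: float, cluster_size: str) -> float:
    """Return a rough snapshot duration in seconds.

    Assumes a constant rate and 1 GB = 1000 MB; both are simplifications.
    """
    rate_mb_per_s = SNAPSHOT_RATE_MB_PER_S[cluster_size]
    return dataset_gb * 1000 / rate_mb_per_s


# Hypothetical 500 GB upstream dataset.
for size in SNAPSHOT_RATE_MB_PER_S:
    minutes = estimate_snapshot_seconds(500, size) / 60
    print(f"{size}: ~{minutes:.0f} min")
# 25cc: ~417 min
# 100cc: ~167 min
# 800cc: ~42 min
```

Under these assumptions, the gap between roughly seven hours and well under one hour is what motivates scaling the cluster up for the snapshot and back down afterwards.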