feat: add snapshot chunking mode overrides and sparse PK safeguards #66
Conversation
```go
// Calculate sparsity ratio to detect sparse primary key distributions
// Example: Snowflake IDs where minValue=1, maxValue=7234567890123456789, but only 1000 rows
sparsityRatio := float64(totalRange) / float64(rowCount)
```
In large distributed ID systems (e.g., snowflake/timestamp IDs), it’s common to switch from offset/range chunking to keyset pagination based on a “gap density” (sparsity) check.
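This gap-density check can be sketched as follows. The helper `chooseChunkingMode` and the `sparsityThreshold` cutoff are illustrative assumptions, not the PR's actual implementation:

```go
package main

import "fmt"

type SnapshotChunkingMode string

const (
	SnapshotChunkingModeRange  SnapshotChunkingMode = "range"
	SnapshotChunkingModeKeyset SnapshotChunkingMode = "keyset"
)

// sparsityThreshold is an illustrative cutoff: when the PK range is more
// than this many times larger than the row count, the distribution is
// treated as sparse (e.g. Snowflake/timestamp IDs).
const sparsityThreshold = 10.0

// chooseChunkingMode is a hypothetical helper that picks keyset pagination
// for sparse PK distributions and range chunking for dense ones.
func chooseChunkingMode(minPK, maxPK, rowCount int64) SnapshotChunkingMode {
	if rowCount == 0 {
		return SnapshotChunkingModeKeyset
	}
	totalRange := maxPK - minPK + 1
	sparsityRatio := float64(totalRange) / float64(rowCount)
	if sparsityRatio > sparsityThreshold {
		return SnapshotChunkingModeKeyset
	}
	return SnapshotChunkingModeRange
}

func main() {
	// Dense sequence: IDs 1..1000 over 1000 rows -> range chunking.
	fmt.Println(chooseChunkingMode(1, 1000, 1000))
	// Sparse Snowflake-style IDs -> keyset pagination.
	fmt.Println(chooseChunkingMode(1, 7234567890123456789, 1000))
}
```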
```go
type SnapshotChunkingMode string

const (
	SnapshotChunkingModeAuto   SnapshotChunkingMode = "auto"
	SnapshotChunkingModeRange  SnapshotChunkingMode = "range"
	SnapshotChunkingModeKeyset SnapshotChunkingMode = "keyset"
	SnapshotChunkingModeOffset SnapshotChunkingMode = "offset"
)
```
The user might not want the decision made automatically by the decision tree. I added this so that, where possible, users can set the mode explicitly and direct the snapshot toward the chunking strategy they want.
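The override could look something like this. The `TableConfig` struct and `resolveMode` helper are hypothetical illustrations of the idea, not the library's actual API:

```go
package main

import "fmt"

type SnapshotChunkingMode string

const (
	SnapshotChunkingModeAuto   SnapshotChunkingMode = "auto"
	SnapshotChunkingModeRange  SnapshotChunkingMode = "range"
	SnapshotChunkingModeKeyset SnapshotChunkingMode = "keyset"
	SnapshotChunkingModeOffset SnapshotChunkingMode = "offset"
)

// TableConfig is a hypothetical per-table configuration carrying the
// user's explicit chunking-mode choice.
type TableConfig struct {
	Name                 string
	Schema               string
	SnapshotChunkingMode SnapshotChunkingMode
}

// resolveMode falls back to the automatic decision tree only when the
// user left the mode unset or as "auto"; an explicit choice always wins.
func resolveMode(cfg TableConfig, autoDecision SnapshotChunkingMode) SnapshotChunkingMode {
	if cfg.SnapshotChunkingMode == "" || cfg.SnapshotChunkingMode == SnapshotChunkingModeAuto {
		return autoDecision
	}
	return cfg.SnapshotChunkingMode
}

func main() {
	cfg := TableConfig{Name: "orders", Schema: "public", SnapshotChunkingMode: SnapshotChunkingModeKeyset}
	fmt.Println(resolveMode(cfg, SnapshotChunkingModeRange)) // explicit override wins
}
```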
```go
// ChunkResult contains the result of processing a chunk
type ChunkResult struct {
	RowCount int64
	LastPK   *int64 // Last processed primary key value (for keyset pagination)
}
```
In keyset chunking, the starting cursor for the next chunk must be reliably determined from the previous one; that's why we needed this model.
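The cursor handoff can be sketched like this. `nextRangeStart` is a hypothetical helper, not the PR's code; it mirrors the `COALESCE(last_pk, range_start)` fallback from the diff:

```go
package main

import "fmt"

// ChunkResult mirrors the struct from the diff: LastPK carries the cursor
// that seeds the next sequential keyset chunk.
type ChunkResult struct {
	RowCount int64
	LastPK   *int64
}

// nextRangeStart is a hypothetical helper: it derives the next chunk's
// starting cursor from the previous chunk's result, falling back to the
// planned start when the previous chunk recorded no last PK.
func nextRangeStart(prev ChunkResult, plannedStart int64) int64 {
	if prev.LastPK != nil {
		return *prev.LastPK // resume from the last processed key
	}
	return plannedStart
}

func main() {
	last := int64(5000)
	fmt.Println(nextRangeStart(ChunkResult{RowCount: 100, LastPK: &last}, 0)) // 5000
	fmt.Println(nextRangeStart(ChunkResult{}, 42))                            // 42
}
```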
```diff
-"SELECT * FROM %s.%s WHERE %s >= %d AND %s <= %d ORDER BY %s LIMIT %d",
+`SELECT * FROM "%s"."%s" WHERE "%s" >= %d AND "%s" <= %d ORDER BY %s LIMIT %d`,
```
I have updated the SQL generation logic to wrap column names in double quotes ("column_name"). This ensures our queries are compliant with the PostgreSQL Lexical Structure for "Quoted Identifiers."
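Full quoted-identifier handling also doubles any embedded quote character. A minimal sketch of that rule (the `quoteIdent` helper is an assumption for illustration, not part of this PR):

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent sketches PostgreSQL identifier quoting: wrap the name in
// double quotes and double any embedded double-quote characters, per the
// "Quoted Identifiers" rules of the PostgreSQL lexical structure.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

func main() {
	// Mixed-case and reserved-word names survive quoting unchanged.
	fmt.Println(fmt.Sprintf(
		`SELECT * FROM %s.%s WHERE %s >= %d ORDER BY %s LIMIT %d`,
		quoteIdent("public"), quoteIdent("Order"),
		quoteIdent("user"), 1, quoteIdent("user"), 50,
	))
}
```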
```sql
heartbeat_at = '%s',
-- Update range_start from previous chunk's last_pk for sequential keyset chunks
range_start = CASE
    WHEN c.range_end IS NULL AND c.range_start < 0 AND c.chunk_index > 0
    THEN COALESCE((SELECT last_pk FROM prev_chunk_info), c.range_start)
    ELSE c.range_start
END
```
The range_end IS NULL condition is scoped to sequential keyset only; range, offset, and parallel keyset chunks are unaffected.
```go
// Use NTILE to divide rows into equal groups and get boundary values
// Use quoted identifiers to handle special characters in table/column names
query := fmt.Sprintf(`
	WITH chunk_boundaries AS (
		SELECT
			"%s" as pk_value,
			NTILE(%d) OVER (ORDER BY "%s") as chunk_num
		FROM "%s"."%s"
	)
	SELECT
		chunk_num - 1 as chunk_index,
		MIN(pk_value) as range_start,
		MAX(pk_value) as range_end
	FROM chunk_boundaries
	GROUP BY chunk_num
	ORDER BY chunk_num
`, pkColumn, numChunks, pkColumn, table.Schema, table.Name)
```
We spoke with @Abdulsametileri. There is a performance problem here at the moment: the NTILE window has to scan and sort the entire table just to compute chunk boundaries.
|
Let's use ctid partitioning instead (#67):

```go
{
	Name:   "yourTable",
	Schema: "yourSchema",
	SnapshotPartitionStrategy: publication.SnapshotPartitionStrategyCTIDBlock,
},
```
|
We added ctid partitioning and fixed this with it for now, so I am closing this. We may revisit it in the future if needed.