
Conversation

@sfc-gh-okalaci
Collaborator

Both Spark and Polaris follow this. Before this commit, we blindly added partition specs, even if the same spec already existed. Now we switch to a cleaner approach: if the same spec already exists in the table, we simply reuse it instead of adding one more spec.

This is especially useful for #68, where Polaris throws an error if we try to send a partition spec that already exists. Here `the same spec` means a spec that has the exact same fields as an existing spec in the table.

To give an example, previously we created 3 specs for the following; now we create 2:

```sql
ALTER TABLE t OPTION (ADD partition_by='bucket(10,a)');
..
ALTER TABLE t OPTION (ADD partition_by='bucket(20,a)');
..
-- back to 10, this does not generate a new spec anymore
ALTER TABLE t OPTION (ADD partition_by='bucket(10,a)');
```

On Polaris, before this commit, we'd get the following, given Polaris thinks this spec already exists:

```
alter table t (set partition_by 'bucket(10,a)');
WARNING:  HTTP request failed (HTTP 400)
DETAIL:  Cannot set last added spec: no spec has been added
HINT:  The rest catalog returned error type: ValidationException
```

```c
Datum columnValue, bool isNull,
size_t *bucketSize);
extern List *CurrentPartitionTransformList(Oid relationId);
extern IcebergPartitionSpec * PartitionAlreadyExistsInTable(Oid relationId, List *partitionTransforms);
```
Collaborator

@sfc-gh-abozkurt Nov 21, 2025

nit: GetPartitionSpecIfAlreadyExist

```c
 * column name, the transform type and the parameters to
 * bucket/truncate transforms.
 */
if (strcasecmp(specField->name, transform->partitionFieldName) != 0)
```
Collaborator

@sfc-gh-abozkurt Nov 21, 2025

just a note: partition_field_id is not correct to check here (we always increment it), right? This is why we check field names instead.

Collaborator Author

Yeah, good point. I should have documented this better; I'll update the comments to reflect it.

In a deep call chain, Iceberg uses this function to check if a new spec is the same as an existing spec. If so, it returns the existing spec, rather than adding a new spec.

We are essentially following that logic.
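That comparison can be sketched in plain C. This is a self-contained model, not the actual pg_lake code: the struct layout and names (`SpecField`, `SpecsAreEqual`, `TransformKind`) are hypothetical stand-ins mirroring the `PartitionAlreadyExistsInTable` declaration above. Two specs count as the same when every field matches on column name (case-insensitively, as the `strcasecmp` call in the diff suggests), transform kind, and bucket/truncate parameter, while `partition_field_id` is deliberately ignored since it is always incremented.

```c
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/* Hypothetical, simplified stand-ins for the real pg_lake structs. */
typedef enum { TRANSFORM_IDENTITY, TRANSFORM_BUCKET, TRANSFORM_TRUNCATE } TransformKind;

typedef struct
{
	const char *fieldName;	/* source column name */
	TransformKind kind;		/* identity / bucket / truncate */
	int param;				/* bucket count or truncate width; 0 if unused */
} SpecField;

/*
 * Two specs are "the same" when they have the same number of fields and
 * every field matches on column name, transform kind, and parameter.
 * partition_field_id is deliberately left out of the comparison: it is
 * always incremented, so it differs even for otherwise identical specs.
 */
static bool
SpecsAreEqual(const SpecField *a, size_t alen, const SpecField *b, size_t blen)
{
	if (alen != blen)
		return false;

	for (size_t i = 0; i < alen; i++)
	{
		if (strcasecmp(a[i].fieldName, b[i].fieldName) != 0 ||
			a[i].kind != b[i].kind ||
			a[i].param != b[i].param)
			return false;
	}
	return true;
}
```

If the new spec compares equal to any existing spec in the table, the existing spec is returned and reused instead of appending a new one.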

```python
)
assert (
    len(table_partition_specs(pg_conn, "test_re_set_partition_fields", "tbl")) == 5
)
```
Collaborator

Can we add the same spec after renaming a column in the spec?

Collaborator

uh nvm, we do not allow altering columns in any spec.

Collaborator Author

Yeah, that's a difficult limitation to remove anytime soon.

Collaborator

@sfc-gh-abozkurt left a comment

LGTM. left a nit.


Signed-off-by: Onder KALACI <[email protected]>
@sfc-gh-okalaci force-pushed the onder/make_partitioning_better branch from 75b4eda to 0e9913a on November 24, 2025 06:52
@sfc-gh-okalaci merged commit c5684c6 into main on Nov 24, 2025
63 checks passed
@sfc-gh-okalaci deleted the onder/make_partitioning_better branch on November 24, 2025 07:39
sfc-gh-okalaci added a commit that referenced this pull request Dec 3, 2025
In this PR, we add support for writable REST catalog tables. The API may still change, so I refrain from documenting it in the PR description. Once we finalise the APIs, the best place for the documentation would be https://github.com/Snowflake-Labs/pg_lake/blob/main/docs/iceberg-tables.md, so stay tuned for that.

In an earlier set of PRs (#47, #49, #51, #52 and #56) we prototyped support for a writable REST catalog in separate stages. However, it turned out that a single PR is better suited to tackle these closely related changes; maintaining that set of PRs seemed overkill.

This new table type shares almost the same architecture as iceberg tables with `catalog=postgres`.

#### Similarities 
- Tables are tracked in the `pg_lake` catalogs such as `lake_table.files`, `lake_table.data_file_column_stats` and `lake_iceberg.tables_internal`.
- All metadata handling follows the `ApplyIcebergMetadataChanges()` logic. Instead of generating a new `metadata.json` as we do for `catalog=postgres`, for these tables we collect the changes that happened in the transaction and apply them to the REST catalog right after the transaction commits in Postgres.

#### Differences
- The `metadata_location` column in `lake_iceberg.tables_internal` is always `NULL` 
- These tables do not support RENAME TABLE, SET SCHEMA, etc.


#### Some other notes on the implementation & design:
- We first `COMMIT` in Postgres, then in a `post-commit` hook send a `POST` request to the REST catalog. So it is possible that the changes are committed in Postgres but not in the REST catalog. This is a known limitation, and we'll have follow-up PRs to make sure we can recover from this situation.
- Creating a table and modifying it in the same Postgres transaction cannot be committed atomically in the REST catalog; there is no such API. So there are some additional error scenarios where the table creation is committed in the REST catalog but the rest of the operation, say the full CTAS, is not. This is an unfortunate limitation that we inherit from the REST catalog APIs.
- Our implementation currently assumes that Postgres is the single writer to this table in the REST catalog, so a concurrent modification breaks the table from the Postgres side. This is the current state; we plan to improve it in the future.
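The first point above can be modeled with a tiny collect-then-apply queue. This is a self-contained sketch of the pattern only, not the actual pg_lake implementation; in real Postgres extension code the flush would typically hang off a transaction callback such as `RegisterXactCallback`, and the "apply" step would be the HTTP `POST` to the REST catalog. All names here (`CollectMetadataChange`, `PostCommitApplyToRestCatalog`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of "collect metadata changes during the transaction,
 * apply them to the REST catalog only after Postgres has committed". */
#define MAX_PENDING 16

static const char *pendingChanges[MAX_PENDING];
static size_t pendingCount = 0;
static size_t appliedCount = 0;	/* stands in for requests sent to the catalog */

/* Called while the transaction is still open: just record the change,
 * do not talk to the REST catalog yet. */
static void
CollectMetadataChange(const char *change)
{
	assert(pendingCount < MAX_PENDING);
	pendingChanges[pendingCount++] = change;
}

/* Post-commit hook: only now do we talk to the REST catalog.  If this
 * step fails, Postgres has already committed -- exactly the recovery
 * gap described in the notes above. */
static void
PostCommitApplyToRestCatalog(void)
{
	for (size_t i = 0; i < pendingCount; i++)
	{
		/* In the real code: issue a POST of pendingChanges[i] here. */
		appliedCount++;
	}
	pendingCount = 0;
}
```

The design choice this models: deferring all catalog traffic to after the local commit keeps the transaction path free of network calls, at the cost of a window where Postgres and the REST catalog disagree.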



#### TODO:
- [x] `DROP partition_by` is not working (fixed by #79)
- [x] Concurrency
- [x] Certain DDLs do not work (e.g., ADD COLUMN with defaults), prevent much earlier
- [x] VACUUM regression tests
- [x] VACUUM failures (e.g., do we clean up properly?)
- [x] VACUUM (ICEBERG)
- [x] auto-vacuum test
- [x] Truncate test
- [x] savepoint
- [x] Complex TX test
- [x] Column names with quotes
- [x] Add column + add partition by + drop column in the same tx
- [x] Tests for read from postgres / iceberg, modify REST (or the other way around)
- [x] Zero column table?
- [x] DROP TABLE implemented, but needs tests (e.g., create - drop in the same tx, drop table removes the metadata from rest catalog etc).
- [x] `SET partition_by` to an already existing partition by is not supported in Polaris. We should skip sending such requests, instead only send `set partition_spec` alone. (fixed by #79)
- [ ] Recovery after failures (e.g., re-sync the previous snapshot/DDL) [Follow-up PR needed]
- [x] Cache access token, currently we fetch on every REST request interaction [Follow-up PR needed]
- [x] Cancel query
- [x] sequences / serial / generated columns etc.
- [x] All data types
- [ ] Docs [Follow-up PR needed]