"CREATE TABLE (REPLACE TABLE) ... AS SELECT" Support #438

sungwy · 2024-02-16T21:24:44Z

Feature Request / Improvement

In Spark SQL, we can combine create_table or replace_table with a table overwrite from an AS SELECT statement as a single atomic operation (CTAS, RTAS).

Do we intend to support this feature with the same atomicity guarantee in PyIceberg?

Since the PyIceberg client is in charge of writing out the manifests and constructing the new table metadata, I think it is technically possible. Would we just add

as_select: Optional[pa.Table] = None

as an optional parameter to create_table and replace_table, and add a snapshot update (a full-table static overwrite) to the new table metadata?
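
For contrast with the atomic behaviour asked about here, a minimal sketch of the two-step emulation follows. It assumes `catalog` behaves like a PyIceberg catalog (`create_table`, `drop_table`), that `create_table` accepts the schema taken from the data, and that the returned table exposes `overwrite`; `df` is assumed to be a `pa.Table`. The helper name `create_table_as_select` is hypothetical and not part of PyIceberg.

```python
from typing import Any


def create_table_as_select(catalog: Any, identifier: str, df: Any) -> Any:
    """Two-step, NON-atomic emulation of CTAS (hypothetical helper).

    Assumes `catalog` behaves like a PyIceberg Catalog and `df` like a
    pyarrow.Table with a `.schema` attribute.
    """
    # Step 1: commit an empty table whose schema comes from the data.
    table = catalog.create_table(identifier, schema=df.schema)
    try:
        # Step 2: a second, separate commit that writes the rows.
        table.overwrite(df)
    except Exception:
        # Between the two commits a reader could observe an empty table.
        # Best-effort cleanup so a failed write does not leave it behind.
        catalog.drop_table(identifier)
        raise
    return table
```

Because the create and the overwrite are separate commits, this does not give the CTAS guarantee: the table is briefly visible while empty, and the cleanup itself can fail. Folding the overwrite snapshot into the initial table metadata, as proposed above, would remove that window.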

Fokko · 2024-02-22T17:12:47Z

Duplicate of #281

sungwy · 2024-02-22T17:39:11Z

Duplicate of #281

This was my bad attempt at decoupling the introduction of REPLACE TABLE support from the discussion of how we should support ... AS SELECT semantics, but I think it's a bit too late for that now, given how I've described the previous issue #281. I will bring these points back to the other issue.

Fokko added this to the PyIceberg 0.7.0 release milestone Feb 22, 2024

Fokko marked this as a duplicate of #281 Feb 22, 2024

sungwy closed this as completed Feb 22, 2024
